International Journal of Approximate Reasoning 54 (2013) 861–875
International Journal of Approximate Reasoning journal homepage: www.elsevier.com/locate/ijar
Unifying ontological similarity measures: A theoretical and empirical investigation
Valerie Cross a,∗, Xinran Yu b, Xueheng Hu c
a Computer Science and Software Engineering Department, Miami University, Oxford, OH 45056, USA
b Department of Computer Science, University of Texas, San Antonio, TX 78249, USA
c Department of Computer Science, University of Notre Dame, Notre Dame, IN 46556, USA
ARTICLE INFO
Article history: Available online 31 March 2013
Keywords: Semantic similarity; Ontological similarity; Information content; Gene Ontology; NCIT human anatomy; Mouse anatomy
ABSTRACT
This paper theoretically and empirically investigates ontological similarity. Tversky’s parameterized ratio model of similarity (Tversky, 1977) [3] is shown as a unifying basis for many of the well-known ontological similarity measures. A new family of ontological similarity measures is proposed that allows parameterizing the characteristic set used to represent an ontological concept. The three subontologies of the prominent Gene Ontology (GO) are used in an empirical investigation of several ontological similarity measures. Another study using well-known semantic similarity measures within two different anatomy ontologies, the NCIT anatomy and the mouse anatomy, is also presented for comparison to several of the GO results. A discussion of the correlation among the measures is presented as well as a comparison of the effects of two different methods of determining a concept’s information content, corpus-based and ontology-based. © 2013 Elsevier Inc. All rights reserved.
1. Introduction
In ontologies, similarity measurement is needed to determine how similar one concept is to another. An ontological similarity measure [1] is a semantic similarity measure specific to assessing similarity between concepts within an ontology. Such measures have proliferated in the last several years, particularly in the biomedical and bioinformatics area [2]. More recently, new ontological similarity measures have been proposed based on intuitive combinations of information-theoretic measures with Tversky’s model of similarity [3]. In [4] the contrast model is used and then modified in [5] to use the parameterized ratio model of similarity. These recently proposed intuitive models are not the first examination of integrating the information-theoretic model and the Tversky models of similarity [6,7].
As new ontological similarity measures are proposed, evaluations of them against existing ones have typically used one of three approaches: mathematical analysis, domain-specific applications, and comparison to human judgments of similarity [8]. By far the primary approach, however, has been to compare them to human similarity judgments. More recently, due to the application of ontological similarity within the Gene Ontology (GO) [9] to determine gene product similarity, other physical similarity measures such as sequence similarity [10] are being used for performance comparisons. Some efforts have been made on mathematical analysis of ontological similarity measures [6,11,12].
This paper theoretically and empirically investigates ontological similarity and extends the initial work in [13]. Tversky’s parameterized ratio model of similarity [3] is shown as a unifying basis of many of the well-known ontological similarity measures. This unification framework relies on principles from both information theory and fuzzy set theory.
∗ Corresponding author: V. Cross. 0888-613X/$ - see front matter © 2013 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.ijar.2013.03.003
Information theory is used in determining a measure of the information content (IC) of a concept within an ontology. Two kinds of information content measures typically used in ontological similarity measures, corpus-based and
ontology-based are presented and compared with respect to their use in ontological similarity measures. A new family of ontological similarity measures is proposed that allows parameterization of the characteristic set used to represent an ontological concept. The three subontologies of the prominent GO are used in an empirical investigation of several ontological similarity measures as well as a comparison of corpus-based and ontology-based IC. Results from experiments with two other real world ontologies, the NCIT human anatomy and the mouse anatomy are also reported. A new ontological similarity measure derived from the proposed family of ontological similarity is also empirically studied. The correlation among the measures is presented and their performances with respect to the different ontologies are compared. Section 2 briefly introduces similarity assessment and the key role of Tversky’s two models of similarity. Section 3 presents the components of the framework for ontological similarity. The key idea is that ontological concepts can be represented as fuzzy sets using information content measures and that fuzzy set similarity measures based on Tversky’s parameterized ratio model are applicable to measuring the similarity between ontological concepts. The theoretical investigation shows how these various components can be used to create many of the existing ontological similarity measures. Section 4 describes two separate experiments using different bioinformatics and biomedical ontologies and reports their results. The first one uses the GO to compare the performance of two different kinds of information content and then five different IC based ontological similarity measures: two historical measures, two recently proposed intuitive measures using Tversky’s models, and a new ontological similarity measure developed using the framework described in Section 3. 
The second experiment uses two different biomedical ontologies on the same topic, anatomy, but for different species: the NCIT human anatomy and the mouse anatomy ontologies. The ontologies are then compared with respect to individual ontological similarity measures. Section 5 provides a summary of this research, conclusions, and directions for future efforts.
2. Similarity measurement
In the psychological literature, the two main approaches to assessing similarity are content models and geometric or distance models. Distance models have been found to be contrary to human similarity judgments in psychological domains for two reasons: satisfying the minimality, symmetry, and triangle inequality axioms often conflicts with human similarity judgments, and such models require quantitative continuous dimensions, whereas human judgments of similarity often use qualitative and discrete dimensions. The content models use characteristics which are conceptualized “as more or less discrete and common elements” to determine the similarity between objects [14]. Many of the proposed set-theoretic measures in the content model category are generalized by Tversky’s parameterized ratio model of similarity [3]:
$$S_{\text{Tversky-ratio}}(X, Y) = \frac{f(X \cap Y)}{f(X \cap Y) + \alpha f(X - Y) + \beta f(Y - X)}. \tag{1}$$
In the model, X and Y represent sets describing respective objects x and y. (X ∩ Y) represents the common features that describe both x and y. (X − Y) represents the features describing only object x and not object y. (Y − X) represents the features describing only object y and not x. The value f(X) for object x is considered a measure of the overall salience of that object. Factors adding to an object’s salience include “intensity, frequency, familiarity, good form, and informational content” [3]. The function f is an additive function on disjoint sets, i.e., whenever X and Y are disjoint sets and all three terms are defined, then f(X ∪ Y) = f(X) + f(Y). Set cardinality is such a function. Parameters α and β allow priority to be given to one object over the other, i.e., one object serves as the referent object to which the other object is being matched. If x is selected as the referent object, then α should be greater than β to emphasize that features describing x but not y are more important than those describing y but not x. This measure is normalized so that 0 ≤ STversky−ratio(X, Y) ≤ 1.
With α = β = 1, STversky−ratio becomes the Jaccard index [3]
$$S_{\text{Jaccard}}(X, Y) = \frac{f(X \cap Y)}{f(X \cup Y)}. \tag{2}$$
With α = β = 1/2, STversky−ratio becomes Dice’s coefficient of similarity [3]
$$S_{\text{Dice}}(X, Y) = \frac{2 \times f(X \cap Y)}{f(X) + f(Y)}. \tag{3}$$
With α = 1, β = 0, STversky−ratio becomes the degree of inclusion for X, that is, the proportion of X overlapping with Y [3]. Inclusion is not symmetric since α ≠ β.
$$S_{\text{inclusion}}(X, Y) = \frac{f(X \cap Y)}{f(X)}. \tag{4}$$
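The special cases in Eqs. (2)–(4) can be checked numerically. The following is a minimal sketch, assuming crisp feature sets and set cardinality as the additive function f; the function name `tversky_ratio` and the feature sets are hypothetical and chosen only for illustration.

```python
def tversky_ratio(x, y, alpha, beta, f=len):
    """Tversky's parameterized ratio model, Eq. (1), for crisp feature sets."""
    common = f(x & y)
    denom = common + alpha * f(x - y) + beta * f(y - x)
    # Convention chosen for this sketch only: two empty descriptions score 0.
    return common / denom if denom else 0.0


# Hypothetical feature sets describing two objects x and y.
X = {"a", "b", "c", "d"}
Y = {"c", "d", "e"}

jaccard = tversky_ratio(X, Y, 1.0, 1.0)    # Eq. (2): |X ∩ Y| / |X ∪ Y| = 2/5
dice = tversky_ratio(X, Y, 0.5, 0.5)       # Eq. (3): 2|X ∩ Y| / (|X| + |Y|) = 4/7
inclusion = tversky_ratio(X, Y, 1.0, 0.0)  # Eq. (4): |X ∩ Y| / |X| = 2/4
```

Swapping the arguments leaves the Jaccard and Dice values unchanged but changes the inclusion value, which reflects the asymmetry introduced when α ≠ β.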
Tversky [3] also proposed an unnormalized similarity model, the contrast model. It integrates the same components but uses different mathematical operations:
$$S_{\text{Tversky-contrast}}(X, Y) = \theta f(X \cap Y) - \alpha f(X - Y) - \beta f(Y - X). \tag{5}$$
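As a rough illustration of Eq. (5), the sketch below evaluates the contrast model on crisp sets with set cardinality as f. The function name `tversky_contrast` and the example sets are hypothetical; the point is only that the result is unnormalized and can be negative.

```python
def tversky_contrast(x, y, theta=1.0, alpha=1.0, beta=1.0, f=len):
    """Tversky's contrast model, Eq. (5), for crisp feature sets."""
    return theta * f(x & y) - alpha * f(x - y) - beta * f(y - x)


# Hypothetical feature sets describing two objects x and y.
X = {"a", "b", "c", "d"}
Y = {"c", "d", "e"}

mixed = tversky_contrast(X, Y)     # 2 shared, 2 only in X, 1 only in Y
identical = tversky_contrast(X, X)  # all 4 features shared, none distinctive
```

Here `mixed` evaluates to 2 − 2 − 1 = −1 and `identical` to 4, so, unlike the ratio model, the value depends on the size of the descriptions and is not bounded by 0 and 1.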
Goodman [15] argues that assessing similarity between objects x and y is vague and meaningless without a “frame of reference.” He states that “‘is similar to’ functions as little more than a blank to be filled.” Asking the question “How similar are x and y?” begs the answer to a subtly different question “How are x and y similar?” [16]. Thus, crucial to similarity assessment is the method used to select the features on which similarity is being judged. For ontological similarity, the approach to “filling in the blank”, i.e., feature selection, for two concepts has varied greatly depending on the perceived objectives of the task or the problem domain.
3. Synthesizing ontological similarity
Early developers of ontological or semantic similarity measures incorporated the ontological structure into their similarity measures. How these measures relate to other existing similarity measures such as the Tversky models [3] or fuzzy set compatibility or similarity measures [17,18] was not explored. Many of these historical ontological similarity measures can be viewed as materializations of such existing models of similarity when one examines how the “blanks” are filled in, i.e., how the features are selected to describe a concept in an ontology and how these features may have a fuzzy set-theoretic interpretation. Here the two major components, information content and a fuzzy set representation of an ontological concept, are described, as well as the integration of these two with Tversky’s models of similarity in order to derive several of the ontological similarity measures.
3.1. Information content in ontologies
Information content (IC) of a concept in an ontology is a measure of how specific the concept is. The more specific (general) it is, the higher (lower) its IC. The earliest method to measure IC uses an external resource such as an associated corpus for the problem domain. The corpus-based IC measure for concept c [19] is given as
$$IC_{\text{corpus}}(c) = -\log p(c) \tag{6}$$
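Eq. (6) can be sketched as follows. This is only an illustration under stated assumptions: the taxonomy `PARENTS`, the counts `DIRECT_COUNTS`, and the helper names are hypothetical, each occurrence of a concept is assumed to count for all of its is-a ancestors as well, and the corpus size is taken to be the sum of the direct counts.

```python
import math

# Hypothetical is-a taxonomy (child -> list of parents) and hypothetical
# direct occurrence counts of each concept in a corpus.
PARENTS = {"root": [], "a": ["root"], "b": ["root"], "a1": ["a"], "a2": ["a"]}
DIRECT_COUNTS = {"root": 0, "a": 2, "b": 4, "a1": 1, "a2": 1}


def anc_plus(c):
    """Return c together with all of its ancestors."""
    seen, stack = {c}, [c]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def ic_corpus(c):
    """Corpus-based IC of Eq. (6), using the natural logarithm."""
    # Every occurrence of a concept also counts for each of its ancestors.
    freq = sum(n for d, n in DIRECT_COUNTS.items() if c in anc_plus(d))
    total = sum(DIRECT_COUNTS.values())  # assumed corpus size for this sketch
    return math.log(total / freq)        # equals -log(freq / total)


print(ic_corpus("root"), ic_corpus("a"), ic_corpus("a1"))
```

In this toy setting the root has probability 1 and therefore IC 0, while more specific concepts receive larger values. A concept with an aggregated count of zero would need separate handling, which the sketch does not attempt.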
The value p(c), the probability of concept c, is determined using the frequency count of the concept, i.e., the number of occurrences in the corpus of all words representing that concept. The frequency of the concept also includes the total frequencies of all its children concepts. The taxonomic structure of the ontology, i.e., the is-a relationships, determines the children of a concept. In some ontologies such as the GO and WordNet, however, the part-of relationship is also used in the calculation of the IC value. The probability p(c) is calculated by dividing this total frequency count by the total number of words in the corpus. Because the formula is the negative logarithm of the probability, the information content decreases as the probability increases; therefore, concepts higher in the ontology, which have a greater probability of occurring, have less information content than those lower in the ontology. The ICcorpus values are not normalized and so can range from 0 to an upper value depending on the size of the corpus.
A more recent method [20] uses the ontology structure and does not require an external resource, which is one of its advantages. Leaf concepts are the most specific and contain the most information. Root concepts are the least specific and contain the least information. The ontology-based IC [20] is defined as
$$IC_{\text{ont}}(c) = \frac{\log\left(\frac{\text{num\_desc}(c) + 1}{\text{max}_{\text{ont}}}\right)}{\log\left(\frac{1}{\text{max}_{\text{ont}}}\right)} = 1 - \frac{\log(\text{num\_desc}(c) + 1)}{\log(\text{max}_{\text{ont}})} \tag{7}$$
where num_desc(c) is the number of descendants for concept c and maxont is the maximum number of concepts in the ontology. It is normalized in [0, 1] with the maximum of 1 for the leaf concepts and decreases to the minimum of 0 at the root concepts. An extended Information Content (eIC) measure [5] uses other relationships between concepts, not only those that create the taxonomic structure, which are the ones that historically have been used. The eIC(c) is a parameterized weighting of IC(c) and its total average relationship EIC(c). EIC(c) is the summation, for each kind of relationship k, of the average of the IC(ci) for all concepts ci that are “at the end of a particular relation” [5], i.e., k, with concept c. How non-taxonomic relationships and their inverses are handled, i.e., what counts as being “at the other end of a particular relation”, is not clear. If inverse relationships are not ignored or are not clearly identified, a circular calculation of eIC could occur. Another concern is how and which non-taxonomic relationships are selected. In the following experiments, only the well-known ICcorpus and ICont methods of determining information content are used. Since a research literature review has not produced any other experiments using the more recently proposed eIC
method and its calculation has not been explained in sufficient detail in [5], the experiments reported here do not attempt to use eIC.
3.2. A synthesis of fuzzy set compatibility measures, information content, and Tversky’s similarity models
A concept within an ontology has many contexts and can be represented by its many different features, for example, its properties, its children, its parents, etc. [21]. The three standard IC ontological similarity measures, Resnik [19], Lin [12], and Jiang–Conrath [22], use at most three ontological concepts within an ontology when determining similarity. Much more information, however, is conveyed in the structure of the ontology. Another view is to consider that each concept c can be represented by a fuzzy set. Which fuzzy set is used depends on how one proceeds in “filling in the blank” to select the elements of the set and what method is used to determine an element’s membership degree. For example, the set describing a concept could be its descendants or its ancestors. The membership degrees, for example, might be some function of the IC associated with each concept in the set. Using these examples, the fuzzy set describing the concept c uses its ancestor set and the concept itself as
$$F_{\text{anc+}(c)} = \{\, IC(c_j)/c_j \mid c_j \text{ is an ancestor of } c \text{ or } c \text{ itself} \,\} \tag{9}$$
where the + indicates that the concept c itself is included in the set. Here the function on IC is simply the identity function. This fuzzy set specifies each element cj and its respective membership IC(cj) in the fuzzy set. Note that either ICont or ICcorpus could be used. Here the set used to describe the concept c consists of other concepts, but the set might also be a set of links for a path associated with concept c, and their memberships could be produced by a normalized weighted distance function. When one represents an ontological concept by such a fuzzy set, fuzzy set compatibility measures can be used to measure the similarity between two concepts’ fuzzy sets. That measure then represents the similarity between the two ontological concepts. Various methods to measure compatibility between fuzzy sets have been proposed [17,18]. One of the most famous is Zadeh’s partial matching index between two fuzzy sets F1 and F2:
$$S_{\text{sup-min}}(F_1, F_2) = \sup_{u} \min(F_1(u), F_2(u)) \tag{10}$$
where sup selects the supremum membership degree from the intersection of the two fuzzy sets. An early IC ontological similarity measure [19] is formulated as
$$\mathrm{sim}_{\text{RES}}(c_1, c_2) = \max_{c \in S(c_1, c_2)} IC(c) \tag{11}$$
where S(c1,c2) is the set of concepts that subsume both c1 and c2 and IC is determined using a corpus. It can be seen as the partial matching index given in Eq. (10), where Fanc+(c1) and Fanc+(c2) represent the fuzzy ancestor sets for c1 and c2 respectively in the ontology and max or sup is the scalar evaluator. The minimum operation between Fanc+(c1) and Fanc+(c2) serves as the intersection operator between the two fuzzy ancestor sets and produces the set of all common ancestors for c1 and c2. The concept with the greatest membership degree, i.e., the maximum IC value, in the intersection of fuzzy sets Fanc+(c1) and Fanc+(c2), therefore, represents the partial matching similarity between concepts c1 and c2. The minimum intersection operation is not needed in Eq. (11) since, with the current methods of calculating IC, a concept has only one IC value within an ontology. The minimum would be performed on identical IC values for elements in the intersection of Fanc+(c1) and Fanc+(c2). Typically, Zadeh’s partial matching or consistency measure assumes normalized fuzzy sets, i.e., each fuzzy set has at least one element with a membership degree of 1. If satisfying this assumption is considered important, then the membership degrees, i.e., the IC values for c1 and each of its ancestors, can be divided by the IC value of c1, and similarly for c2 and its ancestors. This normalization process results in different membership degrees for ancestor concepts in the two fuzzy sets. Then the minimum operator would be necessary when performing the intersection between the two fuzzy sets. Another option, which follows the general formula for the partial matching index [17], is to directly normalize the result of Eq. (11) by dividing it by min(sup Fanc+(c1), sup Fanc+(c2)) = min(IC(c1), IC(c2)). For the experiments described in Section 4, each concept within an ontology has only one IC value, which is used as its membership degree in a fuzzy set.
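The reading of the Resnik measure as a sup-min index over fuzzy ancestor sets can be sketched as below. This is an illustration under assumptions, not the authors' implementation: the toy ontology `PARENTS` and all function names are hypothetical, ICont of Eq. (7) is used for the membership degrees (the text notes that either IC measure could be used), and maxont is taken to be the number of concepts in the toy ontology.

```python
import math

# Hypothetical ontology given as child -> list of is-a parents.
# Concept "a2" has two parents to illustrate multiple inheritance.
PARENTS = {
    "root": [], "a": ["root"], "b": ["root"],
    "a1": ["a"], "a2": ["a", "b"], "b1": ["b"],
}


def anc_plus(c):
    """Return c together with all of its ancestors."""
    seen, stack = {c}, [c]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def ic_ont(c):
    """Ontology-based IC, Eq. (7); max_ont is the number of concepts here."""
    num_desc = sum(1 for d in PARENTS if d != c and c in anc_plus(d))
    return 1.0 - math.log(num_desc + 1) / math.log(len(PARENTS))


def fuzzy_anc_plus(c):
    """Fuzzy set of Eq. (9) as a dict: element -> membership degree IC."""
    return {a: ic_ont(a) for a in anc_plus(c)}


def sup_min(f1, f2):
    """Partial matching index, Eq. (10), over elements present in both sets."""
    return max(min(f1[u], f2[u]) for u in f1.keys() & f2.keys())


def sim_resnik(c1, c2):
    """Eq. (11) computed as sup-min of the two fuzzy ancestor sets."""
    return sup_min(fuzzy_anc_plus(c1), fuzzy_anc_plus(c2))


print(sim_resnik("a1", "a2"), sim_resnik("a1", "b1"))
```

Because each concept carries a single IC value, the `min` inside `sup_min` compares identical numbers, which matches the remark above that the minimum is not needed in Eq. (11). It would matter only if the two fuzzy sets were normalized differently.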
In order to keep the membership degrees in the [0, 1] range, the ICcorpus values were normalized by the greatest IC value specified in the ontology. Other experiments are needed to see how applying different functions to a concept’s IC value in determining a concept’s membership degree might affect the performance of a normalized partial matching measure as an ontological similarity measure.
Several other IC ontological similarity measures were developed in response to a major criticism of the Resnik measure: it uses only the shared information between the two concepts and does not use their separate information. Jiang and Conrath [22] define a distance measure between ontological concepts. Their objective is to integrate path-based and information content methods for measuring similarity. Intuitively, the distance is based on totaling up the separate information contents of the two concepts and subtracting out twice the information content of their most informative subsumer c3, i.e., the one with the maximum IC value.
$$\mathrm{dist}_{\text{JC}}(c_1, c_2) = IC(c_1) + IC(c_2) - 2 \times IC(c_3) \tag{12}$$
Whatever information content remains indicates the distance between them. If there is no IC left, i.e., the distance is 0, then the two concepts are the same. This distance measure can be converted to similarity. For example, in [20], the distance measure is first normalized and then converted to similarity by subtracting from 1.
$$\mathrm{sim}_{\text{JC}}(c_1, c_2) = 1 - \frac{IC(c_1) + IC(c_2) - 2 \times IC(c_3)}{2} \tag{13}$$
Dividing by 2 normalizes the distance to [0, 1] since the maximum value for distJC is 2 with the ICont measure. This result occurs because the maximum ICont value is 1 and the minimum is 0. The year after distJC was proposed, Lin [12] defined a similarity measure that basically normalizes the distJC measure and converts it into a similarity measure by subtracting from 1. The Lin measure is given as
$$\mathrm{sim}_{\text{Lin}}(c_1, c_2) = \frac{2 \times IC(c_3)}{IC(c_1) + IC(c_2)} \tag{14}$$
where c3 is the common ancestor with the greatest IC. The maximum distance distJC of 2 is based on the maximum and minimum values possible for ICont in general. However, one could normalize distJC not with a standard factor of 2 but instead by the maximum possible distance for the given pair, IC(c1) + IC(c2), which occurs when IC(c3) is 0. Then one clearly sees the relationship of the Lin similarity measure to the Jiang–Conrath distance measure as
$$\mathrm{sim}_{\text{Lin}}(c_1, c_2) = 1 - \frac{IC(c_1) + IC(c_2) - 2 \times IC(c_3)}{IC(c_1) + IC(c_2)} = \frac{2 \times IC(c_3)}{IC(c_1) + IC(c_2)}. \tag{15}$$
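The relationship between Eqs. (12)–(15) can be verified with a few lines of code. The sketch below takes IC values directly as numbers; the function names and the example values are hypothetical, and the IC values are assumed to lie in [0, 1] as with ICont.

```python
def dist_jc(ic1, ic2, ic3):
    """Jiang-Conrath distance, Eq. (12); ic3 is the IC of the subsumer c3."""
    return ic1 + ic2 - 2.0 * ic3


def sim_jc(ic1, ic2, ic3):
    """Eq. (13): distance normalized by the fixed factor 2 (IC in [0, 1])."""
    return 1.0 - dist_jc(ic1, ic2, ic3) / 2.0


def sim_lin(ic1, ic2, ic3):
    """Lin similarity, Eq. (14). Undefined when IC(c1) + IC(c2) is 0."""
    return 2.0 * ic3 / (ic1 + ic2)


def sim_lin_from_jc(ic1, ic2, ic3):
    """Eq. (15): distance normalized by its pair-specific maximum."""
    return 1.0 - dist_jc(ic1, ic2, ic3) / (ic1 + ic2)


# Hypothetical IC values for c1, c2 and their most informative subsumer c3.
ic1, ic2, ic3 = 0.8, 0.6, 0.4
print(sim_jc(ic1, ic2, ic3), sim_lin(ic1, ic2, ic3), sim_lin_from_jc(ic1, ic2, ic3))
```

For these values the two Lin forms agree at 4/7, while the fixed normalization of Eq. (13) gives 0.7, so the only difference between the two similarity measures is the choice of normalizing constant.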
Examining the Dice similarity measure given in Eq. (3), one sees a form that is similar to the Lin measure in Eq. (14) if one simply substitutes f(X) = IC(c1) and f(Y) = IC(c2) and interprets f(X ∩ Y) as IC(c3). This approach has been proposed in [7] with the goal of integrating information-theoretic and set-theoretic similarity measures. As stated previously, f is a measure of the salience of an object, and “informational content” is given as a factor “adding to an object’s salience” [3]. However, this interpretation of Tversky’s model is somewhat misleading in simply replacing f with IC and assuming that the common subsumer or ancestor represents the intersection of the set of features of c1 and the set of features of c2. What the sets of features are and how the IC measure provides an additive function on the set of features, as required by Tversky’s similarity model, is not explicitly specified. Both forms of IC measures, for example, are based on a function incorporating all of the descendants of a concept. Using IC for f intuitively suggests that the intersection is represented not by the common ancestor c3 but by some common descendant. It simply is not clear what the actual features of concepts c1 and c2 are and what explicitly is the intersection of these features. In [6] Dice’s coefficient given in Eq. (3) has been shown to be the basis for both the Wu–Palmer path-based semantic similarity measure [23] given as
$$\mathrm{sim}_{\text{Wu-Palmer}}(c_1, c_2) = \frac{2 \times \mathrm{len}(\mathrm{root}, c_3)}{\mathrm{len}(c_1, \mathrm{root}) + \mathrm{len}(c_2, \mathrm{root})} \tag{16}$$
and the Lin information-content based semantic similarity measure. Dice’s coefficient establishes the connection between the Lin and Wu-Palmer path based similarity measure. In this interpretation, each concept is represented by the set of edges making up the path from the concept to the root. The edges in the path from the common ancestor c3 to the root are the edges that both concepts c1 and c2 have in common, i.e., the path from c3 to the root, the intersection of the two edge sets for c1 and c2. Wu-Palmer has been shown equivalent to the Lin measure when each edge is weighted by the difference in the child’s and parent’s IC values [6]. If c1 and c2 have multiple common subsumers, c3 is typically selected as the lowest common subsumer, i.e., the one with the greatest distance from the root, i.e., deepest in the ontology. This approach somewhat assumes that this c3 will also produce the shortest distance between c1 and c2. This is not always the case due to multiple inheritances as discussed in [24] where several different approaches to selecting c3 were used in experiments with the GO. In that study, selecting the deepest common subsumer of c1 and c2 produces the greatest mean similarity for all GO subontologies, closely followed by selecting the c3 that minimizes the distance between c1 and c2. The Pearson correlation between these two methods was greater than 0.98 for each of three GO subontologies. In the tables here when Wu-Palmer results are reported, c3 is selected as the one that minimizes the path distance between c1 and c2 since path-based measures have concentrated on finding the shortest path between c1 and c2. In most cases, there is not a conflict; that is, the c3 that produces the shortest path typically is the one that is at the greatest depth. 
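The edge-set interpretation of Eq. (16) and its connection to the Lin measure can be sketched as follows. The example assumes a single-inheritance tree so that each concept has one path to the root; the tree `PARENT`, the IC values in `IC`, and the function names are hypothetical.

```python
# Hypothetical tree (child -> parent) and hypothetical IC values with IC(root) = 0.
PARENT = {"root": None, "a": "root", "b": "root", "a1": "a", "a2": "a", "a11": "a1"}
IC = {"root": 0.0, "a": 0.3, "b": 1.0, "a1": 0.6, "a2": 1.0, "a11": 1.0}


def path_to_root(c):
    """Edges (child, parent) on the path from c up to the root."""
    edges = []
    while PARENT[c] is not None:
        edges.append((c, PARENT[c]))
        c = PARENT[c]
    return edges


def unit_weight(edge):
    """Every edge counts as length 1, as in the plain Wu-Palmer measure."""
    return 1.0


def ic_weight(edge):
    """Edge weight equal to the child's IC minus the parent's IC."""
    child, parent = edge
    return IC[child] - IC[parent]


def wu_palmer(c1, c2, weight=unit_weight):
    """Dice's coefficient on edge sets, Eq. (16). Undefined for root vs. root."""
    e1, e2 = path_to_root(c1), path_to_root(c2)
    shared = set(e1) & set(e2)  # the edges from c3 up to the root
    total = lambda edges: sum(weight(e) for e in edges)
    return 2.0 * total(shared) / (total(e1) + total(e2))


print(wu_palmer("a11", "a2"))             # plain path lengths: 2 * 1 / (3 + 2)
print(wu_palmer("a11", "a2", ic_weight))  # IC-weighted edges
```

With IC-difference weights, the weighted length of a path to the root telescopes to the concept's own IC (given IC(root) = 0), so the second call reproduces the Lin value 2 × IC(a) / (IC(a11) + IC(a2)) = 0.3 for this toy tree.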
Many fuzzy set compatibility measures are generalizations of Tversky’s parameterized ratio model when fuzzy set cardinality, an additive set function, is used for f, X and Y are replaced by fuzzy sets F1 and F2, and the crisp set operators are transformed to the corresponding fuzzy set operators [17]. The Jaccard fuzzy set compatibility measure can be used to calculate IC ontological similarity between two concepts represented by their fuzzy sets of ancestors Fanc+(c1) and Fanc+(c2). The set intersection uses a t-norm, typically minimum, and the set union uses a t-conorm, typically maximum. The Jaccard
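ontological similarity measure with the anc+ set used to “fill in the blank” then becomes
$$\mathrm{sim}_{\text{JacAnc}}(c_1, c_2) = \frac{\sum_{c \in F_{\text{anc+}(c_1)} \cap F_{\text{anc+}(c_2)}} IC(c)}{\sum_{c \in F_{\text{anc+}(c_1)} \cup F_{\text{anc+}(c_2)}} IC(c)} \tag{17}$$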
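Eq. (17) can be sketched as a fuzzy Jaccard measure over ancestor sets. The code below is an illustration under assumptions: the toy ontology `PARENTS`, the IC values in `IC`, and the function names are hypothetical, fuzzy cardinality is taken as the sum of membership degrees, and min and max are used for intersection and union.

```python
# Hypothetical ontology (child -> list of is-a parents) and hypothetical IC values.
PARENTS = {
    "root": [], "a": ["root"], "b": ["root"],
    "a1": ["a"], "a2": ["a", "b"], "b1": ["b"],
}
IC = {"root": 0.0, "a": 0.4, "b": 0.4, "a1": 1.0, "a2": 1.0, "b1": 1.0}


def anc_plus(c):
    """Return c together with all of its ancestors."""
    seen, stack = {c}, [c]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def fuzzy_jaccard(f1, f2):
    """Fuzzy Jaccard with min/max operators; absent elements have degree 0."""
    keys = f1.keys() | f2.keys()
    inter = sum(min(f1.get(u, 0.0), f2.get(u, 0.0)) for u in keys)
    union = sum(max(f1.get(u, 0.0), f2.get(u, 0.0)) for u in keys)
    return inter / union if union else 0.0  # convention for this sketch only


def sim_jac_anc(c1, c2):
    """Eq. (17): fuzzy Jaccard of the fuzzy ancestor sets of c1 and c2."""
    f1 = {a: IC[a] for a in anc_plus(c1)}
    f2 = {a: IC[a] for a in anc_plus(c2)}
    return fuzzy_jaccard(f1, f2)


print(sim_jac_anc("a1", "a2"), sim_jac_anc("a1", "b1"))
```

Since a shared ancestor has the same IC in both fuzzy sets, `min` and `max` return that same value for common elements, and the result coincides with the crisp sums in Eq. (17). For `a1` and `a2` this gives 0.4 / 2.8 with the values above.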
Again the min and max fuzzy set operators do not need to be used because the membership degrees of the ancestors in the fuzzy sets representing both concepts c1 and c2 are simply the ancestors’ IC values. The IC value of a concept is the same in each fuzzy set since it is a function of its number of descendants. As previously suggested, however, these IC values could be normalized so that the membership degree of a common ancestor in each concept’s fuzzy set could differ. For this approach, the min and max operators would then be needed since the IC membership degrees then differ. Ontological similarity could also be modified by describing a concept using a different set; for example, instead of its ancestor set, a neighborhood set of related concepts within a specified number of edges from the concept could be used. A wide variety of fuzzy set compatibility measures, many of which are fuzzy generalizations derived from Tversky’s parameterized ratio model of similarity, can be and have been used as ontological similarity measures. With this model, there still can be variations depending on how each researcher decides to approach “filling in the blank”, i.e., the objectives that determine exactly what set is used to describe an ontological concept and what method is used to assign its membership degrees in the set.
3.3. Recent ontological similarity based on intuitive uses of Tversky’s models
New ontological similarity measures [4,5] which have recently been proposed use an intuitive interpretation of Tversky’s contrast and parameterized ratio models. As seen in [7], these measures make the assumption that the function f in Tversky’s models can be simply replaced with an IC measure. This assumption was used in [7] to show that the Lin semantic similarity measure is an example of Tversky’s parameterized ratio model.
This interpretation of f differs from the fuzzy set-theoretic and information-content modeling of ontological similarity in [6] and presented in Section 3.2, where a concept is explicitly represented by a set of elements, i.e., related concepts, and a function of IC is used as the membership degree of an element in the fuzzy set. The P&S semantic similarity measure [4] uses the contrast model to formulate
$$\mathrm{sim}_{\text{P\&S}}(c_1, c_2) = \begin{cases} 3 \times IC(c_3) - IC(c_1) - IC(c_2) & \text{if } c_1 \neq c_2, \\ 1 & \text{if } c_1 = c_2 \end{cases} \tag{18}$$
where c3 is the common ancestor concept with the greatest IC. It is argued that this formula represents the information-theoretic counterpart of Tversky’s set-theoretic contrast model. The assumption made is that f(X ∩ Y) is the same as IC(c3), where X represents a set of features describing c1 and Y represents a set of features describing c2. Similarly, f(X − Y) and f(Y − X) are replaced by (IC(c1) − IC(c3)) and (IC(c2) − IC(c3)) respectively. The parameters are set as θ = α = β = 1 to produce
$$IC(c_3) - (IC(c_1) - IC(c_3)) - (IC(c_2) - IC(c_3)) \tag{19}$$
which simplifies to the measure given in Eq. (18) when c1 ≠ c2. When c1 = c2, then IC(c1) = IC(c2) = IC(c3); therefore, the first case of Eq. (18) would yield IC(c3), so the value of 1 is assigned instead if the two concepts are the same. In [24] the P&S measure was investigated mathematically and empirically and shown to produce negative values. Negative values are produced, for example, when two leaf concepts have only the root as their common ancestor c3, since the IC of a root concept is 0. These values cause difficulties in understanding and interpreting the similarity values produced by the P&S measure. Also, the P&S measure had the worst correlation with the historical ontological similarity measures in a majority of the performed experiments.
In [5] the FaITH (Feature and Information Theoretic) measure is proposed. It still assumes f as simply the IC measure and uses the same mappings for f(X ∩ Y), f(X − Y), and f(Y − X) as for the P&S measure. The only change is to use Tversky’s parameterized ratio model of similarity instead of the contrast model. These mappings are substituted into Eq. (1) with α = β = 1 to produce
$$\mathrm{sim}_{\text{FaITH}}(c_1, c_2) = \frac{IC(c_3)}{IC(c_1) + IC(c_2) - IC(c_3)} \tag{20}$$
which is very similar to the simLin measure in Eq. (14). The FaITH measure can be produced from Eq. (14) by subtracting IC(c3) from both the numerator and denominator of Eq. (14). With this modification, the simFaITH measure must always produce a value smaller than or equal to that of simLin (equality holds only when both are 0 or both are 1). As IC(c3) approaches 0, the ratio simFaITH/simLin approaches 0.5. There is a strong correlation between these two measures, as the empirical investigations show in the next section. In both simP&S and simFaITH, the key assumption is that an information-theoretic view of Tversky’s models of similarity is possible by the simple substitution of IC for the function f. The f(X ∩ Y), which is a measure of the amount of shared elements, i.e., features, between sets X and Y in the set-theoretic model, is simply replaced by IC(c3). With this approach c1 and c2 are not
explicitly represented by a set of features and the intersection between two concepts is defined as the most specific ancestor concept. A difficulty with this interpretation is the difference between a set of features representing the intersection of two sets, such as the shared set of properties between c1 and c2, and the shared information between c1 and c2 as measured by the IC of their most specific ancestor c3. For two sets X and Y, a function f of their intersection set does not change if more sets share the same intersection set with X and Y, i.e., they have the same set of shared features. As explained in Section 3.1, however, IC(c3), representing a measure of shared information between c1 and c2, does not solely depend on IC(c1) and IC(c2) but is affected by the IC of all of the descendants of c3. These descendants of c3 also have the same shared information with c1 and c2. The amount of this shared information should not decrease simply because the number of descendants of c3 increases. This difference makes a case against a simple substitution of IC(c3) for f(X ∩ Y) in both of Tversky’s set models of similarity given in Eqs. (1) and (5).
This section has proposed fuzzy set compatibility measures, many based on the Tversky ratio model of similarity, that can be combined with information content measures to produce a general model for a wide variety of ontological similarity measures. In representing an ontological concept as a fuzzy set, the key considerations are what set is used and what method is used to determine the degree of membership for each element in the set. To further explore this model, the following empirical study includes the simJacAnc along with the two historical Resnik and Lin measures and the two recently proposed measures P&S and FaITH.
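The behavior of the P&S and FaITH measures discussed in Section 3.3 can be illustrated with a short sketch. The function names and IC values below are hypothetical, and the IC values are assumed to lie in [0, 1] as with ICont. The identity FaITH = Lin / (2 − Lin) used in the comments follows algebraically from Eqs. (14) and (20) and is stated here as a derived observation, not as a result quoted from the cited papers.

```python
def sim_ps(ic1, ic2, ic3, same_concept=False):
    """P&S measure, Eq. (18); returns 1 when c1 and c2 are the same concept."""
    return 1.0 if same_concept else 3.0 * ic3 - ic1 - ic2


def sim_faith(ic1, ic2, ic3):
    """FaITH measure, Eq. (20). Undefined when the denominator is 0."""
    return ic3 / (ic1 + ic2 - ic3)


def sim_lin(ic1, ic2, ic3):
    """Lin measure, Eq. (14), included for comparison."""
    return 2.0 * ic3 / (ic1 + ic2)


# Two leaf concepts (IC = 1) whose only common ancestor is the root (IC = 0).
leaves_under_root = sim_ps(1.0, 1.0, 0.0)  # negative: 0 - 1 - 1

# Hypothetical IC values for c1, c2 and their most informative subsumer c3.
ic1, ic2, ic3 = 0.8, 0.6, 0.4
lin = sim_lin(ic1, ic2, ic3)
faith = sim_faith(ic1, ic2, ic3)
# From Eqs. (14) and (20): faith == lin / (2 - lin), hence faith <= lin.
print(leaves_under_root, lin, faith, lin / (2.0 - lin))
```

The first case reproduces the negative P&S value noted above for two leaves joined only at the root. The second shows FaITH falling below Lin, and since the ratio FaITH/Lin equals 1 / (2 − Lin), it tends to 0.5 as IC(c3) approaches 0.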
4. Investigations using bioinformatics and biomedical ontologies

The bioinformatics and the biomedical domains are serving as a primary impetus for the creation of new ontological similarity measures [2]. Two experiments using several different ontologies investigate the relationships among several ontological similarity measures and are presented in the following sections. An objective of the experiments is to use ontological similarity measures in several ontologies from the same and different domains that have varying structural characteristics.

The Gene Ontology (GO) [9] is an important ontology used in the bioinformatics domain. It contains concepts used to annotate genes or gene products. The GO contains three mutually exclusive subontologies, that is, all are separate ontologies with no links between them. They are the biological process (BP), cellular component (CC) and molecular function (MF) ontologies, each with a different size and structure. Two other ontologies from the biomedical domain, the NCIT human anatomy (HA) and the mouse anatomy (MA), are used in a second experiment. These two ontologies have been consistently used for one track of the Ontology Alignment Evaluation Initiative (OAEI) [26].

The performance of ontological similarity measures has typically been assessed based on their correlation to human similarity judgments or to an experimentally developed gold standard with a very specific task objective. The two experiments described below differ from previous ones in that a direct comparison approach is used. This approach does not require an arbitrary gold standard for performance comparisons. The objective is to better understand the relationships among the measures and the consistency of the measures across several ontologies. Another advantage is that only the ontological similarity values are being compared and no other operation besides the similarity assessment is involved.
For example, in the bioinformatics domain, one of the gold standards for comparing ontological similarity measures is the correlation of gene similarity computed using ontological similarity measures to actual gene sequence similarity [1]. Computing gene similarity with ontological similarity measures requires an aggregation operator. A wide variety of aggregation operators have been used to produce the final gene pair similarity. The selection of different aggregation operators does influence the final gene pair similarity and consequently the correlation to gene sequence similarity. The focus in the experiments reported here is the ontological similarity measures themselves within the ontologies, so that the need to select among various aggregation operators is eliminated from the comparison process.

As previously discussed, two primary ways have been used to determine the IC of a concept: corpus-based IC given in Eq. (6) and ontology-based (structure-based) IC given in Eq. (7). For the GO, determining the IC using a corpus-based approach involves selecting a set of gene annotation files to use as the corpus and performing a frequency count for each GO term. For the NCIT human anatomy and the mouse anatomy ontologies, a set of biomedical documents would need to be selected as a corpus for each, and then a frequency count taken for each concept in each ontology. Research reported in several studies [4,5,20,25], however, suggests that ontology-based IC is highly correlated with corpus-based IC. In some experiments that correlate ontological similarity to human similarity judgments, the ontology-based IC version of the ontological similarity measure performed as well as, and sometimes better than, the corpus-based IC version.
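The two ways of determining IC can be illustrated on a toy taxonomy. The following is a minimal sketch under stated assumptions, since Eqs. (6) and (7) are not reproduced in this section: the ontology-based form is assumed to be the intrinsic IC of Seco et al. [20], which gives leaves an IC of 1 and the root an IC of 0, and the corpus-based form is assumed to be the negative log of a concept's relative frequency, where the frequency includes the counts of its descendants. The taxonomy, the counts, and all function names are hypothetical.

```python
import math

# Hypothetical toy taxonomy: concept -> list of parents (multiple parents allowed).
PARENTS = {
    "root": [],
    "a": ["root"],
    "b": ["root"],
    "a1": ["a"],
    "a2": ["a", "b"],
    "b1": ["b"],
}


def descendants(concept):
    """All concepts that have `concept` as a direct or transitive ancestor."""
    found = set()
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for child, parents in PARENTS.items():
            if current in parents and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def ic_ont(concept):
    """Assumed structure-based IC: 1 - log(|descendants| + 1) / log(N)."""
    return 1.0 - math.log(len(descendants(concept)) + 1) / math.log(len(PARENTS))


def ic_corpus(concept, counts):
    """Assumed corpus-based IC: -log p(c), counting descendants' annotations."""
    total = sum(counts.values())
    freq = counts.get(concept, 0) + sum(counts.get(d, 0) for d in descendants(concept))
    return -math.log(freq / total)  # requires freq > 0


# Hypothetical annotation counts per concept (direct annotations only).
COUNTS = {"root": 0, "a": 2, "b": 1, "a1": 5, "a2": 3, "b1": 9}

for concept in PARENTS:
    print(concept, round(ic_ont(concept), 3), round(ic_corpus(concept, COUNTS), 3))
```

The structure-based form needs no corpus at all, which is the practical appeal noted in the studies cited above. The corpus-based form is undefined for a concept with no annotations in its subtree, so a real implementation would need a policy for that case.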
In the following discussion of the GO experiments, first the issue of ontology-based IC versus corpus-based IC is examined by comparing the two versions of five IC ontological similarity measures: the standard Resnik (R) and Lin (L) measures and the newer ontological similarity measures P&S (P), FaITH (F), and the JacAnc (J) measure proposed in Eq. (17).

4.1. The gene ontology experiments

The GO for these experiments was created using the OWL files for the three subontologies provided at http://purl.org/obo/obo-all/, for example, the cellular_component/cellular_component.owl. Files for the biological_process and molecular_function exist at the same site. The GO includes several different kinds of relationships, but the two major types of
Fig. 1. Pearson correlation between ICcorpus and ICont versions.
Fig. 2. Spearman correlation between ICcorpus and ICont versions.
structuring links are the “is-a” and “part-of”. For determining ICont for the subontologies these two types of relationships were used, as is the standard practice in experiments with the GO [10]. The three subontologies vary in size and structure. The smallest is the CC subontology with 2636 concepts, of which 1724 are leaf concepts (approximately 65% leaves). The MF ontology is of medium size with 8668 concepts, of which 6956 are leaf concepts (about 80.2%). The BP ontology is the largest with 18059 concepts, of which 8442 are leaf concepts (about 46.7%). The MF ontology is more single-parent structured, averaging only 1.2 parents per concept, while the CC and BP average 1.9 parents per concept.

Due to the large sizes of these subontologies, a set of GO concepts is randomly selected from each subontology. Around 5% of each subontology is selected: 100 for CC, 500 for MF and 880 for BP. Similarities between all pairs of selected concepts within a subontology are calculated, resulting in 5050, 125250, and 387640 concept pairs, respectively.

First, experiments were undertaken to examine the relationship between ICcorpus and ICont when used in an IC ontological similarity measure. To calculate ICcorpus the GOA-UniProt Version 79 database (http://www.ebi.ac.uk/GOA/) was used. Both the Pearson and Spearman coefficients are presented. The Pearson coefficient measures the degree of linear relationship between two variables. The assumption is that each variable is approximately normally distributed. The Spearman coefficient assumes only that the variables under consideration are measured on at least a rank order or ordinal scale. It can be interpreted in the same way as the Pearson coefficient in terms of proportion of variability accounted for, but only the ranks of the observations for each variable are used to calculate it.
The Spearman correlation coefficient produces a value of 1 when the two variables being compared are monotonically related and does not require the strict linear relationship of the Pearson correlation. Fig. 1 shows the Pearson correlation between the ICcorpus and ICont version of the listed ontological measures on the x-axis: R-Resnik, L-Lin, F-FaITH, J-JacAnc, and W-Wu-Palmer. For example, the Resnik measure with ICcorpus and the Resnik measure with ICont have a correlation of slightly under 0.9 for the BP and slightly over 0.9 for the CC. Fig. 2 shows the Spearman correlation between the ICcorpus and ICont versions.

This investigation shows a very high correlation between the ontology-based and the corpus-based versions of each of the IC ontological similarity measures, although the intuitively designed P&S measure had the lowest correlations across all subontologies. The newly proposed JacAnc measure, which is based on Tversky’s parameterized ratio model and a fuzzy set interpretation of an ontological concept, produces the highest Pearson correlation across all subontologies. The Spearman correlation when using the two different IC measures is almost perfect for the MF and BP subontologies for all ontological similarity measures except the P&S measure.
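The difference between the two coefficients can be seen on a small example in which one variable is a monotone but non-linear function of the other. The following is a minimal sketch using only the standard library; the data values and function names are hypothetical, and the Spearman coefficient is computed as the Pearson coefficient of the average ranks.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical similarity values; ys is monotone but non-linear in xs.
xs = [0.1, 0.2, 0.4, 0.7, 0.9]
ys = [x ** 3 for x in xs]
print("Pearson :", pearson(xs, ys))
print("Spearman:", spearman(xs, ys))
```

Because the ordering of `ys` matches the ordering of `xs` exactly, the rank correlation is 1 up to floating-point rounding, while the Pearson value is lower because the relationship is not linear.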
Fig. 3. Means for each GO subontology across measures.
Fig. 4. Mean for each ontological similarity across GO subontologies.
With FaITH the eIC discussed in Section 3.1 was not used. In [5] simFaITH + eIC always produced higher Pearson correlations with human similarity judgments than any of the other ontological similarity measures. The paper seems to indicate, however, that the other IC ontological similarity measures were only tested using ICont. Since the specific kinds of relationships and the algorithms used to calculate eIC are not clearly stated, only ICont and ICcorpus versions of the IC ontological similarity measures have been compared. Another experiment in [5] uses a data set consisting of 36 concept pairs from the MeSH (Medical Subject Headings) ontology to evaluate the performance of simFaITH on a domain-related ontology. The conclusion again was that simFaITH had the highest correlation with the human similarity judgments. It is understood that for MeSH all ontological similarity measures including simFaITH were calculated using ICont. As previously discussed, the simFaITH measure should always produce a value smaller than or equal to that of simLin. However, a problem in the calculation of one or both of these measures is seen in [5], since in the reported similarity values for the 36 pairs, 14 of the simFaITH values are greater than the corresponding simLin values.

In the following only the ontology-based versions of the IC ontological similarity measures are compared, with the addition of one traditional path-based measure, the Wu-Palmer (WP) measure, for comparison purposes. The WP measure is calculated with all edge weights assigned a value of 1.0, i.e., no information content measure is used in assigning edge weights. In Fig. 3 each line represents a specific subontology and each point on the line is the mean of the ontological similarity measure listed on the x-axis. Since the P&S measure produces negative values and is on a different scale, it is not included in the graph. The y-axis is the mean value of the ontological similarity measure.
The CC subontology has the greatest mean value for each of the ontological similarity measures. The BP subontology produces slightly higher values than the MF except for the JacAnc measure, where the values are quite close, and for the Wu-Palmer measure, where the BP value is approximately half that of the MF subontology. Although not shown, the P&S values follow the same pattern, with the CC having the greatest value followed by the BP and then the MF. In Fig. 4, each line represents an ontological similarity measure. A point on the line is the mean value in the subontology on the x-axis. The Wu-Palmer path-based measure produces the greatest value across all three subontologies. The Lin and Resnik measures are nearly identical. As pointed out in the discussion on the FaITH measure, the FaITH measure always produces a value no greater than that of the Lin measure, as evidenced in the graph. The JacAnc measure produces the smallest value across all three subontologies.

Performance evaluations of ontological similarity measures against human similarity judgments typically use the Pearson correlation coefficient. Here both the Pearson and Spearman coefficients are used to measure correlations among the IC ontological similarity measures themselves, not to some task or experimentally developed gold standard such as human similarity judgment. For this investigation, each correlation coefficient calculated had a p-value < 0.001, which indicates that the results are significant. Figs. 5–7 show the Pearson coefficient for the CC, MF, and BP, respectively. Figs. 8–10 show the Spearman coefficient. The y-axis is the coefficient value; the x-axis is the respective ontological similarity measure. Each graph line shows how its
Fig. 5. Pearson correlation between measures for CC.
Fig. 6. Pearson correlation between measures for MF.
Fig. 7. Pearson correlation between measures for BP.
Fig. 8. Spearman correlation between measures for CC.
labeled ontological similarity measure correlates with each of the ontological similarity measures listed on the x-axis. For example, the line labeled R–P in Fig. 5 shows the Pearson correlation of the Resnik measure with the other measures listed on the x-axis. For both coefficients, the Resnik measure correlates best with the Lin and FaITH measures across all three subontologies.

The FaITH measure is strongly connected to the Lin measure. It simply reduces both the numerator and denominator of the Lin measure by the IC value of the ancestor with the greatest IC. The experiments verify this, showing that these two measures have a Spearman correlation of 1.0 with each other across all subontologies and approximately the same high Pearson correlation, 0.97, across the three subontologies. Across all three subontologies, the Spearman correlation of the Lin measure with each of the other three ontological similarity measures is identical to the corresponding Spearman correlation for FaITH.

Resnik correlates (Pearson) about the same with P&S and JacAnc for CC, but it correlates slightly better with P&S for both MF and BP. The Resnik measure has its worst Spearman correlation with the P&S measure across all subontologies. The correlation results for Lin parallel those for the Resnik measure across all three subontologies for both correlation coefficients. The smallest correlation between the Lin and Resnik measures is 0.988. This high correlation occurs because, when c1 and c2 are leaf nodes, their ICont values are 1 and the Resnik and Lin measures produce identical values, i.e., simRes(c1, c2) = simLin(c1, c2) = 2IC(c3)/(1 + 1) = IC(c3).
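The Spearman correlation of 1.0 between Lin and FaITH follows directly from the two ratio forms. The short derivation below is a sketch that assumes IC(c1) + IC(c2) > 0 and uses S as a shorthand introduced here for that sum.

```latex
\begin{align*}
S &= IC(c_1) + IC(c_2), \qquad
sim_{Lin} = \frac{2\,IC(c_3)}{S}
\;\Longrightarrow\; IC(c_3) = \tfrac{1}{2}\,S\,sim_{Lin}, \\
sim_{FaITH} &= \frac{IC(c_3)}{S - IC(c_3)}
= \frac{\tfrac{1}{2}\,S\,sim_{Lin}}{S - \tfrac{1}{2}\,S\,sim_{Lin}}
= \frac{sim_{Lin}}{2 - sim_{Lin}}, \\
\frac{d}{dx}\,\frac{x}{2 - x} &= \frac{2}{(2 - x)^{2}} > 0
\quad \text{for } 0 \le x \le 1 .
\end{align*}
```

FaITH is therefore a strictly increasing function of Lin alone, so both measures rank any set of concept pairs identically, including ties. That accounts for the identical Spearman results, while the non-linearity of x/(2 − x) explains why the Pearson correlation stays slightly below 1.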
Fig. 9. Spearman correlation between measures for MF.
Fig. 10. Spearman correlation between measures for BP.
Fig. 11. Pearson correlation between path-based WP and other IC ontological similarity measures.
The FaITH measure correlates (both coefficients) best with the Resnik and Lin measures for all subontologies. It has a much higher Spearman correlation with the JacAnc measure than with the P&S measure for all subontologies. Only for the Pearson coefficient and the BP does FaITH correlate slightly better with P&S than with JacAnc.

In general, Pearson correlations across all subontologies show the same patterns. P&S and JacAnc have the poorest correlation with the other three measures, though there is a flip-flop between the P&S measure and the JacAnc measure for the Lin and FaITH measures for the CC and BP. The worst correlation for P&S occurs with JacAnc.

In general, Spearman correlations across all subontologies show the same line patterns for all ontological similarity measures except the P&S measure, which clearly correlates less with all the other measures. The distinction in the correlation values for the other four decreases going from CC to MF to BP. For the BP, the lines for all measures except P&S are nearly identical. This result could be due to the increasing number of concept pairs from CC to MF to BP.

Generally, Resnik, Lin and FaITH are strongly correlated. Lin and FaITH produce identical results with respect to the Spearman coefficient. In these experiments, the lowest correlation among these three measures occurs between FaITH and Resnik for the MF, at 0.961. Fig. 11 shows the Pearson correlation and Fig. 12 the Spearman correlation of the path-based measure WP with three of the IC ontological similarity measures, Resnik, Lin and JacAnc, for each of the GO subontologies.
The two intuitive measures, P&S and FaITH, were not correlated with the WP measure. The P&S measure was found above to correlate the worst with the other IC-based measures, and FaITH has been shown to be highly correlated with the Lin measure due to its being a simple mathematical modification of Lin, i.e., the same factor is subtracted from both the numerator and denominator of the Lin measure. The two figures show that the WP measure is highly correlated with the three IC ontological similarity measures within the MF subontology, although more so for the Spearman coefficient, which also shows high rank correlation in the CC subontology. Across all subontologies, the JacAnc measure does not have as high a linear correlation with the WP as the other two measures. For the Spearman correlation with the WP, the JacAnc measure is consistent with the Resnik and Lin measures across all subontologies. The MF has the highest percentage of leaves and also has more of a single-parent structure. More investigation is needed to understand how the subontology structure influences the correlation between a path-based measure such as the WP and the IC-based ontological similarity measures. In the biomedical domain, however, IC ontological similarity measures have effectively become the standard [1,2].
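For reference, a unit-edge-weight path-based measure of the Wu-Palmer kind can be sketched as follows. This is a simplified illustration, not the implementation used in the experiments: it assumes a single-parent taxonomy, measures depth as the number of edges from the root (so a pair whose only common ancestor is the root scores 0), and uses the common form 2·depth(lcs)/(depth(c1) + depth(c2)). The taxonomy and function names are hypothetical.

```python
# Hypothetical single-parent toy taxonomy: concept -> parent (root has None).
PARENT = {
    "root": None,
    "a": "root",
    "b": "root",
    "a1": "a",
    "a2": "a",
    "b1": "b",
}


def path_to_root(concept):
    """Concepts from `concept` up to the root, inclusive."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Number of unit-weight edges between `concept` and the root."""
    return len(path_to_root(concept)) - 1


def wu_palmer(c1, c2):
    """2 * depth(lcs) / (depth(c1) + depth(c2)) with unit edge weights."""
    ancestors_c1 = set(path_to_root(c1))
    # First concept on c2's path to the root that is also on c1's path.
    lcs = next(c for c in path_to_root(c2) if c in ancestors_c1)
    total = depth(c1) + depth(c2)
    return 1.0 if total == 0 else 2.0 * depth(lcs) / total


print(wu_palmer("a1", "a2"))  # siblings under "a"
print(wu_palmer("a1", "b1"))  # only common ancestor is the root
```

In a multi-parent ontology such as the CC or BP, depth and the choice of common ancestor are no longer unique, so an implementation has to fix a convention (for example, the deepest common ancestor); that choice is outside the scope of this sketch.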
Fig. 12. Spearman correlation between path-based WP and other IC measures.

Table 1
Mean similarity value in the mouse and human anatomy ontologies.

      R      L      J      W
MA    0.014  0.015  0.007  0.018
HA    0.071  0.074  0.025  0.315
4.2. Experiments with the NCIT human anatomy and the mouse anatomy

Ontology alignment takes as input a source ontology Os and finds for each of its entities es (concept, relation, or instance) an entity et in the target ontology Ot that is closely related to it or has the same meaning [27]. The OAEI conducts yearly competitions with the most up-to-date ontology alignment (OA) systems. These systems and their algorithms are evaluated using the same set of test cases so that performance comparisons can be made by those interested in using these OA systems and developers can improve their OA systems based on these evaluations. The use of the MA and HA as two real-world ontologies for the OAEI competition recommends their use in these experiments with ontological similarity measures. The standard method for the evaluation of OA results requires a gold standard reference alignment, which has typically been created by a set of human domain experts. It is considered to be a correct and complete set of mappings between the two ontologies.

In the experiments with the GO subontologies, random pairs of concepts within a subontology were selected for which to measure ontological similarity. In the experiments with the MA and HA ontologies, the reference alignment between the two ontologies was used to determine the concepts selected in each ontology. The reference alignment has 1520 mappings between MA concepts and HA concepts. Each of the 1520 mapped concepts in the MA and in the HA ontologies was selected to create the (1520*1519)/2 or 1154440 pairs of concepts for which ontological similarity is determined.

The mean values for each of the four selected ontological similarity measures, R-Resnik, L-Lin, J-JacAnc, and W-Wu-Palmer, for both the MA and HA ontologies are shown in Table 1. For the MA there is little difference among three of the averages; the exception is the JacAnc measure, which has a very small average. In the HA, however, the Wu-Palmer measure has a much greater similarity average.
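The pair counts reported for the two sets of experiments can be reproduced with a few lines of arithmetic. In this sketch the GO counts from Section 4.1 match n(n+1)/2, which suggests that each selected concept was also paired with itself, whereas the anatomy count is n(n−1)/2; the self-pair reading is an inference from the numbers rather than something stated in the text. The function names are hypothetical.

```python
def pairs_with_self(n):
    """Unordered pairs of n concepts, including each concept with itself."""
    return n * (n + 1) // 2


def pairs_without_self(n):
    """Unordered pairs of n distinct concepts."""
    return n * (n - 1) // 2


# GO samples: 100 (CC), 500 (MF), 880 (BP) concepts.
print([pairs_with_self(n) for n in (100, 500, 880)])
# Anatomy experiments: 1520 mapped concepts per ontology.
print(pairs_without_self(1520))
```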
The order of the mean values is the same for the MA and HA ontologies: J, R, L, W. The mean similarity is higher in the HA than in the MA. This result can be attributed to the structural difference between the two ontologies. The MA is much broader than the HA, with 1057 branches at level 1 from the root compared to only seven branches for the HA. The small number of level 1 concepts in the HA leads to a high probability that two concepts being measured are located within the same sub-tree. Thus, any similarity measure which includes a common ancestor which is not the root generates a nonzero similarity value.

Fig. 13 shows the Pearson correlation between the selected ontological similarity measures, R-Resnik, L-Lin, J-JacAnc, and W-Wu-Palmer, for the MA ontology. All Pearson correlations are above 0.85. The Resnik and Lin measures follow the same pattern of correlation with the JacAnc and the Wu-Palmer measures. They correlate the worst with the JacAnc measure. The Spearman correlation for the MA is not graphically shown because rounding to three decimal places produces a 1.0 for the Spearman correlation among the four measures. Fig. 14 shows the Pearson correlations for the HA ontology. All Pearson correlations are above 0.80. Again the Resnik and Lin measures follow the same pattern of correlation with the JacAnc and the Wu-Palmer measures for the HA, but unlike in the MA, they correlate worst with the WP.

Five different ontologies have been used in various experiments to measure ontological similarity between concepts. Fig. 15 shows the mean values calculated for each selected measure across all the ontologies. Certain patterns can be seen. For example, the greatest values for all four measures are produced in the CC ontology. The smallest values for all four measures are produced in the MA ontology. Variations then exist across the measures in the ordering of values for the remaining three ontologies.
Fig. 13. Pearson correlation between measures for MA.
Fig. 14. Pearson correlation between measures for HA.
Fig. 15. Mean values across five ontologies using Lin, Resnik, JacAnc and WP.
For example, the WP and JacAnc measures have the following descending order for the remaining three ontologies: HA, MF, BP, while the Resnik and Lin measures have the following descending order: BP, HA, MF. The rank in descending order of the ontological similarity measures is consistent across all five ontologies: WP, Lin, Res, and JacAnc. The Res and Lin measures produce very comparable values within each of the five ontologies.

Fig. 16 shows the Pearson correlation between the pairs of ontological similarity measures across all five ontologies. The correlation for the Lin and Resnik (L-R) pair is always the strongest across all five ontologies, while the correlation for the JacAnc and WP (J-W) pair is the weakest in all ontologies except the MA. The J-W pair has its highest correlations in the anatomy ontologies. The WP also has higher correlations with the Resnik and Lin measures in the anatomy ontologies than in the GO subontologies. In general, the MA and HA have high correlations for all pairs of the four similarity measures, all greater than 0.8. The MF follows a similar pattern to the MA and HA except for the J-W pair, which has a value slightly under 0.7. In general the BP has the lower correlations across all pairs of measures. The CC has lower correlations for pairs containing the Wu-Palmer measure. More work is needed to understand the possible causes of these patterns as they relate to the structure of the ontologies and the weaknesses and strengths of the various measures.
Fig. 16. Pearson correlation for pairs of measures for the five ontologies.
5. Conclusions and future work

The motivation of the study is to establish a general fuzzy set-theoretic framework of IC ontological similarity that has as its basis Tversky’s similarity models, fuzzy set compatibility measures, and information content (IC), and to examine other recently proposed measures that use an intuitive IC version of Tversky’s model [4,5]. Historical ontological similarity measures are shown to be examples of fuzzy set compatibility measures when a concept is described by a fuzzy set of elements and their membership degrees are determined as a function of IC.

The empirical study uses two historical IC measures, Resnik [19] and Lin [12], two recently proposed measures, P&S [4] and FaITH [5], and JacAnc, a measure presented here and derived from the general fuzzy set-theoretic framework for IC ontological similarity. One standard path-based measure, Wu-Palmer (WP) [23], is used for comparison purposes in the empirical studies. First the difference between corpus-based information content ICcorpus and ontology-based information content ICont is examined using the three Gene Ontology (GO) subontologies: the cellular component (CC), the molecular function (MF) and the biological process (BP). The results agree with previous studies showing a strong correlation between the two forms of IC measure. The three GO subontologies are used to empirically compare the five ICont ontological similarity measures and then the WP measure. Then two other ontologies, the NCIT Human Anatomy (HA) and the Mouse Anatomy (MA), are also used to investigate the relationships between the ontological similarity measures. The comparisons between the ontological similarity measures do not use a gold standard but instead use the correlation between their values on sets of randomly selected concepts from each of the three GO subontologies and on sets of concepts based on a reference alignment between the HA and MA ontologies.
The FaITH and Lin measures are shown to have a mathematical relationship, validated by their very high Pearson correlation and identical Spearman correlation values. The Resnik and Lin measures are strongly correlated. This result can be partly explained by the equivalence of the two measures when the similarity between leaf concepts is being determined. The JacAnc measure shows overall the strongest correlation between its ICont and ICcorpus versions over all subontologies, although each measure shows strong correlation between its two IC versions. It appears for JacAnc that incorporating more information to represent a concept, i.e., a concept’s set of ancestors, mitigates the difference in using ICont in place of ICcorpus. The correlation between the ICont and ICcorpus versions of P&S is the smallest. Across all five ontologies the Wu-Palmer measure produced the highest values and the JacAnc measure produced the smallest values. The rank of the measures in descending order was consistent across all five ontologies: Wu-Palmer, Lin, Resnik, and JacAnc. Current research is exploring the use of several of these ontological similarity measures in various tasks, for example, ontology alignment, in order to further compare their performance [28].
References

[1] V. Cross, Ontological Similarity, in: M. Popescu, D. Xu (Eds.), Data Mining in Biomedicine Using Ontologies, Artech House, Norwood, MA, 2009, pp. 23–43.
[2] C. Pesquita, D. Faria, A.O. Falcão, P. Lord, F.M. Couto, Semantic Similarity in Biomedical Ontologies, PLoS Comput Biol 5 (7) (2009) e1000443, doi:10.1371/journal.pcbi.1000443.
[3] A. Tversky, Features of Similarity, Psychological Review 84 (1977) 327–352.
[4] G. Pirrò, N. Seco, Design, Implementation and evaluation of a new semantic similarity metric combining features and intrinsic information content, in: OTM Conferences, 2008, pp. 1271–1288.
[5] G. Pirrò, J. Euzenat, A feature and information theoretic framework for semantic similarity and relatedness, in: Proceedings of International Semantic Web Conference, vol. 1, 2010, pp. 615–630.
[6] V. Cross, Tversky’s parameterized similarity ratio model: a basis for semantic relatedness, in: Proceedings of the 2006 Conference of North American Fuzzy Information Processing Society (NAFIPS), June 3–6, Montreal, Canada, 2006.
[7] L. Cazzanti, M.R. Gupta, Information-theoretic and Set-theoretic similarity, in: Proceedings of IEEE International Symposium on Information Theory, 2006.
[8] A. Budanitsky, G. Hirst, Semantic distance in WordNet: an experimental, application-oriented evaluation of five measures, in: Workshop on WordNet and Other Lexical Resources, Second meeting of the NAACL, Pittsburgh, 2001.
[9] The Gene Ontology Consortium.
[10] P. Lord, R. Stevens, A. Brass, C. Goble, Investigating semantic similarity measures across the Gene Ontology: the relationship between sequence and annotation, Bioinformatics 19 (2003) 1275–1283.
[11] M. Wei, An Analysis of Word Relatedness Correlation Measures, Master’s thesis, University of Western Ontario, London, Ontario, May 1993.
[12] D. Lin, An information-theoretic definition of similarity, in: Proceedings of the 15th International Conference on Machine Learning, Morgan Kaufmann, San Francisco, CA, 1998, pp. 296–304.
[13] V. Cross, X. Yu, Investigating ontological similarity theoretically with fuzzy set theory, information content, and Tversky similarity and empirically with the gene ontology, in: Proceedings of the Fifth International Conference on Scalable Uncertainty Management (SUM 2011), Dayton, OH, 2011.
[14] F. Attneave, Dimensions of Similarity, American J. of Psychology 63 (1950) 516–556.
[15] N. Goodman, Seven strictures on similarity, in: N. Goodman (Ed.), Problems and projects, Bobbs-Merrill, New York, 1972, pp. 437–447.
[16] D.L. Medin, R.L. Goldstone, D. Gentner, Respects for Similarity, Psychological Review 100 (2) (1993) 254–278.
[17] V. Cross, An Analysis of Fuzzy Set Aggregators and Compatibility Measures, Ph.D. Dissertation, Computer Science and Engineering, Wright State University, Dayton, OH, March 1993, 264 pages.
[18] V. Cross, T. Sudkamp, Similarity and Compatibility in Fuzzy Set Theory: Assessment and Applications, Physica-Verlag, New York, 2002, ISBN 3-7908-1458.
[19] P. Resnik, Using information content to evaluate semantic similarity in taxonomy, in: Proceedings of the 14th International Joint Conference on Artificial Intelligence, 1995, pp. 448–453.
[20] N. Seco, T. Veale, J. Hayes, An intrinsic information content metric for semantic similarity in WordNet, in: ECAI, 2004, pp. 1089–1090.
[21] M.A. Rodriguez, M.J. Egenhofer, Determining Semantic Similarity among Entity Classes from Different Ontologies, IEEE Transactions on Knowledge and Data Engineering 15 (2) (2003) 442–456.
[22] J. Jiang, D. Conrath, Semantic similarity based on corpus statistics and lexical taxonomy, in: Proceedings of the 10th International Conference on Research, 1997.
[23] Z. Wu, M. Palmer, Verb semantics and lexical selection, in: Proceedings of the 32nd Annual Meeting of the Association for Computational Ling, Las Cruces, NM, 1994, pp. 133–138.
[24] X. Yu, A Mathematical and Experimental Investigation of Ontological Similarity Measures and their Use in Biomedical Domains, Master’s Thesis, Computer Science and Software Engineering, Miami University, Oxford, OH, 2010.
[25] V. Cross, Y. Sun, Semantic, Fuzzy Set and Fuzzy Measure Similarity for the Gene Ontology, in: Proceedings of the IEEE International Conference on Fuzzy Systems, Imperial College, London, 2007.
[26] J. Euzenat, et al., The Results of the Ontology Alignment Evaluation Initiative 2010, Ontology Matching Workshop, International Semantic Web Conference, Shanghai, 2010.
[27] M. Ehrig, Ontology Alignment: Bridging the Semantic Gap, Springer Science+Business Media, LLC, 2007.
[28] V. Cross, X. Hu, Using Semantic Similarity in Ontology Alignment, Ontology Matching Workshop, International Semantic Web Conference, Bonn, Germany, 2011.