Guideline based evaluation and verbalization of OWL class and property labels

Data & Knowledge Engineering 69 (2010) 331–342 Contents lists available at ScienceDirect Data & Knowledge Engineering journal homepage: www.elsevier...


Data & Knowledge Engineering 69 (2010) 331–342

Contents lists available at ScienceDirect

Data & Knowledge Engineering journal homepage: www.elsevier.com/locate/datak

Günther Fliedl *, Christian Kop, Jürgen Vöhringer
Institute of Applied Informatics, Alpen-Adria-Universität Klagenfurt, Universitätsstrasse 65-67, 9020 Klagenfurt, Austria

Article info

Article history: Available online 23 August 2009
Keywords: Ontology engineering; OWL class labels; Ontology verbalization; Linguistic guidelines

Abstract

The ontology language OWL has become increasingly important in recent years. However, due to their uncontrolled growth, OWL ontologies are in many cases very heterogeneous with respect to their class and property labels, which often lack a common and systematic view. For this reason we developed linguistically based guidelines for OWL class and property labels, focusing on their implicit structure. Building on these guidelines we propose an evaluation mechanism that includes rules for comparing the linguistically triggered label interpretations with their OWL-internal representations. Our proposal also includes the verbalization of these evaluated OWL labels. © 2009 Elsevier B.V. All rights reserved.

1. Introduction

The importance of ontologies has grown significantly in recent years. The Web Ontology Language (OWL) [11] is a W3C recommendation for the semantic web and is now very commonly used for representing knowledge provided by domain experts in enterprises as well as research groups. OWL ontologies are based on RDF Schema [15] and provide a specific XML representation [16] of the classes and hierarchies of a domain. Building on XML and RDF, OWL offers an extended structure and formal semantics for storing expert knowledge of a domain by describing classes, properties, relations, cardinalities, etc. One big advantage of using a "formal" ontology language like OWL for describing domain vocabularies is that the resulting ontologies can be interpreted automatically by machines, which makes further knowledge processing easier.

Because of their uncontrolled growth, OWL ontologies have become very heterogeneous and therefore hard to integrate from a generic viewpoint. In particular, OWL ontologies are very hard to understand for human readers who have to decide whether an ontology fulfills their expectations. Thus specific OWL ontologies are commonly criticized for being difficult to reuse, to transform to other domains or languages and to integrate with other ontologies. Therefore approaches like Attempto [5] or Swoop [6] try to represent these ontologies in a human-readable and understandable format. Based on the specific semantics of OWL tags (e.g. subClassOf, allValuesFrom, unionOf, etc.), equivalent natural language patterns can be provided to the reader. However, the OWL modeling elements Class, ObjectProperty, Individual and DatatypeProperty have an additional important feature which is necessary for understanding them: they must be labeled by the OWL designers themselves. In doing so, an OWL designer has the freedom to name these modeling elements as he or she likes.

However, this freedom can become a "nightmare" for further processing steps, including verbalization and interpretation. Starting from such individually constructed labels it is hard to reconstruct the original intention of the designer. Therefore this kind of problem is still a field of ongoing research. Of course it is always possible to additionally specify the right label within a specific tag taken from RDFS, e.g. "Consumable thing" for an OWL class ConsumableThing. Although this is provided for by the OWL specification, such additional comments can hardly be found in actual OWL ontologies. The reason is simple: people do not like to make additional comments. How can such an uncommented class be verbalized in natural language?

For this reason we (the NIBA workgroup, see footnote 1) decided to propose linguistically motivated OWL label structures by providing a list of naming guidelines. We investigated some of the underlying basic default patterns with respect to their usefulness for the development of labeling guidelines. Furthermore we evaluate whether the linguistic label patterns in use correlate correctly with the OWL-internal representation. Finally we verbalize OWL labels according to our guidelines using grammar rules. This step improves the explicit readability within and beyond OWL classes, which in turn enhances usability.

The paper is structured in the following way: in Section 2 we give a short overview of the OWL concepts that are relevant for our argumentation. We also describe some problems concerning the diversity of class and property labeling methods used by the OWL community. In Section 3 we discuss existing verbalization strategies for ontologies, i.e. Attempto [5] and Swoop [6]. In Section 4 we describe our naming guidelines for OWL class and property labels and present our four-step evaluation method for examining the correctness of OWL label coding. In Section 5 we discuss the NIBA approach to OWL verbalization, i.e. the filtering of linguistic patterns, the development of labeling guidelines and the creation of Prolog-interpretable DCG (Definite Clause Grammar) rules for generating NL sentences that encode OWL concepts. Our paper concludes with an outlook on future work in Section 6.

* Corresponding author. E-mail address: guenther.ﬂ[email protected] (G. Fliedl). 0169-023X/$ - see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.datak.2009.08.004

2. OWL concepts relevant for labeling

Because OWL is a W3C recommendation for the semantic web, it has gained major importance in recent years. OWL is application-oriented, i.e. it was developed mainly for the automatic processing of domain knowledge rather than for preparing content for humans. As mentioned above, OWL can be seen as an extension of RDF Schema using classes, properties and instances for application environments on the web. The OWL extension of RDF allows the specification of restrictions like property or cardinality constraints. Since property labels are frequently based on class or subclass label names, we briefly discuss the class labeling problem first. Our main focus, however, is the specification of OWL property labels and the related problems.

2.1. The OWL way of defining classes and individuals

Many default concepts in a given domain should correspond to classes, functioning as roots. Each OWL individual is automatically a member of the class owl:Thing and each user-defined class is implicitly a subclass of owl:Thing. Domain-specific root classes are defined by simply declaring a named class. In the well-known wine example ontology (see footnote 2) the root classes are Winery, Region and ConsumableThing, each of which is defined by such a named-class declaration.

As these class labels show, class names in the wine ontology are rather simple and straightforward. The only potentially problematic label in these examples is ConsumableThing, since it consists of two words that have been merged using upper case as a delimiter strategy. However, since no guidelines are available for creating these labels, other ontologies contain labels constructed with very different naming and delimiter strategies, which leads to an uncontrolled growth of labeling patterns, as can be seen in Table 1 [17].

The members of classes are called individuals. In the wine ontology a specific wine grape like CabernetSauvignonGrape is an individual of the class WineGrape, and this membership is stated explicitly in the ontology. Likewise CentralCoastRegion is a specific member of the class Region and therefore an individual.

1 NIBA (German acronym for Natural Language Information Requirements Analysis) is a long-term research project sponsored by the Klaus Tschira Foundation in Heidelberg, Germany, dealing with the extraction of conceptual models from natural language requirements texts [3,4,10].
2 These examples were taken from http://www.w3.org/2001/sw/WebOnt/guide-src/wine.owl and http://www.w3.org/2001/sw/WebOnt/guide-src/food.owl, respectively.


Table 1
Typical class labels.

ActionType
Activate
Activated-carbon-equipment
Activated_p21cdc42Hs_Kinase
Acute_Myeloid_Leukemia_in_Remission
ADM-DIV-BARBADOS-PARISH
AdministrativeStaffPerson
Glycogen-Rich-Carcinoma

The same membership information can also be defined in an alternative syntactic form. Obviously the same labeling strategies are available for classes and individuals. Judging by the examples on the OWL standard web page, arbitrarily merged multi-word terms with upper-case delimiters are preferred. The internal semantics of the terms and sub-terms is not discussed any further.
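The variety of delimiter strategies in Table 1 can be made concrete with a small sketch. The following Python function is not part of the original paper; the name split_label and the regular expressions are assumptions chosen for illustration. It splits a label into its terms regardless of whether hyphens, underscores or upper-case letters were used as delimiters.

```python
import re


def split_label(label: str) -> list[str]:
    """Split an ontology label into lower-cased terms (illustrative helper)."""
    terms = []
    # First cut at explicit delimiters: hyphens, underscores, whitespace.
    for chunk in re.split(r"[-_\s]+", label):
        # Then cut at case boundaries: acronym runs, (capitalized) words, digit runs.
        terms.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", chunk))
    return [term.lower() for term in terms]


if __name__ == "__main__":
    for example in ("ConsumableThing", "Activated-carbon-equipment",
                    "ADM-DIV-BARBADOS-PARISH"):
        print(example, "->", split_label(example))
```

Such a splitter only recovers the surface terms. It says nothing about how the terms relate to each other, which is the problem the guidelines in Section 4 address.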

2.2. The OWL way of defining relational object properties

In OWL, classes and individuals are extended via property definitions (see the OWL definition [11]). A property definition can include the definition of a domain and a range, which can be seen as restrictions. The concepts ObjectProperty, rdfs:domain and rdfs:range are used for defining properties of objects, which can be described as relations between classes. For the definition of object properties we again make use of the wine ontology, in which the property madeFromGrape is declared together with its domain and its range. The three parts of this declaration are related to each other by an implicit conjunction operator: "The object property madeFromGrape has the domain Wine and the range WineGrape". By referring to the property definition within the class definition, one can expand the definition of Wine, e.g. by including the cardinality information that any wine is made from at least one grape (WineGrape). As in property definitions, class definitions include multiple subparts that are implicitly conjoined.

Table 2
Typical property labels.

BaseOnBalls
BasePublicationURI
Behind_Generality
Concessive-RST
HasProduct
HasDiameter_of_size


Table 2 contains part of a list of property labels that are provided by the community [17]. As can be seen, the main problem of the listed property labels is that their internal structure does not allow any kind of systematic interpretation of the relation between the sub-terms and their function.

3. Related work

Since "formal" ontology representations of knowledge are not easy for humans to interpret and trace, certain methods for the verbalization of ontologies like OWL have been developed. In the following we very briefly describe two approaches to OWL verbalization and their shortcomings, which motivate our own guideline based approach described in Sections 4 and 5.

To our knowledge two approaches for verbalizing OWL ontologies currently exist: Attempto Controlled English (ACE) [5] and Swoop [6]. Furthermore there are many software systems which present ontologies graphically (e.g. [2,6,7,12,14]) instead of verbalizing them.

ACE is a subset of the English language. It uses reduced (controlled) patterns for "verbalizing" OWL ontologies. These translations are more easily interpretable by human readers and can always be translated back to the ontology representation. The Attempto approach uses its own grammar rules for constructing simple sentences (see footnote 3) which do not allow ambiguities, sentence gaps or any sort of fuzziness, as the verbalization procedure below shows. A property definition from the OWL wine ontology, used with a minimum cardinality of 1 between the domain "Wine" and the range "WineGrape", results in the following ACE translation:

Every Wine madeFromGrapes at least 1 things.

Swoop on the other hand is an ontology engineering toolkit which implements an algorithm for translating OWL ontologies to NL patterns by using some standard NL techniques. This can be seen as an extension to ACE for the labeling problem. Swoop uses general linguistic category symbols like V, NP, VP, etc. for a shallow analysis of OWL labels and it proposes a fixed set of expansion rules for the linguistic patterns. See for example [6]:

(has) NP
– Examples: email, hasColor
– Expansions: X has a color Y
– Alternate (if Y is an AdjP): X has Y color

The Swoop engine uses a part-of-speech tagger for automatically detecting the linguistic categories of words and generating corresponding NL sentences. This strategy does not sufficiently solve the ambiguity problem. Therefore the authors of Swoop propose simple disambiguation strategies like giving priority to verbal forms, which presuppose ontology guidelines that do not currently exist.

Verbalization is also a topic in conceptual modeling. In [18] an automated verbalization approach for ORM is described. Here, too, the authors focus on the syntactic and semantic features of the model and on how to construct semantically equivalent natural language constructs. To our knowledge this approach is also based on the assumption that the designer already defines a notion (e.g. an object role) in a natural language way (i.e. "was born in" instead of bornIn or wasBorn, etc.).

3 See the Attempto OWL verbalizer web interface at http://attempto.ifi.unizh.ch/site/docs/verbalizing_owl_in_controlled_english.html.
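To illustrate what an expansion rule of the "(has) NP" kind amounts to, here is a minimal Python sketch. It is not Swoop's implementation; the function name expand_has_np, the flag y_is_adjp and the naive article choice are assumptions made for this example only.

```python
import re


def expand_has_np(label: str, y_is_adjp: bool = False) -> str:
    """Expand a '(has) NP' label into a sentence frame over X and Y (sketch)."""
    terms = [t.lower() for t in re.findall(r"[A-Z]?[a-z]+", label)]
    # The leading "has" is optional in the pattern, so drop it if present.
    if terms[:1] == ["has"]:
        terms = terms[1:]
    noun_phrase = " ".join(terms)
    if y_is_adjp:
        # Alternate expansion when the value Y is an adjective phrase.
        return f"X has Y {noun_phrase}"
    # Naive article selection; a real system would need a pronunciation-aware rule.
    article = "an" if noun_phrase[:1] in {"a", "e", "i", "o", "u"} else "a"
    return f"X has {article} {noun_phrase} Y"


if __name__ == "__main__":
    print(expand_has_np("hasColor"))
    print(expand_has_np("hasColor", y_is_adjp=True))
    print(expand_has_np("email"))
```

The sketch shows why such rules depend on well-formed labels: the expansion is only meaningful if the label really has the assumed "(has) NP" structure.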


Hence we conclude: all existing verbalization approaches focus on the very important task of (re)constructing good, equivalent natural language verbalizations from semantically well-defined modeling constructs. However, they depend on the willingness of the designer to label certain notions (e.g. objects, object roles, etc.) correctly. In other words, they presuppose that the user-defined labels are well-formed. In our own approach we specifically deal with the structure of these user-defined labels. In particular we concentrate on OWL labels. We establish guidelines for such labels, taking into account that they are the basis for any kind of verbalization, and we show how to evaluate and verbalize them. In the following sections we describe our approach in detail.

4. A linguistic way of solving the OWL labeling problem

As can be seen in the previous sections, no consistent labeling strategies exist for encoding OWL class and object property labels. This makes the manual and automated interpretation and further processing of these labels difficult. Our aim is to provide naming guidelines for label generation in order to define a framework for systematic ontology engineering. After proposing linguistically motivated naming guidelines which could help to optimize the ontology creation process, we identified relevant linguistic patterns. In our opinion these patterns will help to facilitate further processing steps.

4.1. Naming guidelines for OWL labels

We discovered that usable OWL labels are mostly created by implicitly following the style guides of programming languages. In computer science, programming languages and models have an unambiguous syntax, and additional explicit guidelines for using them are common. Adhering to these guidelines leads to models and programs that can be interpreted much more easily. As examples of such guidelines see for instance [1,8,9,13], which use the Pascal and Camel notations for classes and methods, i.e. the first letter of each word in class and method names is upper case, while the rest of the letters are lower case. Upper-case letters are thus used as delimiters in class and method names.

We claim that the definition and use of naming guidelines should be extended to ontology engineering. Our special concern is the linguistically motivated setting of labels, which is not restricted at all in OWL. Table 3 lists general guidelines for defining OWL class labels, individual labels and object property labels that should always be followed. Table 4 lists additional guidelines that provide a mapping from a label name to the different label types in OWL. We also give examples for each guideline. The guidelines are based on the idea that general linguistic concepts should be used extensively when creating class and object property labels. These guidelines support the interpretation of the labels and also help to optimize the further processing steps. The listed guidelines are mostly self-explanatory; furthermore we presuppose that the linguistic categories used (verb, noun, participle, etc.) are commonly known, so we do not explain them in detail here.

All of the guidelines aim to facilitate the interpretation of OWL class and property labels. Guideline A2, for instance, is necessary since many different delimiters are available between the terms of a compound concept name. Possible delimiters are spaces, underscores, special characters, etc. Regarding delimiters we follow the example of other style guidelines, like the naming conventions from the Java Code Conventions [13], where upper-case characters are used as delimiters between terms: in Java, Pascal notation is proposed for class and interface names and Camel notation is proposed for variable and method names. In [13] it is also proposed that abbreviations and acronyms should be avoided, unless the acronym or abbreviation is more common than the full word (e.g. HTML). These proposals are implemented as guidelines A3 and A4 in our list.

Our guidelines in Table 4 are specific to ontology labels and propose default linguistic structures for class, property and individual label names. Guideline B5 describes labels starting with participles. When participles occur in property labels, they encode the passive voice and are thus verbalized as passive constructions by default. The reason is that ontologies usually represent facts as they are and not as they were in the past.

Table 3
Normative naming guidelines for OWL labels.

No. | Guideline | Example
A1 | All labels should be written in English. | producesWine
A2 | If a label consists of more than one term, a definite delimiter between the terms must be used. Here we follow the guideline that an upper-case character works as delimiter (Pascal notation or Camel notation). | VintageYear, hasIntrinsicPattern
A3 | Abbreviations must be expanded (e.g. instead of No. → Number, instead of org → Organization). | calculateNumber, Organization
A4 | Acronyms in a label should be written like normal nouns starting with an upper-case letter. | FpsSeason, hasHtmlSource, statusGui
A5 | Singular forms should be used for nouns. | Winery, PizzaTopping, hasBrightColor
A6 | Property labels should always start with lower case. | madeFromGrape, ownsCar
A7 | Class and individual labels should always start with upper case. | PizzaTopping, Loire, WhiteLoire
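Some of the guidelines in Table 3 can be checked purely on the character level. The following Python sketch covers only A2, A4, A6 and A7; A1, A3 and A5 would need lexical resources and are left out. The function name, the kind argument and the acronym heuristic are assumptions for illustration, not the authors' implementation.

```python
import re


def check_formal_guidelines(label: str, kind: str) -> list[str]:
    """Return the ids of violated guidelines.

    kind is 'class', 'individual' or 'property' (hypothetical parameter).
    """
    violated = []
    # A2: terms must be delimited by upper-case letters only.
    if re.search(r"[^A-Za-z0-9]", label):
        violated.append("A2")
    # A4 (heuristic): two adjacent capitals suggest an acronym not written like a noun.
    if re.search(r"[A-Z]{2}", label):
        violated.append("A4")
    # A6: property labels start with lower case.
    if kind == "property" and not label[:1].islower():
        violated.append("A6")
    # A7: class and individual labels start with upper case.
    if kind in ("class", "individual") and not label[:1].isupper():
        violated.append("A7")
    return violated


if __name__ == "__main__":
    print(check_formal_guidelines("madeFromGrape", "property"))
    print(check_formal_guidelines("BasePublicationURI", "property"))
    print(check_formal_guidelines("ADM-DIV-BARBADOS-PARISH", "class"))
```

The A4 heuristic can produce false positives, for example when a one-letter word is followed by a capitalized term, so its result should be read as a hint rather than a verdict.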


Table 4
Mapping guidelines for labels to OWL types.

No. | Guideline | Example | Mapping to (C)lass / (I)ndividual / (P)roperty
B1 | If labels are specified by either (compound) nouns, [adjective] + noun, URLs, acronyms, or (compound) proper names, they are mapped to classes or individuals. | Grape, WineGrape, http://zorlando.drc.com/daml/ontology/Glossary/current/intensionalDefinition, CabernetSauvignonGrape, Html | C, I
B2 | If labels start with "has" and have the form "has" [+Adjective] + Noun, they are mapped to properties. | hasBrightColor | P
B3 | If labels start with "is" and have the form "is" [+Adjective | Participle | Noun] [+Preposition] [+Noun], they are mapped to properties. | isLocatedIn, isBrotherOfPerson | P
B4 | If labels start with a verb in 3rd person and have the form Verb [+Preposition] + Noun [+Preposition], they are mapped to properties. | ownsCar, producesWine, sendsTo, receivesFrom, sendsToRecipient, sendsLetterTo | P
B5 | If labels start with a participle and have the form Participle + "From" | "By" [+Noun], they are mapped to properties. | madeFromGrape | P
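The mapping guidelines of Table 4 can be read as patterns over sequences of linguistic categories. The Python sketch below encodes them as regular expressions over tag strings. It is a toy under stated assumptions: the lexicon is hand-made and tiny (unknown words default to nouns), the tag names are invented here, and the noun in B4 is treated as optional so that the paper's examples sendsTo and receivesFrom are covered.

```python
import re

# Toy lexicon (assumption); a real system would use a part-of-speech tagger.
LEXICON = {
    "has": "HAS", "is": "IS",
    "owns": "VERB3", "produces": "VERB3", "sends": "VERB3", "receives": "VERB3",
    "made": "PART", "located": "PART",
    "from": "AGPREP", "by": "AGPREP",          # prepositions allowed in B5
    "in": "PREP", "of": "PREP", "to": "PREP",
    "bright": "ADJ", "adjacent": "ADJ", "white": "ADJ",
}

# (guideline, pattern over space-separated tags, expected OWL type)
PATTERNS = [
    ("B2", r"HAS( ADJ)? NOUN", "P"),
    ("B3", r"IS( ADJ| PART| NOUN)?( (AG)?PREP)?( NOUN)?", "P"),
    ("B4", r"VERB3( (AG)?PREP)?( NOUN)?( (AG)?PREP)?", "P"),
    ("B5", r"PART AGPREP( NOUN)?", "P"),
    ("B1", r"(ADJ )*NOUN( NOUN)*", "C/I"),
]


def tag_label(label):
    """Assign a category to each upper-case-delimited term of the label."""
    terms = re.findall(r"[A-Z]?[a-z]+", label)
    return [LEXICON.get(term.lower(), "NOUN") for term in terms]


def expected_type(label):
    """Return (guideline, expected type) or None if no pattern applies."""
    tags = " ".join(tag_label(label))
    for guideline, pattern, owl_type in PATTERNS:
        if re.fullmatch(pattern, tags):
            return guideline, owl_type
    return None


if __name__ == "__main__":
    for example in ("isLocatedIn", "madeFromGrape", "WineGrape", "course"):
        print(example, "->", expected_type(example))
```

Comparing the returned type with the element type actually declared in the ontology is then a simple equality check; a bare noun such as course declared as an object property is reported as expecting a class or individual.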

4.2. Levels of linguistically based OWL label evaluation

Presuming the plausibility of the naming guidelines listed above, we propose a four-level mechanism for incrementally evaluating labels:

(1) ensuring that the labels adhere to guidelines A1–A7,
(2) assigning linguistic categories to the label segments used,
(3) mapping the label segments used to possible label types,
(4) checking the mapping result against the respective OWL concept type (Class, Object Property).

Table 5 shows the evaluation results of OWL labels according to step 1 of our method. Labels 1–9 conform to our guidelines and therefore pass the first filter of the evaluation process. The evaluation of these labels continues as can be seen in Table 6. Labels 10–14, however, fail our passing criteria.

The labels in Table 6 are taken from the wine and food OWL example ontologies. As can be seen, most of the class, individual and property labels are compatible with our guidelines, except for examples 7, 8 and 9. For the datatype property label "yearValue" in example 7 we notice a contradiction, because compound nouns are mapped to classes and individuals according to our framework. Likewise example 8 shows a property label that is in conflict with our guidelines, because it consists of the isolated noun "course", which according to guideline B1 would indicate a class or individual label. The label "adjacentRegion" from example 9 consists of an adjective followed by a noun, and this linguistic pattern also maps to classes or individuals according to guideline B1. The mapping of all three labels results in a contradictory situation: the linguistically expected label type does not match the label type which is actually represented in OWL.

Our labeling guidelines B2–B5 require that a property label start with "has", "is", a verb in 3rd person singular or a participle. Therefore the object property "course" should for instance be labeled either as "hasCourse" or "isCourse". Based on additional OWL knowledge about the domain and the range of the object property (domain: "Meal", range: "MealCourse") the label can be expanded to "hasMealCourse" and "isCourseOfMealCourse", respectively. Hence it is possible to generate two label interpretations from the property structure, from which two syntactically well-formed sentences can be derived. The result can be seen in Table 7.

However, the pattern "isCourseOfMealCourse" is not a valid semantic possibility, since "course" is not a relational noun (compare for instance with the noun "brother" from the example "isBrotherOfPerson" in guideline B3 in Table 4 or with the noun "father (of)"). We propose to change the label name to "hasMealCourse" in this case. Therefore we establish the following rule: a non-relational noun X prohibits the use of "is X of" in property label names. Thus, if a noun does not imply the preposition "of" in its context, e.g. "course" or "region", then it should be combined with "has" in a property label. Following this rule, an automatic verbalization is possible even in cases where the linguistically expected label type does not match the internal OWL label type, as will be shown in the next section.


Table 5
Exemplary evaluation of OWL labels.

No. | OWL example | Evaluation step 1: ensure that guidelines A1–A7 are fulfilled
1 | ... | Guidelines A1–A7 fulfilled
2 | ... | Guidelines A1–A7 fulfilled
3 | ... | Guidelines A1–A7 fulfilled
4 | ... | Guidelines A1–A7 fulfilled
5 | ... | Guidelines A1–A7 fulfilled
6 | ... | Guidelines A1–A7 fulfilled
7 | ... | Guidelines A1–A7 fulfilled
8 | ... | Guidelines A1–A7 fulfilled
9 | ... | Guidelines A1–A7 fulfilled
10 | ... | Guidelines A1, A2, A7 not fulfilled
11 | ... | Guideline A6 not fulfilled
12 | ... | Guideline A4 not fulfilled
13 | ... | Guideline A2 not fulfilled
14 | ... | Guidelines A2, A4 not fulfilled

Nevertheless, non-relational nouns can be converted to relational labels by adjoining a relational adjective (e.g. "region" vs. "adjacent region" – "of", see example 9 in Table 6). If a label contains relational nouns or noun phrases, both patterns can be applied.
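The rule on relational nouns from this section can be sketched as a small repair function for property labels that consist of a bare noun. The list of relational nouns, the function name and the choice of when to reuse the range class name are assumptions made for this illustration; the paper itself only gives the rule and the "course" example.

```python
# Toy list of relational nouns (assumption); a real system would need a lexicon.
RELATIONAL_NOUNS = {"brother", "father"}


def candidate_property_labels(noun_label: str, range_class: str) -> list[str]:
    """Propose guideline-conformant names for a property labelled by a bare noun."""
    noun = noun_label[:1].upper() + noun_label[1:]
    # Reuse the range class name when it already ends in the noun
    # (e.g. "course" with range "MealCourse" gives "hasMealCourse").
    if range_class.lower().endswith(noun_label.lower()):
        candidates = [f"has{range_class}"]
    else:
        candidates = [f"has{noun}"]
    # "is X of" is only allowed for relational nouns.
    if noun_label.lower() in RELATIONAL_NOUNS:
        candidates.append(f"is{noun}Of{range_class}")
    return candidates


if __name__ == "__main__":
    print(candidate_property_labels("course", "MealCourse"))
    print(candidate_property_labels("brother", "Person"))
```

For the non-relational noun "course" only the "has" variant is produced, which matches the renaming proposed above. For a relational noun both patterns remain available.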

5. Verbalization of guideline based OWL labels

Besides the evaluation of OWL labels, our approach also contains a step in which OWL labels are verbalized, so that a human reader can validate them semantically by means of natural language sentences. We propose a step-by-step approach for a linguistically based and elaborated ontology verbalization. The approach presupposes the labeling style guidelines and the linguistic patterns of OWL labels defined in Section 4. The generation of natural language patterns for OWL classes and properties is based on the NTMS paradigm (see footnote 4) and DCG rules, which were developed during the early stages of the NIBA project. The proposed grammar uses NTMS category labels like v3 (= sentence node), n3 (= nominal phrase), a0 (= adjective), n0 (= noun), aux0 (= auxiliary), v0 (= verb), <pass> (= passivation), <tvag2> (= transitive, agentive verb) and

4 Acronym for Natural Theoretic Morphosyntax [3].


Table 6
Exemplary evaluation of OWL labels.

No. | OWL example | Evaluation step 2: assignment of linguistic categories | Evaluation step 3: mapping to (C)lass / (I)ndividual / (P)roperty | Evaluation step 4: checking of isolated labels
1 | ... | Is + (verbal) Participle + Preposition + Noun | Guideline B3 fulfilled → label type P expected | ok
2 | ... | Participle + Preposition + Noun | Guideline B5 fulfilled → label type P expected | ok
3 | ... | Has + Adjective + Noun | Guideline B2 fulfilled → label type P expected | ok
4 | ... | +Noun | Guideline B4 fulfilled → label type P expected | ok
5 | ... | Adjective + Adjective + Noun | Guideline B1 fulfilled → label type C or I expected | ok
6 | ... | Compound Noun | Guideline B1 fulfilled → label type C or I expected | ok
7 | ... | has + Noun | Guideline B1 fulfilled → label type C or I expected | Contradiction (type "Datatype Property" instead of "Class" or "Individual")
8 | ... | Noun | Guideline B1 fulfilled → label type C or I expected | Contradiction (type "Object Property" instead of "Class" or "Individual")
9 | ... | Adjective + Noun | Guideline B1 fulfilled → label type C or I expected | Contradiction (type "Object Property" instead of "Class" or "Individual")

Table 7
Natural language interpretation of OWL labels.

OWL-near interpretation | Min. cardinality according to OWL property structure | Verbalized sentence
"Meal has Meal Course" | 1 | "Each meal has at least one meal course."
"Meal is course of MealCourse" | 1 | "Each meal is a course of at least one meal course."

pp (= past participle). It produces binary trees containing categorical and lexical nodes, which are identical with natural language words. Relationships between class labels and property labels are transformed to simple sentences. The following two examples show the verbalization results for the two OWL labels "isLocatedIn" and "madeFromGrape", for which the naming guidelines A1–A7 are fulfilled and the guidelines B1–B5 show that the OWL labels match the expected label types. Using DCG rules, the automatic verbalization of these labels is possible, as can be seen in Figs. 1 and 2.
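The step from a property label plus minimum cardinality to a sentence like those in Table 7 can be sketched in a few lines of Python. This is not the DCG-based NIBA grammar, only a template-based illustration under stated assumptions: the function names and the number-word table are invented here, plural formation is naive, and only labels of the form "has" or verb followed by a noun phrase are handled.

```python
import re

# Small number-word table (assumption); other values fall back to digits.
NUMBER_WORDS = {1: "one", 2: "two", 3: "three"}


def label_terms(label: str) -> list[str]:
    """Split an upper-case-delimited label into lower-cased terms."""
    return [term.lower() for term in re.findall(r"[A-Z]?[a-z]+", label)]


def verbalize_min_cardinality(domain: str, prop: str, min_card: int) -> str:
    """Verbalize 'domain prop range' with a minimum cardinality restriction."""
    terms = label_terms(prop)
    verb, obj = terms[0], " ".join(terms[1:])
    if min_card != 1:
        obj += "s"  # naive plural, good enough for this sketch only
    number = NUMBER_WORDS.get(min_card, str(min_card))
    return f"Each {' '.join(label_terms(domain))} {verb} at least {number} {obj}."


if __name__ == "__main__":
    print(verbalize_min_cardinality("Meal", "hasMealCourse", 1))
```

A template like this cannot handle passive labels such as madeFromGrape, which is one reason a grammar with explicit categories such as <pass> is used in the approach described here.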


Fig. 1. First parse tree.

Fig. 2. Second parse tree.

The following extract of the OWL-wine-ontology has been transformed to linguistic objects using the DCG rules underneath, presupposing the fact that labels inside the XML representation can easily be cut out according to our labeling guidelines. The concept fragments underneath . . .. . .


Fig. 3. Third parse tree.

are transformed to a set of parser rules. These parser rules produce sentences and syntactically relevant phrase nodes which can be enriched with attributes like class, loc (= location), locV (= locative verb), pass, tvag2, etc. The first two output examples are represented both in bracketing and in graphical tree format, allowing a better visualization of the encoded grammatical structure:

v3(v2(n3_class(spz0(['The']), n2(n0(['Sauterne']))), v2(v1(v0(aux0_loc([is]), v0_locV([located])), p2(p0([in]), n3(spz0([the]), n0_loc(['Region']), n0_value(['Sauterne'])))))))

v3(v2(n3_class(spz0(['The']), n2(a2(a0([white])), n0(['Loire']))), v2(v1(v0(aux0_pass([is]), v0_tvag2([made])), p2(p0([from]), n3(n0_source(['Grape'])))))))

The OWL concept is linguistically represented as a tree structure, which is used as a method for mapping conceptual relationships to a linguistically determined linearization frame of label-internal terms.

The parse tree in Fig. 3 shows one possible verbalization of the OWL object property label "course". As we argued in Section 4, this label is contradictory (a class or individual label is expected, but the actual OWL label type is a property label). The two possible interpretations "Each meal has at least one meal course" and "Each meal is a course of (with) at least one meal course" exist, but according to the additional rule that we provided in Section 4.2 the parser automatically produces the first interpretation. Below see the OWL concept fragment for this example (minimum cardinality 1): ...

The bracketing structure underneath is the result of applying the parsing rules (see Fig. 3).

v3(v2(n3_class(n2(q2(q0(['Each'])), n0([meal]))), v2(v1(v0([has]), n3(n2(q2(pt0(['atleast']), q0([one])), n0(n0([meal]), n0([course]))))))))

6. Conclusion and future work

Our approach presupposes a systematic definition of OWL labels based on linguistic patterns and labeling style guidelines. We also presented a four-step method for OWL label evaluation based on these guidelines. Further semantic validation


of labels is possible by transforming OWL concepts to NL patterns using elaborated DCG parsing techniques, which allow the automatic transformation and splitting of OWL labels into natural language patterns. Future work should include a systematic way of defining grouping rules for sentence blocks and a finer-grained set of naming guidelines for OWL label generation. For the lexicalization of revised OWL class and property labels we plan to use the Klagenfurt Conceptual Predesign Model, which would allow a glossary representation of the cleaned and split-up labeling contents and the evaluated labels.

Acknowledgment

The authors would also like to thank all the students involved for their substantial implementation work.

References

[1] S.W. Ambler, UML 2 Class Diagram Guidelines .
[2] S. Bechhofer, I. Horrocks, C. Goble, R. Stevens, OilEd: a reasonable ontology editor for the semantic web, in: Proceedings of KI2001, Joint German/Austrian Conference on Artificial Intelligence, LNAI, vol. 2174, Springer-Verlag, Vienna, September 19–21, 2001, pp. 396–408.
[3] G. Fliedl, Natürlichkeitstheoretische Morphosyntax – Aspekte der Theorie und Implementierung, Gunter Narr Verlag, Tübingen, 1999.
[4] G. Fliedl, Ch. Kop, H.C. Mayr, Ch. Winkler, M. Hölbling, T. Horn, G. Weber, Extended tagging and interpretation tools for mapping requirements texts to conceptual (predesign) models, in: Andrés Montoyo, Rafael Munoz, Elisabeth Métais (Eds.), 10th International Conference on Applications of Natural Language to Information Systems, NLDB 2005, Alicante, Spain, Lecture Notes in Computer Science, LNCS 3513, Springer-Verlag, 2005, pp. 173–180.
[5] N.E. Fuchs, S. Höfler, K. Kaljurand, F. Rinaldi, G. Schneider, Attempto Controlled English: a knowledge representation language readable by humans and machines, in: N. Eisinger, J. Maluszynski (Eds.), Reasoning Web, First International Summer School 2005, LNCS, vol. 3564, Springer, 2005, pp. 213–250.
[6] D. Hewlett, A. Kalyanpur, V. Kolovski, C. Halaschek-Wiener, Effective natural language paraphrasing of ontologies on the semantic web, in: End User Semantic Web Interaction Workshop, International Semantic Web Conference (ISWC), November 2005, Galway, Ireland.
[7] A. Kalyanpur, B. Parsia, E. Sirin, B. Cuenca-Grau, J. Hendler, Swoop: a 'web' ontology editing browser, Journal of Web Semantics 4 (2) (2005) 144–153.
[8] F. Kristiansen, PHP Coding Standard .
[9] M. Krüger, C# Coding Style Guide .
[10] H.C. Mayr, Ch. Kop, A user centered approach to requirements modeling, in: Proceedings of the "Modellierung 2002", GI-Edition LNI P-12, 2002, pp. 75–86.
[11] D.L. McGuinness, F. van Harmelen, OWL Web Ontology Language overview, .
[12] N. Noy, M. Sintek, S. Decker, M. Crubezy, R. Fergerson, M. Musen, Creating semantic web contents with Protege-2000, IEEE Intelligent Systems (2001) 60–71.
[13] Sun Microsystems, Code Conventions for the Java Programming Language, 1999 .
[14] Y. Sure, M. Erdmann, J. Angele, S. Staab, R. Studer, D. Wenke, OntoEdit: collaborative ontology engineering for the semantic web, in: I. Horrocks, J. Hendler (Eds.), Proceedings of the First International Semantic Web Conference 2002 (ISWC 2002), LNCS, vol. 2342, Springer, 2002, pp. 221–235.
[15] Resource Description Framework, .
[16] Extensible Markup Language (XML), .
[17] DAML Ontology Library, .
[18] T. Halpin, M. Curland, Automated verbalization for ORM 2, in: Proceedings, OTM 2006 Workshops – On the Move to Meaningful Internet Systems 2006, Lecture Notes in Computer Science (LNCS 4278), Springer-Verlag, Berlin Heidelberg, 2006, pp. 1181–1190.

Günther Fliedl is Associate Professor for Computational Linguistics at the Department of Applied Informatics (AINF) at Klagenfurt University. His major research interests are natural language parsing, lexicon development and the implementation of tagging functionality, as well as linguistic aspects of verbalization. His postgraduate studies resulted in the "Natural Theoretic Morphosyntax" (NTMS).

Christian Kop studied Applied Computer Science at the University of Klagenfurt. He works as a researcher at the Department of Applied Informatics at this university. In 2002 he received his Ph.D. His research interests include information systems analysis and design, and questions of natural language processing support for information systems analysis and ontologies.

Jürgen Vöhringer completed his degree in Applied Informatics at Klagenfurt University in 2003. Since August 2004 he has worked as an assistant at the Institute for Business Informatics and Application Systems. His research interests include information systems analysis and design and the integration of conceptual models and ontologies.
