Available online at www.sciencedirect.com
ScienceDirect Procedia - Social and Behavioral Sciences 186 (2015) 805 – 810
5th World Conference on Learning, Teaching and Educational Leadership, WCLTA 2014
Representation of Universal Quantifier in Bulgarian Language with Universal Networking Language Velislava Stoykovaa* a
Institute for Bulgarian Language, 52, Shipchensky proh. str., bl. 17, 1113 Sofia, Bulgaria
Abstract The paper presents interpretation of universal quantifier in Bulgarian language with Universal Networking Language (UNL). It analyzes semantic and linguistic properties of universal quantifier and describes the parts of speech which function as quantifiers. In UNL frameworks they are presented as a lexical database with related grammar and semantic features using idea of synonymic semantic representation. The formal approach presented may be used also to language learning and understanding. © 2015 The TheAuthors. Authors.Published Published Elsevier © 2015 by by Elsevier Ltd.Ltd. This is an open access article under the CC BY-NC-ND license Peer-review under responsibility of Academic World Education and Research Center. (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of Academic World Education and Research Center Keywords: Natutal Language Semantics, Quantifiers, Semantic Networks, Language Learning and Understanding;
1. Introduction Quantifiers are interpreted in mathematical, logical and linguistic frameworks. In mathematical sense quantifiers are counting and measuring by classifying objects into some set of numbers. In their logical interpretation quantifiers are operators which bind a variable over a domain of discourse. The linguistic interpretation of quantifiers defines them as types of determiners (words such as all or many) that indicate a quantity. Generally, they correspond to logical or mathematical quantifiers and at the semantic level they express the same semantics in natural languages.
* Velislava Stoykova. Tel.: + 359 2 979 2953; fax: + 359 2 9 70 23 02. E-mail address:
[email protected]
1877-0428 © 2015 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of Academic World Education and Research Center doi:10.1016/j.sbspro.2015.04.175
806
Velislava Stoykova / Procedia - Social and Behavioral Sciences 186 (2015) 805 – 810
2. Natural Language Quantification All known natural languages are using quantifiers (Wiese, 2003). In natural languages quantifiers are mostly adjectives, pronouns, determiners depending on their specific syntactic position. Generally, the syntactic position of quantifiers is as modifiers or specifiers. In fact, the semantic meaning of quantifiers is more difficult to reformulate in natural language expressions and the task requires the use of rather complicated syntactic structures. Formal interpretation of quantifiers is important Natural Language Processing task since it requires adequate formal semantic representation of both grammar (syntactic) structures and lexical resources. However, the formal interpretation of quantification plays an important role in both formal and natural language representation. In fact, quantifiers are interpreted in both logical semantic universal formal representation and natural language semantic representation. Further, we are going to analyze a formal representation of the universal quantifier for Bulgarian language using Universal Networking Language frameworks and mainly, its lexical database representation. 3. Universal Networking Language In the UNL approach, information conveyed by natural language is represented as a hyper-graph composed of a set of directed binary labeled links (referred to as relations) between nodes or hyper-nodes (the Universal Words (UWs)), which stand for concepts (Uchida et al., 2005). Universal Words represent universal concepts and correspond to the nodes - to be interlinked by relations or modified by attributes – in a UNL graph. They can be associated to natural language open lexical categories (noun, verb, adjective and adverb). Additionally, UWs are organized in a hierarchy (the UNL Ontology), are defined in the UNL Knowledge Base, and exemplified in the UNL Example Base, which are the lexical database for UNL. As language-independent semantic units, UWs are equivalent to the sets of synonyms in a given language, approaching the concept of a synset used by the WordNet. Attributes are arcs linking node onto itself. They correspond to one-place predicates, i.e., functions that take a single argument. In UNL, attributes have been normally used to represent information conveyed by natural language grammatical categories (such as tense, mood, aspect, number, etc.). Attributes are annotations made to nodes or hyper-nodes of a UNL hyper-graph. They denote the circumstances under which these nodes (or hyper-nodes) are used. Attributes may convey three different kinds of information: (i) The information on the role of the node in the UNL graph, (ii) The information conveyed by bound morphemes and closed classes, such as affixes (gender, number, tense, aspect, mood, voice, etc), determiners (articles and demonstratives), etc., (iii) The information on the (external) context of the utterance. Attributes represent information that cannot be conveyed by UWs and relations. Relations are labeled arcs connecting a node to another node in a UNL graph. They correspond to two-place semantic predicates holding between two UWs and define semantic roles. In UNL, relations have been normally used to represent semantic cases or thematic roles (such as agent, object, instrument, etc.) between UWs. UNL-NL Grammars are sets of rules for translating UNL expressions into natural language (NL) sentences and vice-versa. They are normally unidirectional, i. e. the en-conversion grammar (NL-to-UNL) or de-conversion grammar (UNL-to-NL), even though they share the same basic syntax. In the UNL Grammar there are two basic types of rules: (i) Transformation rules – used to generate natural language sentences out of UNL graphs and viceversa and (ii) Disambiguation rules – used to improve the performance of transformation rules by constraining their applicability. The UNL offers a universal language-independent and open-source platform for multilingual applications (Boitet et al., 2007). The UNL application for English language is available but some applications for other languages like Russian (Boguslavsky, 2005) and Bulgarian (Noncheva et al, 2011; Stoykova, 2012) are available as well. 4. The UNL Representation of Universal Quantifier in Bulgarian Language The universal quantification in Bulgarian language can be expressed with pronouns, adverbs and adjectives. However, the syntactic position also is important with respect to the semantics of universal quantification. Thus, the
Velislava Stoykova / Procedia - Social and Behavioral Sciences 186 (2015) 805 – 810
807
position of modifiers, specifiers or adjunct adverbial positions are typical for universal quantifiers in Bulgarian language. 3.1. The Semantics of Universal Quantifier The semantics of universal quantifier in Bulgarian language express generalization. It indicates the full number of amount of, or every one of a class. However, there exist additional semantically significant markers which denote more complicated semantic features and outline two general groups within. The first group indicates semantics of generalization by considering every one member of a set individually. The second group express semantics of generalization by considering each and all members of a set as a whole. The grouping is bounded by both lexical distribution and grammar features relations. The UNL representation of universal quantifier will take into account that grouping by indicating their specific semantic differences. 3.2. The Grammar of Universal Quantifier As it has been already pointed Bulgarian language use as universal quantifier pronouns, adverbials and adjectives. Instead of their complicated syntactic position, pronouns and adjectives have also the grammar features of gender and number. So, the representation of universal quantifier in UNL frameworks needs to develop both syntactic grammar rules and inflectional morphology grammar rules (Stoykova, 2013;Stoykova, 2015). 3.3. The UNL Representation of Universal Quantifier The representation of universal quantifier in UNL frameworks includes representation of related pronouns, adverbials and adjectives as part of the UNL lexical database and as related to Universal Words. Their grammar features are represented as attributes (syntactic and morphological annotations) and their specific semantic features are represented as relations. The application also uses synonymic hierarchical representation to both syntactic and semantic structures indicating the relation of semantic equivalency at both levels. Thus, the Bulgarian pronoun всеки (every) is presented as part of the UNL lexical database according to its different semantic properties and translation. The lexical entry includes information about part of speech (POS=NPR), inflectional rules (which generate gender inflected forms according to inflectional rules of paradigm PAR=M180), syntactic grammar rules (as modifier), and semantic features (Fig. 1).
Fig. 1. The lexical entry of Bulgarian pronoun every (всеки) in UNL representation.
The singular gender form of that pronoun belongs to the first semantic group of universal quantifiers. So, it is presented (Fig. 2) assigning the attribute att=@each; which relates it to the English each (in sense ‘every one of’).
808
Velislava Stoykova / Procedia - Social and Behavioral Sciences 186 (2015) 805 – 810
Fig. 2. The lexical entry of Bulgarian pronoun each (всеки) in UNL representation.
If the pronoun is used independently, it assigns a person (Fig. 3), so it is given the attribute att=@every;@person; relating it to English everyone.
Fig. 3. The lexical entry of Bulgarian pronoun everyone (всеки) in UNL representation.
The pronoun всичко express semantics of generalization by unifying all elements of a given set and it belongs to the second group of universal quantifiers. The pronoun has only one form and it relates to English all. Its UNL representation (Fig. 4) includes also syntactic grammar rules (as modifier).
Fig. 4. The lexical entry of Bulgarian pronoun all (всичко) in UNL representation.
The pronoun всички also belongs to the second group of universal quantifiers and it has only one form. Its UNL representation (Fig. 5) includes syntactic grammar rules (as modifier) and a semantic attribute att=@all; relating it to English all and to Bulgarian всичко.
Velislava Stoykova / Procedia - Social and Behavioral Sciences 186 (2015) 805 – 810
809
Fig. 5. The lexical entry of Bulgarian pronoun all (всички) in UNL representation.
The pronoun всякакъв is represented in UNL frameworks (Fig. 6) as part of the lexical database by giving its inflectional grammar rules (generating related gender inflected forms) and by defining its syntactic grammar rules.
Fig. 6. The lexical entry of Bulgarian pronoun whatever (всякакъв) in UNL representation.
The adverbial всякъде (навсякъде) is represented in UNL frameworks as part of the lexical database (Fig. 7). The representation includes information for part of speech (POS=AAV), its Bulgarian synonym където и да е and related English synonyms wherever and wheresoever.
Fig. 7. The lexical entry of Bulgarian adverbial wherever (навсякакъде) in UNL representation.
The adjective цял (whole) also is used as universal quantifier. Its UNL representation (Fig. 8) includes information for part of speech (POS=ADJ), its Bulgarian synonym пълен (full) and related English synonyms whole, entire, full and total.
810
Velislava Stoykova / Procedia - Social and Behavioral Sciences 186 (2015) 805 – 810
Fig. 8. The lexical entry of Bulgarian adjective whole (цял) in UNL representation.
4. Conclusion The representation of universal quantifier in Bulgarian language with UNL uses the idea of synonymic semantic relations at logical and lexical semantic level. It interprets the parts of speech which function as universal quantifiers as belonging to UNL lexical database and presents their related grammar features (syntactic and morphological rules) and their semantic properties (assigning attributes and lexical synonyms). The interpretation uses web-based UNL platform and allows sentence processing both with respect to natural language analysis and natural language generation tasks. The interpretation allows further development and improvement by adoption of new grammar rules. With its universal semantic representation it allows also machine translation applications for related and non-related languages. References Boguslavsky, I. (2005). Some Lexical Issues of UNL. In: J. Cardenosa, A. Gelbukh and E. Tovar, Eds. Universal Network Language: Advances in Theory and Applications, Research on Computing Science, vol. 12, pp. 101–108. Boitet, C., Boguslavskij, I. and Cardenosa, J. (2007). An Evaluation of UNL Usability for High Quality Multilingualisation and Projections for a Future UNL++ Language. In A. Gelbukh, Ed., Proceedings of CICLing, Lecture Notes in Computer Sciences, vol. 4394, pp. 361-373. Noncheva, V., Stancheva, Y. and Stoykova, V. (2011). The Little Prince Project - Encoding of Bulgarian Grammar. Tech. rep., www.undl.org, UNDL Foundation. Stoykova, V. (2012). Bulgarian Inflectional Morphology in Universal Networking Nanguage. In: M. Kay and C. Boitet, Eds. Proceedings of 24th International Conference on Computational Linguistics (COLING 2012), Demonstration Papers, pp. 423–430, ACL Anthology Stoykova, V. (2013). Representation of Possessive Pronouns with Universal Networking Language, In: L. Iliadis, H. Papadopoulos and C. Jayne, Eds. Proceedings from EANN 2013, Part II, Communications in Computer and Information Science (CCIS), vol. 384, pp. 129–137, SpringerVerlag Berlin Heidelberg. Stoykova, V. (2015). Representation of Bulgarian Pronouns with Universal Networking Language, University of Sofia Publishing House, (in Bulgarian). Uchida, H., Zhu, M. and Della Senta, T. (2005). Universal Networking Language, UNDL Foundation. Wiese, H. (2003). Numbers, Language and the Human Mind, Oxford University Press.