Data & Knowledge Engineering 58 (2006) 90–106 www.elsevier.com/locate/datak
Interpreting semi-formal utterances in dialogs about mathematical proofs
Helmut Horacek a,*, Magdalena Wolska b
a Fachrichtung Informatik, Universität des Saarlandes, Postfach 15 11 50, D-66041 Saarbrücken, Germany
b Fachrichtung Allgemeine Linguistik, Universität des Saarlandes, Postfach 15 11 50, D-66041 Saarbrücken, Germany
Received 19 May 2005; accepted 19 May 2005
Available online 29 June 2005
Abstract
Dialogs in formal domains, such as mathematics, are characterized by a mixture of telegraphic natural language text and embedded formal expressions. Analysis methods for this kind of setting are rare and require empirical justification due to a notorious lack of data, as opposed to the richness of presentations found in genre-specific textbooks. In this paper, we focus on interpretation techniques for major phenomena observed in a recently collected corpus of tutorial dialogs on proving mathematical theorems. We combine analysis techniques for mathematical formulas and for natural language expressions, supported by knowledge about domain-relevant lexical semantics and by representations relating informal vocabulary to precise domain terms. Interpreting these expressions in a competent manner is not only important for the use in tutorial systems, but also for supporting domain experts through improving the accessibility and usability of formal systems.
© 2005 Elsevier B.V. All rights reserved.
Keywords: Tutorial dialog; Knowledge representation
* Corresponding author. E-mail addresses: [email protected] (H. Horacek), [email protected] (M. Wolska).
0169-023X/$ - see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.datak.2005.05.010
H. Horacek, M. Wolska / Data & Knowledge Engineering 58 (2006) 90–106
1. Introduction
Dialogs in formal domains, such as mathematics, are characterized by a mixture of telegraphic natural language text and embedded formal expressions. Acting adequately in these kinds of dialogs is particularly important for tutorial purposes where tutoring concerns a formal domain, including mathematics. Empirical findings show that flexible natural language dialog is needed to support active learning [16], and natural language has been argued to be the preferred mode of interaction for intelligent tutoring systems [1]. To meet the requirements of tutorial purposes, we aim at developing a tutoring system with strong natural language dialog capabilities to support interactive mathematical problem solving.
In order to address this task in an empirically adequate manner, we have carried out a Wizard-of-Oz (WOz) study on tutorial dialogs on proving mathematical theorems. In this paper, we report on interpretation techniques we have developed for major phenomena related to lexical imprecision observed in this corpus. We combine analysis techniques for mathematical formulas and for natural language expressions, supported by knowledge about domain-relevant lexical semantics and by representations relating imprecise informal vocabulary to precise domain terms.
Interpreting telegraphic utterances consisting of natural language and formula parts in a flexible manner is not only important for use in tutorial systems, but also for improving the accessibility and usability of formal systems in supporting domain experts. For example, a good deal of work is involved in improving the working environment of mathematicians by providing comfortable editing tools and access to proof checking and proof searching systems (see the initiative on mathematical knowledge management, [2]).
Such an environment would be greatly enhanced by enabling mathematicians to communicate in their domain language, which is characterized by formulas interwoven with natural language descriptions, as opposed to the variety of logical and technical specifications needed for today's interfaces.
The outline of this paper is as follows. We first present the environment in which this work is embedded, including a description of the WOz experiment. Next, we describe the link we have established between linguistic and domain knowledge sources. Then, we give details about interpretation methods for the phenomena observed in the corpus, and illustrate them with an example. Finally, we discuss future developments.
2. Our project environment
Our investigations are part of the DIALOG project¹ [6]. Its goal is (i) to empirically investigate the use of flexible natural language dialog in tutoring mathematics, and (ii) to develop an experimental prototype system gradually embodying the empirical findings. The experimental system will engage in a dialog in written natural language to help students understand and construct mathematical proofs. In contrast to most existing tutoring systems, we envision a modular design, making use of the powerful proof system ΩMEGA [19]. This design enables detailed reasoning about the
1 The DIALOG project is part of and supported by the Collaborative Research Center on Resource-Adaptive Cognitive Processes (SFB 378) at University of the Saarland [17].
student's action and bears the potential of elaborate system responses. The scenario for the system is illustrated in Fig. 1:
• Learning environment. Students take an interactive course in the relevant subfield of mathematics within the web-based system ACTIVEMATH [15].
• Mathematical proof assistant (MPA). Checks the appropriateness of user-specified inference steps with respect to the problem-solving goal; based on ΩMEGA.
• Proof manager (PM). In the course of the tutoring session the user may explore alternative proofs. PM builds and maintains a representation of constructed proofs and communicates with the MPA to evaluate the appropriateness of the user's dialog contributions for the proof construction.
• Dialog manager. We employ the Information-State (IS) Update approach to dialog management developed in the TRINDI project [20].
• Knowledge resources. This includes pedagogical knowledge (teaching strategies) and mathematical knowledge (in our MBase system [13]).
We have conducted a WOz experiment [7] with a simulated system [9] in order to collect a corpus of tutorial dialogs in the naive set theory domain. Twenty-four subjects with varying educational background and prior mathematical knowledge ranging from little to fair participated in the experiment. The experiment consisted of three phases: (1) preparation and pre-test on paper, (2) tutoring session mediated by a WOz tool, and (3) post-test and evaluation questionnaire, on paper again. During the session, the subjects had to prove three theorems (K and P stand for set complement and power set, respectively): (i) K((A ∪ B) ∩ (C ∪ D)) = (K(A) ∩ K(B)) ∪ (K(C) ∩ K(D)); (ii) A ∩ B ∈ P((A ∪ C) ∩ (B ∪ C)); and (iii) if A ⊆ K(B), then B ⊆ K(A). The interface
Fig. 1. DIALOG project scenario. [Diagram components include: learning environment (ACTIVEMATH), mathematical proof assistant (ΩMEGA), mathematical knowledge (MBASE).]
enabled the subjects to type text and insert mathematical symbols by clicking on buttons. The subjects were instructed to enter steps of a proof rather than a complete proof at once, in order to encourage dialog interaction with the system. The tutor-wizard's task was to respond to the student's utterances following a given algorithm [10].
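As a sanity check on the three theorems used in the experiment, the following minimal sketch tests them exhaustively over all subsets of a small universe. The three-element universe and all function names are assumptions made for this illustration; the check is not part of the DIALOG system and does not replace a proof.

```python
from itertools import combinations, product

U = frozenset(range(3))  # small universe, chosen only for illustration


def subsets(s):
    """Return all subsets of s as frozensets."""
    items = list(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def K(a):
    """Set complement relative to the universe U."""
    return U - a


def P(a):
    """Power set of a."""
    return set(subsets(a))


ALL_SETS = subsets(U)


def theorem_i(A, B, C, D):
    return K((A | B) & (C | D)) == (K(A) & K(B)) | (K(C) & K(D))


def theorem_ii(A, B, C):
    return (A & B) in P((A | C) & (B | C))


def theorem_iii(A, B):
    # Implication: if A is a subset of K(B), then B is a subset of K(A).
    return (not A <= K(B)) or B <= K(A)


def holds_everywhere(theorem, arity):
    """Test a theorem for every assignment of subsets of U to its variables."""
    return all(theorem(*args) for args in product(ALL_SETS, repeat=arity))
```

Calling `holds_everywhere(theorem_i, 4)`, `holds_everywhere(theorem_ii, 3)`, and `holds_everywhere(theorem_iii, 2)` runs the exhaustive test for each statement.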
3. Phenomena observed
We have identified several kinds of phenomena that bear some particularities of the genre and the domain and have categorized them as follows (cf. Fig. 2):
• Interleaving text with formula fragments. The role of formulas may be made explicit by natural language statements, such as giving a justification for an equivalence transformation, as in (1). Derivations of formulas may also be enhanced by natural language function words and connectives, to stress fluency and coherence of the operation descriptions (2) and (3). In some cases, natural language and formal statements may be tightly connected (4). The latter example poses specific analysis problems, since only a part (here: variable x) of a mathematical expression (here: x ∈ B) lies within the scope of a natural language operator adjacent to it (here: negation).
• Informal relations. Domain relations and concepts may be described imprecisely or ambiguously using informal natural language expressions that are typically shorter than the exact domain terminology. For example, "to be in" can be interpreted as "to be an element of", which is correct in (5), or as "to be a subset of", which is correct in (6). Similarly, "both sets together" in (7) can be interpreted as "the union of both sets" or "the intersection of both sets", which are both correct, but only the stronger interpretation, as a union, makes sense in the context. Moreover, common descriptions applicable to collections need to be interpreted in view of the application to their mathematical counterparts, sets, specifying relations among them precisely. For example, the expression "completely outside", (8), makes sure that not only some, but in fact all elements of set B do not belong to set A, while "completely different", (9), percolates the difference between sets to all pairs of elements of these sets.
• Incompleteness. A challenge for the natural language analysis lies in the large number of unexpected synonyms, where some of them have a metonymic flavor. For example, "left side", (10), refers to a part of an equation, which is not mentioned explicitly in that utterance. Moreover, the expression "inner parenthesis", (11), requires a metonymic extension, referring to the expression enclosed by that pair of parentheses. Similarly, the term "complement", (12), does not refer to the operator per se, but to an expression identifiable by this operator, that is, where complement is the top-level operator in the expression referred to. As usual with metonymic expressions, a type clash triggers the interpretation involving an extension.
• Operators. Semantically complex operators require a domain-specific interpretation. Specifically, this may be associated with ambiguity, such as the interpretation of "vice-versa" in (13). Applying the operator vice-versa to the expression "A in K(B)" may be interpreted as "K(B) in A", which is semantically implausible, or as "B in K(A)", which is the appropriate interpretation. Occasionally, natural language expressions used to refer to mathematical concepts deviate from the proper mathematical conception. For example, the truth of some axiom,
Fig. 2. Examples of dialog utterances (not necessarily correct in a mathematical sense). The predicates P and K stand for power set and complement, respectively.
when instantiated for an operator, might be expressed as a property of that operator in natural language, such as "symmetry" as a property of "set union" in (14). In the domain of mathematics, this situation is conceived as an axiom instantiation.
The examples shown in Fig. 2 are representative of our corpus and led us to the categorization given in this figure. We have developed representation methods and interpretation techniques for most of the phenomena described; only handling the complex operators adequately is still on our agenda.
As the phenomena observed in our study show, utterances mixing natural language text with mathematical formula fragments, as well as various forms of informality and incompleteness, occur frequently in conversations in this domain. We conjecture that these kinds of expressions are more pronounced for novices since they are generally unsure about the domain semantics and terminology, and their expressive skills are limited. However, personal communication with some mathematicians suggests that experts make similar use of expressive means as the novices do in our study, preferring communicative ease and conciseness to bureaucratic formal accuracy. A major difference between experts and novices lies in the frequency of mistakes, which novices tend to make quite often, as the examples in Fig. 2 illustrate, although this varies considerably from subject to subject. These discrepancies led us to employ personalized strategies in the tutorial environment, such as asking students to clarify even contextually interpretable ambiguities when assuming a limited competence on the part of the subject [12].
4. Intermediate knowledge representation
In order to adequately process utterances such as the ones discussed in the previous section, natural language analysis methods require access to domain knowledge. However, this poses serious problems in our environment, due to the fundamental representation discrepancies between knowledge bases of deduction systems, such as our system ΩMEGA, and linguistically motivated knowledge bases (cf. [11]). The contribution of the intermediate knowledge representation explained in this section is to mediate between these two complementary views.
In brief, ΩMEGA's knowledge base is organized as an inheritance network, and the representation is concentrated on the mathematical concepts per se. Their semantics is expressed in terms of typed lambda-calculus expressions which constitute precise and complete logical definitions required for proving purposes. Inheritance is merely used to percolate specifications efficiently, to avoid redundancy, and to ease maintenance, but there is no explicit hierarchical structuring imposed. For example, "set", a basic element in ΩMEGA, is represented as λx.Px, and "power set" is represented as ∀x. x ∈ P(A) ⇔ x ⊆ A. The specialization relation between these two is, however, not expressed explicitly; it is only derivable by analyzing type constraints and filler specifications. Moreover, "symmetry" is expressed as an axiom, where a specific operation R1 being symmetric is expressed as an instantiation of R to R1 in that axiom: ∀x y. xRy → yRx, but properties of that operation are not made explicit.
In contrast to finding proofs, meeting communicative purposes does not require access to complete logical definitions, but does require several pieces of information that go beyond what is represented in ΩMEGA's knowledge base. This includes:
• Hierarchically organized specialization of objects, together with their properties, and object categories for their fillers, enabling, e.g., type checking.
• The representation of informal terms which need to be interpreted in domain-specific terms in the tutorial context.
• Modeling of typographic features representing mathematical objects "physically", including markers and orderings, such as argument positions.
In order to meet these requirements, we have built a representation that constitutes an enhanced mirror of the domain representations in ΩMEGA. It serves as an intermediate representation between the domain and the linguistic models. Domain objects and relations are reorganized in a specialization hierarchy in KL-ONE [8] style, and prominent aspects of their semantics are expressed as properties of these items, with constraints on the categories of their fillers. For example, the operator that appears in the definition of the symmetry axiom is re-expressed under the property central operator, to make it accessible to natural language references (see the lower right part of Fig. 3). Note that this is different from the ΩMEGA representation, which completely expresses the semantics of "symmetry", but does not provide access from the operator to the axiom.
In the representation fragments in Figs. 3 and 4, objects and relations are referred to by names in capital letters, and their properties by names in lower-case. Filler constraints and specialization links between properties are not depicted. Properties are inherited in an orthogonal monotonic fashion. Moreover, a specialization of a domain object may introduce further properties, indicated by a leading '+' in the property name, or it may specialize properties introduced by more general objects (indicated by the term 'spec' preceding the more specific property name). For example, the property container specified for CONTAINMENT is a specialization of the property argument introduced with RELATION and percolated via SEMANTIC RELATION (Fig. 4). In addition, value restrictions on the property fillers may be specified, which is indicated by the term 'restr' preceding the filler name, and an interval enclosed in parentheses expressing number restrictions. For example, the property left argument, which specializes argument for a BINARY RELATION, is specified as being unique (by the interval (1, 1)). Moreover, its filler is restricted to
Fig. 3. A fragment of the intermediate representation of objects. [Hierarchy diagram; nodes shown include OBJECT, ATOMIC OBJECT, MATHEMATICAL OBJECT, STRUCTURED OBJECT, MULTI OBJECT, SET, POWER SET, ELEMENT, VARIABLE, CONSTANT, PARENTHESIS (INNER, OUTER, LEFT, RIGHT), TERM, FORMULA, SUBFORMULA, ENCLOSED FORMULA, THEOREM, DE MORGAN 1, DE MORGAN 2, SYMMETRY and DISTRIBUTIVITY; property annotations shown include +base set (restr SET), +component, +embedding (restr FORMULA), +brackets (restr PARENTHESIS), argument (spec component), operator (spec component), and +central operator with number restrictions (1,1) for SYMMETRY and (2,2) for DISTRIBUTIVITY.]
Fig. 4. A fragment of the intermediate representation of relations. [Hierarchy diagram; nodes shown include RELATION (+argument), FUNCTIONAL RELATION (+result (1,1)), BINARY RELATION (left argument and right argument, each (1,1), spec argument), UNARY RELATION (argument (1,1)), LOGICAL RELATION, SEMANTIC RELATION, COMPANION (party (2,2), spec argument), CONTAINMENT (container and containee, each (1,1), spec argument), SET RELATION, SET PROPERTY, FUNCTIONAL SET RELATION (argument and result, restr SET), UNION, INTERSECTION, COMPLEMENT, SUBSET (whole: spec container, right argument; part: spec containee, left argument) and ELEMENT (collection: spec container, right argument; member: spec containee, left argument); a property +truth of relation (1,1) also appears.]
be a SET for SET PROPERTY (Fig. 4).
Note that the category of some mathematical concepts is motivated by their cognitive role, and may sometimes deviate from corresponding constructs in ΩMEGA. For example, "subset" is modeled as a relation between two sets, in contrast to "powerset", which is expressed as a specialization of a set, giving access to the "base" set from which the elements of a power set are built. In contrast, both mathematical objects are expressed as logical relations in ΩMEGA, the subset relation holding between the two sets compared, and the powerset relation holding between the "base" set and the set consisting of all its subsets.
Through multiple inheritance, information from several places can be collected. For example, UNION and INTERSECTION inherit a left and a right argument from BINARY RELATION, which are both restricted to be SETs, since this is the restriction of argument specified for FUNCTIONAL SET RELATION. Moreover, an implicit rectification can be defined through multiple inheritance. For example, the property party specified for COMPANION becomes a view of the left and right argument specified for BINARY RELATION when percolated to UNION and INTERSECTION, thereby regulating the correspondences between informal terms and precise mathematical ones (see below).
The re-representations in the intermediate representation are extended in several ways:
• the representation of informal terms;
• the representation of typographic features;
• association of procedural tests with conceptual specifications.
Informal terms, e.g. "containment" and "companion", are represented as semantic relations. They are conceived as generalizations of mathematical relations in terms of their semantics (Fig. 4). For example, "containment" holds between two items if the first belongs to the second, or all its components separately do. This applies to "subset" and "element-of" relations.
Similarly, "companion" comprises "union" and "intersection" operations.
Typographic features represent mathematical objects "physically". This includes markers such as parentheses, as well as orderings, such as sides of an equation. They are modeled as properties of structured objects, in addition to the structural components which make up the semantics of the logical system. Moreover, typographic properties may be expressed as parts of specializations,
such as bracket-enclosed formulas as a specific kind of (sub-)formula (Fig. 3). Altogether, the definitions associated with STRUCTURED OBJECT and its specializations in Fig. 3 present a structural perspective on mathematical objects, while the definitions of RELATION and its specializations in Fig. 4 constitute a functional perspective. Both are complementary in representing linguistic references to domain concepts.
Some of the objects may be associated with procedural tests applied to typographical properties. They are accessible to the analysis module (not depicted in Figs. 3 and 4). They express, for instance, what makes a "parenthesis" an "inner parenthesis", or what constitutes the "embedding" of a formula.
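The specialization hierarchy described in this section can be illustrated with a minimal sketch. The class `Concept`, its methods, and the dictionary encoding of 'spec', 'restr' and number restrictions are assumptions made for this illustration, not the representation actually used in the project; the concept and property names follow the text and Fig. 4.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Concept:
    """A node in a KL-ONE-style specialization hierarchy (illustrative)."""
    name: str
    parents: list = field(default_factory=list)
    # property name -> annotations: "spec" (property being specialized),
    # "restr" (filler category), "card" (number restriction interval)
    properties: dict = field(default_factory=dict)

    def ancestors(self):
        """This concept followed by every concept it specializes."""
        seen, todo = [], [self]
        while todo:
            concept = todo.pop(0)
            if concept not in seen:
                seen.append(concept)
                todo.extend(concept.parents)
        return seen

    def all_properties(self):
        """Collect properties monotonically: general ones first, then refined."""
        merged = {}
        for concept in reversed(self.ancestors()):
            for prop, annotations in concept.properties.items():
                merged[prop] = {**merged.get(prop, {}), **annotations}
        return merged

    def is_a(self, other):
        return other in self.ancestors()


RELATION = Concept("RELATION", [], {"argument": {}})
BINARY = Concept("BINARY RELATION", [RELATION], {
    "left argument": {"spec": "argument", "card": (1, 1)},
    "right argument": {"spec": "argument", "card": (1, 1)},
})
SEMANTIC = Concept("SEMANTIC RELATION", [RELATION])
CONTAINMENT = Concept("CONTAINMENT", [SEMANTIC], {
    "container": {"spec": "argument", "card": (1, 1)},
    "containee": {"spec": "argument", "card": (1, 1)},
})
SUBSET = Concept("SUBSET", [CONTAINMENT, BINARY], {
    "whole": {"spec": ("container", "right argument")},
    "part": {"spec": ("containee", "left argument")},
})
ELEMENT = Concept("ELEMENT", [CONTAINMENT, BINARY], {
    "collection": {"spec": ("container", "right argument")},
    "member": {"spec": ("containee", "left argument")},
})
CONCEPTS = [RELATION, BINARY, SEMANTIC, CONTAINMENT, SUBSET, ELEMENT]


def specializations(general):
    """Domain concepts that an informal term such as CONTAINMENT may denote."""
    return [c for c in CONCEPTS if c is not general and c.is_a(general)]
```

Here `specializations(CONTAINMENT)` yields the SUBSET and ELEMENT concepts, and `SUBSET.all_properties()` gathers properties from both inheritance paths, which is the multiple-inheritance effect described above.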
5. Analysis techniques
In this section, we present the analysis methodology and show interactions with the knowledge representation presented in Section 4. The analysis proceeds in three stages: (i) mathematical expressions are identified, analyzed, categorized, and substituted with default lexicon entries encoded in the grammar (Section 5.1); (ii) next, the input is syntactically parsed, and a representation of its linguistic meaning is constructed compositionally along with the parse (Section 5.2); and (iii) the linguistic meaning representation is subsequently embedded within the discourse context and interpreted by consulting a mapping lexicon (Section 5.3) and the ontology (Section 4).
5.1. Analyzing formulas
The task of the mathematical expression parser is to analyze mathematical content within sentences. The identified mathematical expressions are subsequently verified as to syntactic validity, and categorized as of type CONSTANT, TERM, FORMULA, 0_FORMULA (formula missing left argument), etc.
Identification of mathematical expressions within the word-tokenized text is based on simple indicators: single character tokens, mathematical symbol unicodes, and new-line characters. This is a simplistic approach that has nevertheless shown a reasonable performance. The parser uses a standard grammar for mathematical expressions to convert the infix notation into an expression tree. Domain knowledge used at this stage includes, for instance, information pertaining to identifying expression type (e.g. formula vs. term). This is done based on structural information about the top node operator in the (sub-)tree percolated downward. Dedicated procedures operate on the tree representation to retrieve information about surface sub-structure of the expressions. The selection of retrieval procedures operating on the tree is motivated by systematic occurrence of reference to typographical features of mathematical expressions (such as markers, orderings; e.g. "apply a rule to the left side of formula"). Results of these procedures as well as expression types are associated with name labels corresponding to nodes in the intermediate representation of objects, and discourse referents are created for those entities. Through the name association, we have access to properties of the expression components encoded in the representation of objects (cf. Fig. 3). Moreover, names of expression types (e.g. FORMULA, TERM) have direct counterparts in the grammar of the syntactic parser used at the later stage of interpretation. Finally, at the last stage of mathematical expression preprocessing,
Fig. 5. Tree representation of the formula K((A ∪ B) ∩ (C ∪ D)) = (K(A ∪ B) ∪ K(C ∪ D)).
the names of mathematical expression types are substituted within the sentence in place of the symbolic expressions. The syntactic parser processes the sentence without the symbolic content.
For example, the expression K((A ∪ B) ∩ (C ∪ D)) = (K(A ∪ B) ∪ K(C ∪ D)) in utterance (1) in Fig. 2 is represented by the formula tree in Fig. 5. The parenthesis subscripts indicate that in the linear representation brackets enclose the sub-expression headed by the given node. Given the expression's top node operator, =, it is of type FORMULA, its "left side" is the expression K((A ∪ B) ∩ (C ∪ D)), the list of bracketed sub-expressions includes: A ∪ B, C ∪ D, (A ∪ B) ∩ (C ∪ D), etc.
5.2. Analyzing natural language expressions
The task of the natural language analysis module is to produce a linguistic meaning (LM) representation of sentences and fragments that are syntactically well-formed. The sentence meaning obtained at this stage of processing is domain-independent. Domain-specific interpretation is assigned at the next stage (Section 5.3).
By linguistic meaning, we understand the deep semantics in the sense of the Prague school as employed in Functional Generative Description (FGD) [14,18]. LM is conceptually related to logical form, but differs in coverage: while it does operate on the level of deep semantic roles (tectogrammatical relations), such aspects of meaning as the scope of quantifiers or interpretation of plurals are not resolved. In the Praguian FGD approach, the central frame unit of a sentence/clause is the head verb that specifies in which semantic dependency relations its dependents (or participants) stand to it.² A distinction is drawn between inner participants (Actor, Patient, Addressee, Origin, Effect) that form a verb's valency frame, and adverbial free modifications, such as Location, Means, Direction, Cause, Condition, Norm.
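The formula analysis of Section 5.1 can be sketched as a small recursive-descent parser. This is an illustrative simplification, not the parser used in the project: the grammar, all function names, single-letter variable names, the lack of operator precedence between ∪ and ∩, and the classification of a lone symbol as CONSTANT are assumptions, and the 0_FORMULA type is not covered.

```python
import re
from dataclasses import dataclass

RELATIONS = {"=", "∈", "⊆"}  # a top-node relation yields type FORMULA
SET_OPS = {"∪", "∩"}         # binary set operators yield type TERM
FUNCTIONS = {"K", "P"}       # complement and power set


@dataclass
class Node:
    op: str                  # operator, function name, or variable name
    args: tuple = ()
    bracketed: bool = False  # True if enclosed in brackets in the input


def parse(text):
    """Convert an infix expression into an expression tree."""
    tokens = re.findall(r"[A-Za-z]|[=∈⊆∪∩()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def atom():
        token = take()
        if token == "(" or (token in FUNCTIONS and peek() == "("):
            if token != "(":
                take()                  # opening bracket of a function call
            inner = relation()
            take()                      # closing bracket
            inner.bracketed = True
            return inner if token == "(" else Node(token, (inner,))
        return Node(token)

    def term():
        left = atom()
        while peek() in SET_OPS:        # left-associative, no precedence
            left = Node(take(), (left, atom()))
        return left

    def relation():
        left = term()
        if peek() in RELATIONS:
            return Node(take(), (left, term()))
        return left

    return relation()


def expression_type(node):
    """Categorize an expression by its top node operator."""
    if node.op in RELATIONS:
        return "FORMULA"
    return "TERM" if node.args else "CONSTANT"


def show(node):
    """Linearize a tree back into infix notation."""
    if not node.args:
        return node.op
    if len(node.args) == 1:
        return f"{node.op}({show(node.args[0])})"
    parts = [f"({show(a)})" if a.bracketed else show(a) for a in node.args]
    return f" {node.op} ".join(parts)


def bracketed_parts(node):
    """List the sub-expressions that were enclosed in brackets."""
    found = [show(node)] if node.bracketed else []
    for arg in node.args:
        found.extend(bracketed_parts(arg))
    return found


def substitute(sentence, formula):
    """Replace a formula in a sentence by the name of its expression type."""
    return sentence.replace(formula, expression_type(parse(formula)))
```

For the formula of utterance (1), `expression_type` returns FORMULA, `show(tree.args[0])` gives the "left side", and `bracketed_parts(tree)` lists A ∪ B, C ∪ D and (A ∪ B) ∩ (C ∪ D) among the bracketed sub-expressions.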
Parsing is performed using openCCG, an open source multi-modal combinatory categorial grammar (MMCCG) parser.³ MMCCG is a lexicalist grammar formalism in which application of combinatory rules is controlled through context-sensitive specification of modes on slashes [3,5]. The LM, built in parallel with the syntax, is represented using Hybrid Logic Dependency Semantics (HLDS), a hybrid modal logic representation which allows a compositional, unification-based construction of HLDS terms with CCG [4]. The parser's grammar explicitly encodes
2 The relations here are 'deep roles' in the linguistic sense, not to be confused with the semantic relations in the intermediate representation.
3 http://www.openccg.sourceforge.net.
Fig. 6. Tectogrammatical representation of the utterance "Nach deMorgan-Regel-2 ist K((A ∪ B) ∩ (C ∪ D)) = (K(A ∪ B) ∪ K(C ∪ D))". FORMULA is a lexical entry associated with the expression K((A ∪ B) ∩ (C ∪ D)) = (K(A ∪ B) ∪ K(C ∪ D)).
lexical items' valency frames (in terms of tectogrammatical relations between heads and dependents) as modal relations in the HLDS structures. The parser's grammar encodes syntactic signs for the mathematical expression types in the same way as for other lexical entries; for instance, syntactic categories for a lexical entry FORMULA, associated with a mathematical expression, are S, NP, and N, while for the word "formula", the sign is N. More details on issues in parsing interleaving mathematical and natural language expressions can be found in [21,22].
For example, in the utterance (1) in Fig. 2, "Nach deMorgan-Regel-2 ist K((A ∪ B) ∩ (C ∪ D)) = (K(A ∪ B) ∪ K(C ∪ D))", the verb "ist" represents the meaning hold, and in this frame takes dependents in the tectogrammatical relations Norm and Patient, forming the dependency structure presented in Fig. 6. The identified mathematical expression is categorized as FORMULA. Default lexical entries (e.g. FORMULA; cf. Section 5.1) are encoded in the grammar of the parser for mathematical expression types.
5.3. Domain interpretation
In order to interpret the linguistic meaning of utterances in the context of the domain (here: naive set theory), we proceed as follows: first, we assign more specific lexical-semantic information to the elements of dependency structures produced during parsing; second, we attempt to find corresponding domain-specific interpretations of the concepts in the domain ontology.
We have built a linguistically motivated mapping lexicon that relates linguistic realizations (i.e. "lexical triggers") of concepts to their domain interpretations through the ontology (cf. Section 4). The role of the ontology is to provide domain-specific information and a direct link to a mathematical knowledge base.⁴ Fig. 7 presents some of the lexicon entries. Below, we explain the structure of the lexicon entries and elaborate on the entries in more detail.
A lexicon entry consists of two parts.
One part contains information on dependency frames of lexical items as encoded by the grammar of the syntactic parser. For verbal entries, it specifies the
4 If more than one domain-specific interpretation is plausible, no disambiguation is performed. Alternative interpretations are passed on to the Proof Manager (cf. Fig. 1) for evaluation by the theorem prover, or dialog-specific strategies are used to elicit disambiguation on the part of the student [12].
Fig. 7. An excerpt of the mapping lexicon.
tectogrammatical relations of the relevant dependents; for example, for the verb "enthalten" ("contain"), it specifies the relevant inner participants of the verb, in this case: Actor and Patient (abbreviations indicated in capital letters). Moreover, it specifies the sorts of dependents according to the types in the intermediate representation of objects (cf. Fig. 3) indicated by subscripts. For non-verbal entries, such as prepositions and adjectives, it specifies which tectogrammatical relation (e.g. Location) or which particular lexical item triggers the concept. In these cases, the arguments refer to the relevant elements of the frame of the clausal head in which the triggering item occurs. For example, for the adjective "common", it specifies the plural Actor of the clausal predicate.
The second part of an entry provides a mapping to a concept in the intermediate representation of objects or relations by specifying which arguments of a node in the ontology map onto the elements of the dependency structure. For example, for the concept CONTAINMENT in the intermediate representation, with the arguments container and containee, the role of container is taken by the Actor and the role of containee by the Patient of the verb "contain".
• Containment. The containment relation, (15), as indicated in Fig. 4, specializes into the domain relations of (strict) SUBSET and ELEMENT. Linguistically, it can be realized, among others, with the verb "enthalten" ("contain"). The relevant elements of the tectogrammatical frame of "enthalten" are the dependents in the relations of Actor and Patient, which take on the roles of container and containee, respectively.
• Location. The Location relation, realized linguistically by the prepositional phrase "in ... (sein)" ("be in"), involves the tectogrammatical relations of Location (LOC) and the Actor of the predicate "sein". We consider Location in our domain as synonymous with CONTAINMENT. Translation rule (16) in Fig. 7 serves to interpret the tectogrammatical frame of one of the instantiations of the Location relation.
• Common property. We define a general notion of "common property" as in (17). Property here is a meta-object which can be instantiated with any element of a tectogrammatical frame indicated by the subscript, here the immediate head (IH) that the adjective modifies. This is, for example, the case with relational nouns such as "element", as in "(A und B) haben (gemeinsame Elemente)" ("A and B have common elements"). This instantiation of the entry is shown in (17).
• Difference. The Difference relation, (18), realized linguistically by the predicates "verschieden (sein)" ("be different"; for SETs or STRUCTURED OBJECTs) and "disjunkt (sein)" ("be disjoint"; for SETs) involves a plural Actor (e.g. coordinated noun phrases).
• Mereological relations. Here we encode part-of relations between domain objects. These concern both physical surface and ontological properties of objects. Commonly occurring part-of relations in our domain are:
component(STRUCTURED OBJECT_{TERM}, STRUCTURED OBJECT_{SUBTERM})
component(STRUCTURED OBJECT_{FORMULA}, STRUCTURED OBJECT_{SUBFORMULA})
component(STRUCTURED OBJECT_{TERM}, STRUCTURED OBJECT_{ENCLOSED TERM})
component(STRUCTURED OBJECT_{FORMULA}, STRUCTURED OBJECT_{ENCLOSED FORMULA})
Moreover, we have from the ontology (cf. Fig. 3):
Property(STRUCTURED OBJECT_{TERM}, component_{TERM SIDE})
Property(STRUCTURED OBJECT_{FORMULA}, component_{FORMULA SIDE})
Using these definitions and polysemy rules such as polysemous(Object, Property), we can obtain interpretations of utterances such as "Dann gilt für die linke Seite, ..." ("Then for the left side it holds that ...") where the predicate "gilt" ("hold") normally takes two arguments of types STRUCTURED OBJECT_{TERM}, FORMULA, rather than an argument of type Property.
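The two-part structure of a mapping-lexicon entry described above (a triggering item with its tectogrammatical frame, and a mapping of frame elements onto the arguments of an ontology concept) can be sketched in code. The following Python fragment is only an illustration under stated assumptions: the names `LexiconEntry`, `role_map` and `translate` are hypothetical and do not reproduce the notation of Fig. 7. The Actor/Patient assignment for "enthalten" follows the text; the assignment of LOC to container and Actor to containee for the Location entry is an assumption made here.

```python
from dataclasses import dataclass


@dataclass
class LexiconEntry:
    """Hypothetical mapping-lexicon entry (all names are illustrative)."""
    concept: str    # concept in the intermediate representation
    role_map: dict  # tectogrammatical relation -> argument of the concept


LEXICON = {
    # Modelled on the prose description of entry (15).
    "enthalten": LexiconEntry("CONTAINMENT",
                              {"ACT": "container", "PAT": "containee"}),
    # Modelled on entry (16); the role assignment is an assumption.
    "in ... (sein)": LexiconEntry("CONTAINMENT",
                                  {"LOC": "container", "ACT": "containee"}),
}


def translate(trigger, frame):
    """Map the fillers of a tectogrammatical frame onto concept arguments."""
    entry = LEXICON[trigger]
    args = {entry.role_map[rel]: filler
            for rel, filler in frame.items() if rel in entry.role_map}
    return entry.concept, args


print(translate("enthalten", {"ACT": "B", "PAT": "x ∈ A"}))
```

Keeping the role mapping as data rather than code means that two different surface realizations (the verb and the prepositional phrase) can share one concept and differ only in their table entries.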
6. Example analysis

Here, we present an example analysis of the utterance "B contains no x ∈ A" ((4) in Fig. 2) to illustrate the mechanics of the approach. The scope of negation in this utterance is over a part of the formula following it, rather than the whole formula. The verb "contain" evokes the semantic relation of CONTAINMENT and is ambiguous between the domain readings of (STRICT) SUBSET, ELEMENT, and SUBFORMULA.

The analysis proceeds as follows. The formula tagger first identifies the formula and substitutes it with the generic entry FORMULA represented in the lexicon of the grammar. If there is no prior discourse entity for "B" to verify its type, the type is ambiguous between CONSTANT, TERM, and FORMULA.⁵ The sentence is assigned four alternative readings: "CONSTANT contains no FORMULA", "TERM contains no FORMULA", "FORMULA contains no FORMULA", and "CONSTANT contains no CONSTANT 0_FORMULA". The last reading is obtained by taking into account possible interaction of mathematical expressions with the preceding natural language context. There, the expression has been split into its surface parts, <[x][∈A]>; [x] has been substituted with a lexical entry CONSTANT, and [∈A] with an entry for a formula missing its left argument, 0_FORMULA⁶ (cf. Section 5.1). The first and the second readings are rejected because of sortal incompatibility. The resulting linguistic meaning for the reading (i) "FORMULA contains no FORMULA" is presented in Fig. 8 and for the reading (ii) "CONSTANT contains no CONSTANT 0_FORMULA" in Fig. 9.

Fig. 8. Tectogrammatical representation of the utterance "B enthaelt kein x ∈ A" [B contains no x ∈ A].
Fig. 9. Tectogrammatical representation of the utterance "B enthaelt kein <[x][∈A]>" [B contains no <[x][∈A]>].

The mapping lexicon and the ontology are consulted to translate the readings into their domain interpretations. The relevant entries triggered by the head "enthalten" ("contain") are (15) in Fig. 7. The linguistic meaning representations are further decorated with information about the domain-specific readings of the elements of the tectogrammatical frames (Fig. 10). The relevant concept here is the semantic relation of CONTAINMENT that in the domain of naive set theory specializes into the concepts of SUBSET and ELEMENT.

The following four interpretations of the utterance are obtained. For the reading (i) "FORMULA contains no FORMULA":
(1) 'it is not the case that [...], formula x ∈ A, is a subformula of [...], formula B'
and for the reading (ii) "CONSTANT contains no CONSTANT 0_FORMULA":
(2a) 'it is not the case that [...], the constant x, [...], B, and x ∈ A',
(2b) 'it is not the case that [...], the constant x, ∈ [...], B, and x ∈ A',
(2c) 'it is not the case that [...], the constant x, [...], B, and x ∈ A'.

⁵ In prior discourse, there may have been an assignment B := φ, where φ is a formula, in which case B would be known from discourse context to be of type FORMULA (similarly for term assignment); by CONSTANT we mean a set or element variable such as A, x denoting a set A or an element x, respectively.
⁶ There are other ways of constituent partitioning of the formula at the top-level operator to separate the operator and its arguments (they are: <[x][∈][A]> and <[x∈][A]>). Each of the partitions obtains its appropriate type corresponding to a lexical entry available in the grammar (e.g., the [x∈] chunk is of type FORMULA_0 for a formula missing its right argument). Not all the readings, however, compose to form a syntactically and semantically valid parse of the given sentence.
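The constituent partitioning at the top-level operator mentioned in the footnote can be illustrated with a short Python sketch. It is a simplified illustration, not the formula tagger itself: the function names `partitions` and `render` are hypothetical, the type label OPERATOR for an isolated operator chunk is an assumption (the text names only CONSTANT, 0_FORMULA and FORMULA_0), and both arguments are assumed to be atomic constants.

```python
def partitions(left, op, right):
    """Enumerate surface partitions of a binary formula `left op right`.

    Each partition is a list of (chunk, type) pairs. Both arguments are
    assumed to be atomic constants; OPERATOR is a hypothetical type label.
    """
    return [
        [(left, "CONSTANT"), (op + right, "0_FORMULA")],   # missing left argument
        [(left, "CONSTANT"), (op, "OPERATOR"), (right, "CONSTANT")],
        [(left + op, "FORMULA_0"), (right, "CONSTANT")],   # missing right argument
    ]


def render(partition):
    """Print a partition in the bracket notation used in the text."""
    return "<" + "".join("[" + chunk + "]" for chunk, _ in partition) + ">"


for p in partitions("x", "∈", "A"):
    print(render(p), [chunk_type for _, chunk_type in p])
```

Each typed partition would then be offered to the parser as an alternative lexical sequence; as the footnote notes, only some of them compose into a valid parse of the sentence.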
Fig. 10. Interpreted representations of the utterance "B enthaelt kein x ∈ A".
The first interpretation, (1), is verified in the discourse context with information on structural parts of the discourse entity ‘‘B’’ of type FORMULA, while the other three, (2a)–(2c), are translated into messages to the Proof Manager and passed on for evaluation in the proof context.
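The steps of this example (enumerating typed readings, rejecting sortally incompatible ones, and specializing CONTAINMENT into domain relations) can be summarized in a small Python sketch. It is a toy reconstruction under stated assumptions, not the system's implementation: the table `SPECIALIZATIONS` and the function `interpret` are hypothetical, and the assumption that the three interpretations of reading (ii) correspond to STRICT_SUBSET, ELEMENT and SUBSET is inferred from the concepts named in the text.

```python
# The four readings assigned by the parser: (type of "B", type of the object).
READINGS = [
    ("CONSTANT", "FORMULA"),
    ("TERM", "FORMULA"),
    ("FORMULA", "FORMULA"),
    ("CONSTANT", "CONSTANT 0_FORMULA"),
]

# Assumed sortal table for CONTAINMENT: admissible (container, containee)
# type pairs and the domain relations they specialize into.
SPECIALIZATIONS = {
    ("FORMULA", "FORMULA"): ["SUBFORMULA"],
    ("CONSTANT", "CONSTANT"): ["STRICT_SUBSET", "ELEMENT", "SUBSET"],
}


def interpret(readings):
    """Drop sortally incompatible readings and expand the remaining ones."""
    results = []
    for container, containee in readings:
        # In "CONSTANT 0_FORMULA" the contained object is the constant;
        # the partial formula only restricts it (here: "and x ∈ A").
        containee_type = containee.split()[0]
        for relation in SPECIALIZATIONS.get((container, containee_type), []):
            results.append((container, containee, relation))
    return results


for interpretation in interpret(READINGS):
    print(interpretation)
```

Under these assumptions the first two readings yield nothing, reading (i) yields one interpretation and reading (ii) yields three, matching the four interpretations listed in the text.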
7. Conclusions and future research

In this paper, we presented methods for analyzing telegraphic natural language text with embedded formal expressions. We are able to deal with major phenomena observed in a corpus study on tutorial dialogs about proving mathematical theorems, as carried out within the DIALOG project. Our techniques are based on an interplay between a formula interpreter and a linguistic parser which consult an enhanced domain knowledge base and a mapping lexicon.

Given the considerable demand on interpretation capabilities imposed by tutoring system contexts, it is hardly surprising that we are still at the beginning of our investigations. The most obvious extension for meeting tutoring purposes concerns dealing with errors in a cooperative manner. This requires the two analysis modules to interact in an even more interwoven way. Another extension concerns the domain-adequate interpretation of semantically complex operators such as 'vice-versa' as in (13) in Fig. 2. 'Vice-versa' is ambiguous here in that it may operate on immediate dependent relations or on the embedded relations. The utterance "and this also holds vice-versa" in (13) may be interpreted as "alle K(B) in A enthalten sind" ("all K(B) are contained in A") or "alle B in K(A) enthalten sind" ("all B are contained in K(A)"), where either the immediate dependent of the head enthalten and all its dependents in the Location relation are involved (K(B)), or only the dependent embedded under General Relation (complement, K). Similarly, "head switching" operators require a more complex definition. For example, the ontology defines the theorem SYMMETRY (or similarly DISTRIBUTIVITY, COMMUTATIVITY) as involving a functional operator and specifying a structural result. On the other hand, linguistically, "symmetric" is used predicatively (symmetry is predicated of a relation or function).
A further extension, yet to be completed, concerns modeling actions of varying granularity that impose changes on the proof status. In the logical system, this is merely expressed as various perspectives of causality, based on the underlying proof calculus. Dealing with all these issues adequately requires the development of more elaborate knowledge sources, as well as informed best-first search strategies to master the huge search space that results from the tolerance of various kinds of errors.

We have investigated flexible interpretation of utterances as observed in our study primarily for use in tutoring systems. This enterprise is important not only for this primary purpose, but also for improving the accessibility and usability of formal systems that support domain experts, for example to enhance the working environment of mathematicians (see the initiative related to managing mathematical knowledge [2]).
References

[1] V. Aleven, K. Koedinger, The need for tutorial dialog to support self-explanation, in: Papers from the 2000 AAAI Fall Symposium on Building Dialogue Systems for Tutorial Applications, AAAI Press, Cape Cod, MA, 2000, pp. 65–73.
[2] A. Asperti, B. Buchberger, J.H. Davenport (Eds.), Mathematical Knowledge Management, Second International Conference, MKM 2003, Bertinoro, Italy, Lecture Notes in Computer Science, vol. 2594, Springer, 2003.
[3] J. Baldridge, Lexically specified derivational control in combinatory categorial grammar, Ph.D. Thesis, University of Edinburgh, Edinburgh, 2002.
[4] J. Baldridge, G.-J. Kruijff, Coupling CCG with hybrid logic dependency semantics, in: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL-02), Philadelphia, PA, 2002, pp. 319–326.
[5] J. Baldridge, G.-J. Kruijff, Multi-modal combinatory categorial grammar, in: Proceedings of the 10th Annual Meeting of the European Chapter of the Association for Computational Linguistics (EACL-03), Budapest, Hungary, 2003, pp. 211–218.
[6] C. Benzmüller, A. Fiedler, M. Gabsdil, H. Horacek, I. Kruijff-Korbayová, M. Pinkal, J. Siekmann, D. Tsovaltzi, B. Vo, M. Wolska, Tutorial dialogs on mathematical proofs, in: Proceedings of IJCAI-03 Workshop on Knowledge Representation and Automated Reasoning for E-Learning Systems, Acapulco, Mexico, 2003, pp. 12–22.
[7] C. Benzmüller, A. Fiedler, M. Gabsdil, H. Horacek, I. Kruijff-Korbayová, M. Pinkal, J. Siekmann, D. Tsovaltzi, B. Vo, M. Wolska, A Wizard-of-Oz experiment for tutorial dialogues in mathematics, in: AIED2003: Supplementary Proceedings of the 11th International Conference on Artificial Intelligence in Education, Sydney, Australia, 2003, pp. 471–481.
[8] R.J. Brachman, J. Schmolze, An overview of the KL-ONE knowledge representation system, Cognitive Science 9 (2) (1985) 171–216.
[9] A. Fiedler, M. Gabsdil, Supporting progressive refinement of Wizard-of-Oz experiments, in: Proceedings of the ITS 2002 Workshop on Empirical Methods for Tutorial Dialogue, San Sebastian, Spain, 2002, pp. 62–69.
[10] A. Fiedler, D. Tsovaltzi, Automating hinting in mathematical tutorial dialogue, in: Proceedings of the EACL-03 Workshop on Dialogue Systems: Interaction, Adaptation and Styles of Management, Budapest, Hungary, 2003, pp. 45–52.
[11] H. Horacek, A. Fiedler, A. Franke, M. Moschner, M. Pollet, V. Sorge, Representation of mathematical objects for inferencing and for presentation purposes, in: Proceedings of the 17th European Meetings on Cybernetics and Systems Research (EMCSR-04), Vienna, Austria, 2004, pp. 683–688.
[12] H. Horacek, M. Wolska, Interpretation of potentially ambiguous statements in mathematics, in: Proceedings of the 7th Konferenz "Verarbeitung natürlicher Sprache" (KONVENS-04), Vienna, Austria, 2004, pp. 65–72.
[13] M. Kohlhase, A. Franke, MBase: Representing knowledge and context for the integration of mathematical software systems, Journal of Symbolic Computation 32 (4) (2000) 365–402.
[14] G.-J. Kruijff, A categorial-modal logical architecture of informativity: dependency grammar logic and information structure, Ph.D. Thesis, Charles University, Prague, 2001.
[15] E. Melis, J. Büdenbender, É. Andres, A. Frischauf, G. Goguadze, P. Libbrecht, M. Pollet, C. Ullrich, ACTIVEMATH: a generic and adaptive web-based learning environment, Artificial Intelligence in Education 12 (4) (2001) 385–407.
[16] J. Moore, What makes human explanations effective? in: Proceedings of the 15th Annual Conference of the Cognitive Science Society, Erlbaum, Hillsdale, NJ, 1993.
[17] Available from: SFB 378 .
[18] P. Sgall, E. Hajičová, J. Panevová, The Meaning of the Sentence in its Semantic and Pragmatic Aspects, Reidel Publishing Company, Dordrecht, 1986.
[19] J.H. Siekmann, C. Benzmüller, V. Brezhnev, L. Cheikhrouhou, A. Fiedler, A. Franke, H. Horacek, M. Kohlhase, A. Meier, E. Melis, M. Moschner, I. Normann, M. Pollet, V. Sorge, C. Ullrich, C.-P. Wirth, J. Zimmer, Proof development with ΩMEGA, in: Proceedings of the 18th Conference on Automated Deduction, Copenhagen, Denmark, 2002, pp. 144–149.
[20] TRINDI project: Available from: .
[21] M. Wolska, I. Kruijff-Korbayová, Analysis of mixed natural and symbolic language input in mathematical dialogs, in: Proceedings of the 42nd Meeting of the Association for Computational Linguistics (ACL-04), Barcelona, Spain, 2004, pp. 25–32.
[22] M. Wolska, I. Kruijff-Korbayová, Building a dependency-based grammar for parsing informal mathematical discourse, in: Proceedings of the 7th International Conference on Text, Speech and Dialogue (TSD-04), Lecture Notes in Computer Science, vol. 3206, Springer, Brno, Czech Republic, 2004, pp. 645–652.

Helmut Horacek is a senior researcher at Saarland University, Germany. He studied computer science at the Technical University of Vienna, Austria. He received his doctoral degree in technical sciences from that university in 1982.
Since then, he has worked on research projects related to natural language processing at the University of Vienna, and later at the University of Hamburg, Germany. From 1989 to 1995 he held a position as an assistant professor for computational linguistics at the University of Bielefeld. From 1995 to 1996 he was an associate professor for information systems at the University of Constance. Since 1996, he has been involved in various projects at the German Research Center for Artificial Intelligence (DFKI) and at Saarland University, both in Saarbrücken, Germany. His research areas include natural language generation, dialog modeling, discourse analysis and planning, pragmatics, knowledge representation, searching, game-playing, and tutorial systems. He has published over 100 papers in journals and conference proceedings, is a regular reviewer for top journals, and has served as a programme committee member for several international conferences and workshops.