Int. J. Man-Machine Studies (1991) 35, 35-67
PROLEXS: creating law and order in a heterogeneous domain†
R. F. WALKER, A. OSKAMP, J. A. SCHRICKX, G. J. VAN OPDORP AND P. H. VAN DEN BERG
Computer/Law Institute, Vrije Universiteit, De Boelelaan 1105, 1081 HV, Amsterdam, The Netherlands

This article defines a heterogeneous domain as a domain in which typical problems can only be solved by combining several distinctive knowledge sources. The legal domain, in this view, must be considered heterogeneous, since classical rule-based knowledge sources like legislation cooperate with expertise and case-law, both possibly represented quite differently, to produce useful results. The article continues with a description of the architecture of PROLEXS, an expert system built to operate in such domains. An example dialogue is added to sketch the problems of building knowledge-based systems in heterogeneous domains and the way PROLEXS approaches those problems. Finally, the current PROLEXS research involving neural networks and case-based reasoning is introduced.
1. Introduction
The research of the Computer/Law Institute at the Vrije Universiteit of Amsterdam on legal expert systems converges in the PROLEXS project. In the course of this project we were confronted with a number of problems in building legal expert systems. In this paper we describe one of the fundamental problems: the diverse nature of the legal sources, both with regard to representation and with regard to legal reasoning. The most natural solution to this problem seems to us to be an independent representation of the knowledge sources, obtained by dividing the domain and defining a mechanism to control the various knowledge sources. Not only should the shell account for this diversity; a knowledge acquisition and formalization methodology also had to be designed. In this paper the approach taken by the PROLEXS project with respect to the above problems is discussed. We explain the working of the shell and give an example of a dialogue with the prototype on Dutch landlord-tenant law. We start with a short explanation of the assumptions and theories which form the points of departure.
† This article is a completely revised and significantly extended version of our paper entitled "PROLEXS, DIVIDE AND RULE: a legal application", which appeared in the Proceedings of the Second International Conference on AI and Law, Vancouver, Canada, 1989. PROLEXS stands for PROtotype of a Legal EXpert System. The project is sponsored by the Dutch Computer Science Stimulation Plan (SPIN), Apollo Computer Inc. (since 1986) and IBM (since 1988).
0020-7373/91/010035+33$03.00/0 © 1991 Academic Press Limited

2. Points of departure of PROLEXS
2.1. AIM OF THE PROLEXS PROJECT
The aim of the PROLEXS project is twofold. We want to build a shell or toolkit for building legal expert systems, which should do justice to the specific nature of law and its special requirements, e.g. relating to the use of vague concepts. As we realize that offering only such a shell or toolkit will hardly make it easier to build a legal expert system (the hardest part being to fill the system with knowledge), we want at the same time to offer a methodology for building one's own legal expert system with the help of this shell/toolkit. To accomplish this, and at the same time get the opportunity to develop (further) and test theories on the various aspects of the formalization of legal knowledge, we are also developing a prototype of an expert system on Dutch landlord-tenant law.
In the PROLEXS project we adopt an overall approach relating to the development of legal expert systems,† which means that we try to cover the various aspects from an interdisciplinary point of view: building a shell, knowledge acquisition, formalization and representation of legal knowledge, conditions for building legal expert systems, etc. The result of our research should not only be a prototype of a legal expert system which can be used in practice, but should also give a strong theoretical foundation for legal expert systems in general. The project group is multi-disciplinary and consists of lawyers, computer scientists, a logician and a psychologist who performs related statistical research.

2.2. TASK OF THE LANDLORD-TENANT LAW SYSTEM
The task we want our present prototype to perform is to offer legal advice at an expert level in relatively simple cases that often occur. One could call this "first aid" legal help. However, the fact that these cases occur frequently and can thus be considered standard cases does not mean that expert knowledge is unnecessary for solving the legal problems they pose. On the contrary, although the type of case may be standard, the individual circumstances offer a wide variety of possibilities. Giving an adequate solution to the greater part of these cases therefore requires a broad knowledge of landlord-tenant law as well as expert knowledge. Handling these cases at an expert level, a task for which experts often will not find the time, will improve the level of legal aid.
In order to reach our goal of potential usefulness we cooperate closely with "Rechtshulp VU", a legal clinic which has long experience in the kind of help we want our expert system to offer. At the same time this gives us the advantage of having an "in-house" expert and relatively easy access to the specific (legal) knowledge necessary to perform the task. An additional advantage is that the task is specific and well-circumscribed. It concerns a specific kind of case, concentrating on the renting of houses, and specific standards for the output: the same output as is given by the legal clinic, which includes forms to be filled in, letters to be sent to the landlord, advice, etc. (van den Berg & Oskamp, 1986). However, when a case has to be taken to court, the client is advised to consult a professional lawyer. At that point the system can give all the information it has gathered: a legal analysis, references to statutes, to directives, to cases, etc., together with an indication of the relevance of those references to the case at hand, and, if appropriate, advice on how to handle the case (Oskamp, 1990).
† We also see legal expert systems as part of integrated legal information systems, integrating expert systems with, for instance, legal information retrieval systems (Oskamp & Vandenberghe, 1986). In 1990 we will try to connect PROLEXS to the Kluwer legal databank.
2.3. USER ORIENTED APPROACH
It is our intention to pull legal expert systems out of the laboratory environment (cf. Susskind, 1987a, 1987b) and make them potentially useful in practice. For that purpose we developed a specific procedure to define both a suitable domain and a suitable task, and to select the part of the expert knowledge necessary to ensure this potential usefulness. In this procedure, which we call the user oriented approach and which has been published elsewhere (Oskamp & Vandenberghe, 1986; Oskamp, 1986), the future users of the system hold a prominent place, for it is they who will in fact have to use the expert system. Both the task and the domain (landlord-tenant law) of our prototype were chosen following this approach. A further important consideration is that, in our view, users of legal expert systems in general, and users of PROLEXS in particular, must have a basic knowledge of law. They have to classify the problem as a legal problem and they have to consider whether the problem falls within the domain of the expert system (Leith, 1984; Greenleaf, Mowbray & Tyne, 1987).
3. A model of legal knowledge representation
3.1. SOURCES OF KNOWLEDGE FOR THE SYSTEM
The idea behind the knowledge representation in PROLEXS (both the general shell and the landlord-tenant law system) is that an expert system should, as far as possible, use at least the same knowledge that an expert performing the same task would need. Of course, much depends on the kind of task, but for most tasks this implies that the knowledge should at least come from the traditional sources of law: the statutes (we call this the legislation group), case law (the case law group), and all treatises trying to explain vague and ambiguous concepts, such as governmental and municipal directives and papers and books from leading authors in the field (the legal doctrine group). However, this will not be sufficient, as an expert also uses other kinds of knowledge, gathered by experience (the expert knowledge group). This knowledge is essential for expert performance (Hayes-Roth, Waterman & Lenat, 1983).
Furthermore, the system will need legal meta-knowledge, such as expert strategies, in order to perform its task in the best way possible. In legal meta-knowledge two levels can be distinguished: general legal meta-knowledge and specific legal meta-knowledge. General legal meta-knowledge is applicable to the whole legal domain and contains knowledge referring to the hierarchy of legal sources. Of course, this general legal meta-knowledge depends heavily on the legal system. In the Dutch legal system the hierarchy of legal sources consists of statutes, case law and legal doctrine, of which statutes are the most important group. Another example of general legal meta-knowledge is the hierarchy of courts. Specific legal meta-knowledge is more closely related to the domain and the task of the expert system and consists, for instance, of expert strategies (Oskamp, 1990). In the present version of PROLEXS no distinction is made between general and specific legal meta-knowledge. However, it will be possible to implement such a distinction in a later version.
3.2. STRUCTURE OF A LEGAL KNOWLEDGE BASE
We developed a specific structure for organizing legal knowledge bases, which we described more fully in previous papers (Oskamp, 1989) and which will be the main subject of sections 4 to 6. The idea behind it is that all knowledge is represented only once and is basically grouped according to its origin, either from the traditional sources of law (we call these knowledge sources legislation, case law and legal doctrine) or from the expert (the expert knowledge source). In this way the status of a specific piece of knowledge will be clear, as it is very easy to trace its origin. Meta-level knowledge (found in the combination mechanism, which will be discussed next) makes it possible to state the importance (and priority) of a piece of knowledge in relation to other pieces. Meta-level knowledge may, for instance, state that knowledge coming from statutes has a higher status than knowledge from case law, or it may comprise expert strategies.
Distinguishing the various kinds of knowledge that are to be represented in a legal knowledge base and grouping them according to their origin makes it possible to represent each in the way best suited to its nature. Legal knowledge, and consequently the knowledge of legal expert systems, is too diverse for a single format to represent all kinds of knowledge in the best possible way. A single format would not do justice to the special character of statutes, case law, etc., nor to the way in which lawyers reason with rules, cases, etc. (Ashley, 1988; Rissland & Skalak, 1988). It is not yet clear which technique is best for representing each specific kind of knowledge. In PROLEXS we started by using rules for knowledge coming from statutes, legal doctrine and for expert knowledge, and by using a kind of frame for case law. This turned out to be sufficient for the moment, but not optimal.
We are looking into the possibility of distinguishing the various kinds of knowledge within the knowledge sources and of finding the best technique for each kind (Oskamp, 1990). Within the knowledge source legislation, for example, we can distinguish rules and definitions; the first can best be represented in production rules, which will probably not be the most efficient representation format for the second. Within the knowledge sources we will then find knowledge groups, each with its own knowledge representation language (KRL). (In PROLEXS at this moment a knowledge source is represented using only one KRL, so each knowledge source is also a knowledge group; this, however, is not enforced by the shell.) For this reason we find it absolutely necessary that a legal expert system should be able to reason with different kinds of knowledge representation formats. The PROLEXS system offers this possibility; it can reason with rules and frames and can easily cope with other knowledge representations as well (Walker & van den Berg, 1988).
3.3. COMBINATION MECHANISM
Interrelating the knowledge from those basic knowledge sources is the task of the "combination mechanism". This is a general term for meta-level knowledge, and it is partly found in the further structuring of the knowledge base. The combination mechanism takes care of the interrelation of the knowledge. For, although grouping knowledge according to its origin makes it easier to keep an overview and helps maintenance of the knowledge base, it hardly adds to, for instance, the performance and the transparency of the system. Interrelating the knowledge across knowledge sources can be effectuated by putting a structure over the basic knowledge groups. This structure does not affect the knowledge itself, but it guides the consultation and use of the knowledge. The structure does not have to be the same for every legal expert system, but can be acquired in relation to task and domain. An example of such a structure is to let the combination mechanism group clusters of knowledge necessary to solve a specific problem. The combination mechanism does not hold the knowledge itself but only relates this knowledge via references. So the knowledge is only registered once and can be used for any kind of legal problem, provided it falls within the scope and domain of the system. The interrelation of knowledge also implies that the combination mechanism decides which knowledge to consult and in which order (Gardner, 1987).† In order to effectuate this interrelation optimally, the combination mechanism depends on expert knowledge (Susskind, 1987b).‡
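The reference-based clustering described in this section can be illustrated with a minimal Python sketch. It is not the PROLEXS implementation: the identifiers, the example entries and the helper `consult` are hypothetical, chosen only to show knowledge being registered once per source and being related to a problem purely by reference, in an order fixed by the combination mechanism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeItem:
    item_id: str   # unique key; the item is registered exactly once
    source: str    # origin: legislation, case law, legal doctrine or expert
    content: str   # placeholder text; a real system would hold a rule or frame


# Basic knowledge sources: every piece of knowledge appears here once.
# All entries are invented for illustration.
KNOWLEDGE = {
    item.item_id: item
    for item in [
        KnowledgeItem("stat-1", "legislation", "statute on rent reduction"),
        KnowledgeItem("stat-2", "legislation", "statute on applicability"),
        KnowledgeItem("case-1", "case law", "case on short-term use"),
        KnowledgeItem("exp-1", "expert", "expert heuristic on maintenance"),
    ]
}

# Combination mechanism: clusters hold ordered references, not knowledge.
CLUSTERS = {
    "rent reduction": ["stat-1", "stat-2", "case-1"],
    "maintenance": ["stat-2", "exp-1"],
}


def consult(problem: str) -> list[KnowledgeItem]:
    """Resolve a cluster's references, in the order the cluster prescribes."""
    return [KNOWLEDGE[item_id] for item_id in CLUSTERS[problem]]


if __name__ == "__main__":
    for item in consult("rent reduction"):
        print(item.source, "-", item.content)
```

Because both clusters refer to `stat-2` rather than copying it, a correction to that item is made in one place and is seen by every problem that uses it.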
† Gardner (1987) proposes a similar idea referring to the sequence of legal events (p. 123).
‡ On pp. 56-61 Susskind states that it is better not to embody expert knowledge in legal expert systems yet. In our opinion this expert knowledge is absolutely necessary to open the possibility of even potential usefulness of legal expert systems.

4. The PROLEXS architecture
A typical problem in a heterogeneous domain can only be solved by combining several structurally different knowledge sources. This definition of a heterogeneous domain immediately generates a number of challenges to anyone building knowledge-based systems (KBS) in such a domain.
(1) Each of the different knowledge sources is probably best represented in its own specific knowledge representation language (KRL). In addition, each KRL needs its own dedicated inference engine.
(2) Once the different knowledge sources have been satisfactorily represented, they must interact to solve any problem that addresses more than one knowledge source; by the above definition most problems do.
(3) Supposing these knowledge sources have ways to interact and share conclusions, a control mechanism must be defined to govern the behavior of their respective inference engines.
Furthermore, it would be most convenient if this control mechanism could be as ignorant as possible about the actual implementation of the knowledge sources, for the following reason. Contemporary KBSs are flexible, that is, they separate the knowledge base from the shell. Obviously, this facilitates building a KBS for a new domain: the knowledge base can be changed without affecting the shell. Surprisingly, the same strategy does not seem to be applied to the inference engine. Most KBSs incorporate the inference engine in the shell instead of in the knowledge base. This is rather peculiar, since one can hardly expect the inference engine to be static if the KRLs are flexible.
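The three challenges can be made concrete with a small Python sketch. It rests on several assumptions that are not taken from PROLEXS: the `KnowledgeGroup` interface, the naive round-robin loop in `run`, and all facts, weights and the threshold are hypothetical. The point is only that each knowledge group carries its own representation and its own inference step, while the control loop knows nothing beyond a shared `cycle` method.

```python
from typing import Protocol


class KnowledgeGroup(Protocol):
    """All the control loop needs to know about a knowledge group."""

    def cycle(self, facts: set[str]) -> set[str]:
        """Return conclusions newly derivable from the shared facts."""
        ...


class RuleGroup:
    """Rule-based group: production rules with one-step forward chaining."""

    def __init__(self, rules: list[tuple[frozenset[str], str]]) -> None:
        self.rules = rules  # (preconditions, conclusion) pairs

    def cycle(self, facts: set[str]) -> set[str]:
        return {c for pre, c in self.rules if pre <= facts and c not in facts}


class CaseGroup:
    """Frame-like group: a case applies when its weighted facets pass a threshold."""

    def __init__(self, cases: list[tuple[dict[str, int], int, str]]) -> None:
        self.cases = cases  # (facet weights, threshold, abstraction) triples

    def cycle(self, facts: set[str]) -> set[str]:
        derived = set()
        for weights, threshold, abstraction in self.cases:
            score = sum(w for facet, w in weights.items() if facet in facts)
            if score > threshold and abstraction not in facts:
                derived.add(abstraction)
        return derived


def run(groups: list[KnowledgeGroup], initial: set[str]) -> set[str]:
    """Naive control loop: give every group a turn until nothing new appears."""
    facts = set(initial)
    while True:
        new = set().union(*(g.cycle(facts) for g in groups))
        if not new:
            return facts
        facts |= new


if __name__ == "__main__":
    # Invented knowledge; the threshold of 100 is an assumption.
    legislation = RuleGroup([
        (frozenset({"tenant lives in house", "monthly rent paid",
                    "use not short-term"}), "act applicable"),
    ])
    case_law = CaseGroup([
        ({"landlord works elsewhere": 60, "contract of five years": 50},
         100, "use not short-term"),
    ])
    print(run([legislation, case_law],
              {"tenant lives in house", "monthly rent paid",
               "landlord works elsewhere", "contract of five years"}))
```

In this sketch the rule group can only conclude that the act applies after the case group has contributed its abstraction, which is the kind of interaction challenge (2) refers to. Replacing the round-robin loop with something more selective is the subject of the control mechanism introduced next.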
Before going into a detailed description of the PROLEXS architecture in section 6, an extended example dialogue is presented in section 5, where the reasoning process of Charlotte, a human expert on landlord-tenant law, is compared with that of PROLEXS. To appreciate the example, a global introduction to the PROLEXS control mechanism follows first. References to figures in section 6 are made only to give an impression; the figures are explained more thoroughly in that section.
To address the different knowledge sources, each of which has unique characteristics according to the aforementioned definition of heterogeneous domains, PROLEXS defines knowledge groups. Each knowledge group has its own KRL and dedicated inference engine. The four knowledge groups in the current PROLEXS implementation are legislation, legal doctrine, expert knowledge and case-law. A typical problem in the legal domain accesses these four knowledge groups. The inference engines of the independent knowledge groups interact using a blackboard. The inference engine of a knowledge group may contain several reasoning methods. The legislation knowledge group, for instance, is a rule-based system and its inference engine comprises both backward and forward chaining. Conclusions derived in one knowledge group are written on the blackboard in a standard format† and are therefore readable by all inference engines (Figure 2).
The knowledge groups form the lowest level in the PROLEXS knowledge model, a three-layered structure intended to capture domain knowledge (in the knowledge groups) in the first level, task-oriented meta-knowledge in the second, and strategy-oriented meta-knowledge in the third. The second level of the knowledge model describes logical groups. Each logical group contains domain knowledge from the knowledge groups relevant to a certain topic.
The logical group maintenance, for instance, will contain a number of maintenance-related cases from the case-law knowledge group in addition to all statutes concerning maintenance present in the knowledge group legislation (Figure 6). Without getting into details at this point (section 6.2 treats this more extensively), it should be obvious that such a construction comes in handy if PROLEXS is confronted with a maintenance problem. To PROLEXS it is immediately clear which domain knowledge is related to maintenance and can be applied to the problem. No meandering through irrelevant chunks of knowledge is necessary.

† This format only describes factual knowledge like maintenance required or contract valid. Such observations are independent of the KRLs used to deduce them.

The top level of the knowledge model is occupied by the so-called classification network (Figure 7). This network contains links among the logical groups situated on the second level. Several types of links exist, but these will be treated in section 6.3 on the classification network. The links are made by the expert and represent strategy-related chains of logical groups. The network informs PROLEXS, for instance, that in order to get a rent reduction the maximum rent must be known, which can only be derived given the quality-index of the apartment. This quality-index can only be calculated once it has been verified that the landlord-tenant law is indeed applicable, which can only be known for sure by eliminating all exceptions. In the network this simple strategy appears as consecutive links between the logical groups rent reduction and exceptions.
The name classification network refers to the slightly less important second task of this third layer: classifying a problem. PROLEXS is able to find out which logical group is most relevant with respect to the initial facts. On the classification network level the location of this logical group within the domain can be used to mobilize initially relevant knowledge and reasoning methods to solve the problem.
At this point the knowledge model has been described: domain knowledge is represented in a number of KRLs (rules, frames and neural networks in the current implementation). On top of those, logical groups are formed, clustering domain knowledge with respect to certain topics and thereby providing PROLEXS, in large heterogeneous domains, with a much-needed sense of relevance. Finally, those logical groups are networked to represent probable lines of thought. As will be made clear in the description of the PROLEXS control mechanism, however, this meta-knowledge is not to be applied strictly to every problem; it merely offers expert-supplied guidelines.
The PROLEXS control mechanism is called the agenda handler, since it keeps the agenda for (all the reasoning methods within) the inference engines associated with the various knowledge groups. It is the agenda handler which dictates which reasoning method may proceed. The selection procedure takes into account feedback from the reasoning methods themselves. This feedback, called a report, tells the agenda handler:
(1) whether the reasoning method has been successful in deducing new facts or perhaps even in reaching the user-supplied goals;
(2) which part of the knowledge base the reasoning method is planning to use next (expressed as a list of logical groups); and
(3) the expected workload should the reasoning method get permission to continue.
In addition to this information, which is bundled in a report and submitted to the agenda handler by each reasoning method after each cycle,† the agenda handler uses the strategy laid down in the classification network.
† What is done during a cycle depends on the reasoning method: one cycle of forward chaining (which, together with backward chaining, forms the inference engine of the legislation knowledge group) involves checking all possible deductions that can be made in one step from the current blackboard with respect to those rules that forward chaining is allowed to use in this cycle. (The rules which forward chaining is allowed to use at one time are dynamically adjusted. This process will be treated shortly.)

If this strategy informs the agenda handler that the expert in the same situation would examine the contract, the agenda handler is likely to select a reasoning method which proposes the same strategy (using the second item on the report). Besides selecting the reasoning method for the next cycle, the agenda handler may activate a new logical group (contracts, say).
Activation of logical groups is a very important aspect of the PROLEXS control mechanism. Only knowledge inside active logical groups is visible to the inference engines. By placing logical groups on the active list, the agenda handler is able to let the inference engines concentrate on the topics it selects. Recall that by activating the logical group maintenance all knowledge related to maintenance will become active. This might involve a number of rules inside the legislation knowledge group and several frames inside the case-law knowledge group. The selection of the logical group to be activated is based on a comparison of all reports received from the reasoning methods (second report item) on the one hand and the strategic meta-knowledge from the classification network on the other. Obviously, if a reasoning method is successful and able to proceed without additional knowledge, no new logical group is activated.
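The selection step can be sketched in Python as follows. The article specifies which information the agenda handler receives, not how it weighs that information, so the ordering used in `score` (success first, then agreement with the expert strategy, then low workload) is an assumption, as are the names `Report` and `select`. For brevity the sketch also considers only the proposals of the chosen reasoning method, whereas the text compares all received reports.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Report:
    method: str                     # reasoning method submitting the report
    new_facts: int                  # item 1: facts deduced in the last cycle
    proposed_groups: list[str] = field(default_factory=list)  # item 2
    workload: int = 1               # item 3: expected cost of the next cycle


def select(reports: list[Report], strategy: set[str],
           active: set[str]) -> tuple[str, Optional[str]]:
    """Pick the next reasoning method and, if needed, a logical group to activate.

    `strategy` holds the logical groups the classification network suggests
    for the current situation; `active` holds the groups already activated.
    """
    def score(report: Report) -> tuple[bool, bool, int]:
        follows_expert = any(g in strategy for g in report.proposed_groups)
        # Assumed preference order; the source does not fix one.
        return (report.new_facts > 0, follows_expert, -report.workload)

    best = max(reports, key=score)
    needed = [g for g in best.proposed_groups if g not in active]
    for group in needed:
        if group in strategy:       # proposal agrees with the expert strategy
            return best.method, group
    # A successful method needing no extra knowledge activates nothing.
    return best.method, (needed[0] if needed else None)


if __name__ == "__main__":
    reports = [
        Report("forward chaining", 0, ["maintenance"], 5),
        Report("backward chaining", 2, ["applicability"], 3),
    ]
    print(select(reports, strategy={"applicability"}, active={"rent reduction"}))
```

Returning `None` for the group mirrors the last remark of this section: when the selected method can proceed with the knowledge that is already active, the active list is left unchanged.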
5. An example of a PROLEXS dialogue
Presenting a case to a legal expert system poses a number of problems. The problem representation might be a statement in plain English or a highly abstract scheme in predicate logic. Neither is ideal: expert systems have great difficulty understanding natural language unless the domain is very small (like a blocks world), and it goes without saying that the legal profession relies heavily on language and is not restricted to a manageable idiom. PROLEXS therefore confronts the user with a large number of attributes. The user assigns values to any number of attributes that seem applicable to the case. Attributes are clustered using the aforementioned logical groups, and selecting relevant attributes is a hierarchical search, in which selecting a logical group (i.e. a topic, maintenance for instance) reveals all associated attributes (leaking roof, lack of heating, faltering electricity, etc.). Any attribute which has been associated with a value is called a fact and is written on the blackboard. All user-supplied facts together form the current fact situation and are used to find initially relevant knowledge, the starting focal point of the reasoning process (more about that in section 6.3). This is the second task of the classification network referred to in section 4 (Figure 1). The attributes shown to the user are an expert-supplied subset of all facts that can be processed by the PROLEXS inference engine. It is therefore not possible to feed facts to the system that it cannot digest. In other words: the user is restricted to attributes the system can understand and is not allowed to describe the case in his or her own words.
The presented PROLEXS attributes are carefully chosen by the expert; a number of facts that can be processed by PROLEXS are nonetheless not shown to the user due to their vagueness or complexity (in either case it would be unwise to encourage the user to interpret such attributes with respect to the current case).
FIGURE 1. A simple classification network used in the example dialogue. (Legible node labels: Rent reduction, Quality-index, Maintenance.)
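The classification task can be sketched in a few lines of Python. The sketch assumes, as the example dialogue below suggests, that the logical group sharing the most attributes with the initial facts is chosen, with the presence of the goal as a tie-breaker. The attribute sets in `LOGICAL_GROUPS` and the function `classify` are hypothetical and do not reproduce the actual PROLEXS knowledge base.

```python
# Hypothetical assignment of attributes to logical groups.
LOGICAL_GROUPS = {
    "rent reduction": {"rent", "rent reduction", "exclusive entrance", "zip-code"},
    "contracts": {"contract status", "contract length"},
    "maintenance": {"leaking roof", "lack of heating", "faltering electricity"},
}


def classify(fact_attributes: set[str], goal: str) -> str:
    """Return the logical group most relevant to the initial fact situation."""
    def score(group: str) -> tuple[int, bool]:
        attributes = LOGICAL_GROUPS[group]
        # Overlap with the initial facts first, presence of the goal second.
        return (len(attributes & fact_attributes), goal in attributes)

    return max(LOGICAL_GROUPS, key=score)


if __name__ == "__main__":
    initial = {"rent", "zip-code", "exclusive entrance",
               "contract status", "contract length"}
    print(classify(initial, goal="rent reduction"))
```

The group returned here would serve as the starting focal point; from there the links of the classification network suggest which groups to consider next.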
PROLEXS has recently been installed as a prototype at a legal clinic associated with the university. Clients of this clinic are normally advised by law students assisted by an expert. It is this expert's knowledge that has been put in the expert knowledge group. PROLEXS is operated by such a law student; it will not interact directly with the client. The students must be able to recognize the problem as a legal case and as a case on landlord-tenant law. They also need to be able to extract the key facts of the case. By using PROLEXS the law student should be able to process more cases, because the expert is consulted less frequently.
In the following dialogue PROLEXS' simulation of the expert is placed in square brackets. A hypothetical dialogue with a human expert is added, which is based on a transcript of one of our expert interviews during the knowledge acquisition phase. Since only a brief introduction has so far been given to the PROLEXS control mechanism, the number behind the opening bracket will be referred to in section 6, where the PROLEXS knowledge model is treated more elaborately.
Let us consider the following case: Joren has just moved into a house near the center of Amsterdam. After having signed the contract he wonders whether he is paying too much rent and whether a reduction is possible. Therefore Joren consults Charlotte, an expert at a legal clinic. The following facts are subjoined: the rent is fl. 315 a month; the address is Rembrandtstreet 27, Amsterdam, first floor; the house possesses an exclusive entrance; the landlord has let the house for five years as he will stay and work in Utrecht during this time.
[1. Presenting a case to PROLEXS proceeds by writing facts on the blackboard. The exact process is hidden from both the client and the operator; the operator only selects facts on screen.
Having selected an attribute (for instance (rent)), a value must be associated with it (fl. 315 in this case), after which PROLEXS writes the fact on the blackboard. The operator will abstract some of the very specific data: an address to a zip-code, for instance. Being a law student, the operator is also able to extract the essentials from the case description as submitted by the client. The blackboard is initially not empty, since before any session a number of predefined defaults have already been written on it. Possible defaults may include the fact that the contract is valid. Of course, any sign to the contrary will eliminate the default value, which has very low priority.† In this case the amount of rent, the zip-code and the exclusive entrance are mentioned to the system, as well as the goal: a rent reduction. The phrase about the contract being for five years is translated to (contract status fixed-period) and (contract length 00/00/05).‡ During this interpretative step the operator is guided by the shell, which displays a number of domain topics (i.e. the logical groups). By selecting a logical group like contracts, the attributes within this group are listed. From those attributes (due to the nature of logical groups all are contract related) the actual case description is built. In the following, such details about the user interface will be left out.]

† Priorities are assigned to facts and indicate the quality of the information used to deduce the fact. Defaults which are based on weak statistical observations have zero priority.
‡ PROLEXS distinguishes dates and periods, with notation dd/mm/yy. It provides operators like date+ (date, period), yielding a date; date- (date, date), yielding a period; and date- (date, period), which yields a date. More operators exist, as well as predicates like before, after, smaller_than, etc.

Charlotte approaches the problem in the following way. She knows that the problems of rent reduction are dealt with in the so-called landlord-tenant law. First, she wants to make sure that this act is really applicable in this case. Taking into consideration that the client actually lives in the house and pays a monthly rent, this conclusion is easily drawn, unless an exception arises. The landlord-tenant law contains one major exception: when the use of a house is short termed by nature, the act is not applicable. In a typical session, this rare exception will be ignored temporarily by the expert.
[2. PROLEXS, using the initial facts, activates (by means of its agenda handler) the logical group rent reduction, since this group contains more facts mentioned in the factual case description (initial facts) than any other group and it contains the goal as well. In this logical group rule 1 (see appendix) is found by backward chaining, one of the reasoning mechanisms defined on the legislation knowledge group. Rule 1 is a logical choice from a backward point of view, because it assigns a value (YES or NO) to (rent reduction). Another possible candidate might have been rule 2, but this rule has been assigned a lower priority by the expert; this lower priority reflects the expert's opinion that a legal argument becomes weaker if a reference to (multi-interpretable) maintenance problems is included. After having used rule 1 successfully, backward chaining proposes to the agenda handler the logical group applicability-landlord-tenant-law, since it needs rule 4, which is part of that group. Rule 4 is necessary to satisfy the first precondition of rule 1. The agenda handler complies with the request, due to the successful history of backward chaining, and selects this reasoning method for the next cycle.
By activating this new group, rule 4 becomes active and backward chaining can easily check the applicability of the landlord-tenant law. The exception mentioned in rule 4 (3rd precondition) is false by default.] To determine which part of the law is applicable Charlotte needs information about the type of house, for example whether it concerns a room or an apartment. Because of the exclusive entrance it seems to be an apartment and just to make sure she asks Joren if he has to share his shower and/or kitchen with neighbors. This appears not to be the case, so Charlotte is convinced that the house is an apartment. [3. Now, by rule 1, backward chaining needs to know whether the house is an apartment. This is derived from rule 5 by asking the questions associated with the preconditions. Since backward chaining has been making progress and does not propose any new logical groups, the agenda handier gives this reasoning method precedence over all others (that, in the meantime, are patiently waiting until their proposals are accepted).] Given the fact that Joren has just moved into the house, the term for rent reduction--three months after having signed the contract--has not yet expired, so formally reduction might be possible. Charlotte advises Joren not to wait too long before starting the necessary procedures, should reduction be possible. [4. Again by rule 1, backward chaining needs to derive YES as the value for (requirements formal satisfied). This is derived from rule 3 by asking the date of signing the contract and checking that the period between that date and the current date is less than three months. The procedure associated with (requirements formal
satisfied) will be triggered: "You must make your request for a rent reduction before three months have passed after signing the contract".]

At this moment Charlotte takes a look at case-law to check that the exception mentioned above (on short-termed contracts) does not apply. She finds a case in which a house was let for three years because the landlord was staying abroad during that time. The conclusion in this case was that the use of the house was not short-termed by nature. Now she is convinced that in the actual case, being similar to the case found in case-law, the use is not short-termed by nature either.

[5. At this point it might be instructive to reveal that any reasoning method can be of two possible types: intelligent or demon. Intelligent reasoning methods monitor their status themselves and submit reports (as described in section 4) to the agenda handler. Demons, on the other hand, will occasionally look at the blackboard at demon-specific intervals regulated by the agenda handler. If none of the intelligent reasoning methods can make further progress, demons are allowed to look at the blackboard more frequently. Intelligent reasoning methods are, by submitting the reports, able to direct the reasoning process. This behavior is not always appreciated. Consider the abstraction finder, the case-law reasoner (Walker, Zeinstra & van den Berg, 1989). Should it be allowed to direct the reasoning process, it would try to convince the agenda handler to move to those logical groups in which it can apply its cases. This approach, however, was not the one the expert took. Rather, the expert would consult legislation and legal doctrine and only use case-law to decide the range of certain legal terms. Translated to the PROLEXS architecture, this implies that forward and backward chaining, both operating on rule-based legislation and doctrine, are considered intelligent while the abstraction finder becomes a demon.
As an aside, let it be mentioned that any reasoning method can change type transparently to the shell. The agenda handler now selects the abstraction finder, the case-law reasoner. The case-law reasoner compares the current fact situation as it appears on the blackboard with active cases in the case-law knowledge group. Active cases are those cases that are inside an active logical group. The abstraction finder uses a so-called abstraction network to determine which facts can be abstracted (X and Y1 in Figure 4) in the first place. If these facts are currently on the blackboard, a list is made of associated cases (C1 to C6) which are active (case 1 in the appendix might be illustrative). The cases on this list are compared facet by facet with the current fact situation as it is written on the blackboard. Every facet which is satisfied with respect to the blackboard is awarded an associated weight. If the case in the case base sufficiently resembles the current fact situation, indicated by the accumulated weights exceeding the threshold associated with the case, the abstraction is made. More information on case-based reasoning is given in section 6.1.3 on the current PROLEXS approach, and in section 7, which discusses the neural network techniques for weight assignment, case selection, and abstraction finding that are about to be incorporated in PROLEXS. This example, however, describes the current implementation. Case 1 (see appendix) matches, since the weights of the satisfied facets ((landlord residence distance-from-tenant)=60, (contract length>= 00/00/05)) accumulate to 110 points, which is above the threshold. Case 1 being satisfied, an abstraction is made from the fact (contract status fixed-period) to
(house usage short-termed-by-nature NO). The abstraction is written on the blackboard.]

[6. The agenda handler selects forward chaining. This selection is justified by the fact that forward chaining has no requests for new knowledge, which is considered a positive selection criterion. Forward chaining fires rule 6, thereby writing its conclusion (applicability exception NO) on the blackboard. The truth-maintenance system examines the possible conflict that arises since the fact (applicability exception NO) was already written on the blackboard as a default. Fortunately, both have the same value and further action is not necessary.]

Next, Charlotte tries to retrieve information about the quality of the house, which is necessary for calculating the so-called quality-index of the house. This index determines the limits of the reasonable rent. Should the rent paid by Joren be outside this range, an adjustment is obligatory. She can make a global estimation because she happens to know the street where the house is situated and the kind of houses that are found there. This assumes that the house is representative of the neighborhood. Using this quality-index, Charlotte estimates the maximum rent to be fl. 365. The actual rent (fl. 315) does not exceed this maximum, so a reason for reduction has not been found yet.

[7. Backward chaining, forced by rule 1 to derive the quality-index, now proposes the logical group quality-index. This logical group contains rule 7, which is able to compute an approximate quality-index as a function of the zip-code, provided that the house under consideration is average. Recall that the zip-code was entered by the operator as an initial fact, translating the address as submitted by Joren. Relating addresses directly to quality-indices is something PROLEXS is not able to do. Having received the proposal from backward chaining, the agenda handler must decide what to do.
As was described in section 4, this decision is not exclusively based on the reports of all reasoning methods, but will also take into account the expert-supplied strategy: meta-knowledge represented in the classification network. The network used in the example is shown in Figure 1 (the much larger network in Figure 7 was not used for this example session). In this case the network advises the agenda handler to check out the quality-index of the apartment, since this must preferably be done before maintenance deficiencies are examined. Such deficiencies generally need a lot of interpretation, and claims based on them are therefore prone to be successfully countered by the landlord. Therefore the expert's strategy suggests postponing the activation of the logical group maintenance deficiencies in favor of
quality-index. Incidentally, if none of the reasoning methods agrees with the suggested strategy kept in the classification network, this strategy is temporarily abandoned. It only guides the reasoning process along the lines an expert presumably would take. The control mechanism within the agenda handler is a heuristic, not an algorithm, and is therefore not easily described in simple rules. Nevertheless, it boils down more or less to the following: the agenda handler keeps selecting successful reasoning methods unless they:
(1) have a request for new knowledge nobody else needs (item 2 on the report as mentioned in section 4); or
(2) indicate that continuing without switching to other reasoning methods would become too inefficient (item 3 on the report). This situation might occur for instance if backward chaining is confronted with an expression with many unknowns. Instead of creating a huge goal tree for all possible values that would result in the desired outcome of the expression, it might be better to wait and see whether another reasoning method independently writes previously unknown values on the blackboard. Note as an aside that this is not too unlikely to happen, since all reasoning methods are kept focused on the same small part of active knowledge (i.e. the currently activated logical groups).
If the agenda handler decides to temporarily halt a reasoning method, it looks for another that:
(3) is able to continue without allocation of new knowledge (item 2 on the report); and
(4) agrees with the expert's suggested strategy in the classification network.
If no reasoning method satisfies all rules, rule 4 is the first to be ignored. If still no candidates become available (there may be no request for new logical groups at all), rule 3 is also ignored and the reasoning method with the smallest expected workload (report item 3) and successful status (report item 1) is selected. If none of the reasoning methods have a successful status, all demons are released. When they have finished and the blackboard has been changed, the reasoning methods are again polled since they might have switched status considering the new facts. If still none can be found, the reasoning process fails, and the agenda handler informs the user of its inability to satisfy the goals. (After that the agenda handler will invert the goals and try to reach the opposite of what the user wanted: if for instance boolean goal X cannot be assigned the desired value YES, it tries to derive NO. If the latter attempt succeeds, useful information has been gained.)
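The fallback order in which rules 3 and 4 are relaxed can be sketched in a few lines of Python. This is only an illustration of the heuristic as described here, not the PROLEXS implementation: the `Report` fields, the function `select_next`, and the tie-breaking by smallest workload in the first two passes are all assumptions, and the release of demons and the inversion of goals are left to the caller.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Report:
    """Hypothetical report a reasoning method submits to the agenda handler."""
    name: str
    successful: bool            # report item 1: progress evaluation
    needs_new_knowledge: bool   # report item 2: request for a new logical group
    workload: int               # report item 3: estimated remaining work
    agrees_with_strategy: bool  # assumed to be precomputed from the classification network


def select_next(reports: List[Report]) -> Optional[str]:
    """Pick the next reasoning method, relaxing rule 4 first and rule 3 second."""
    successful = [r for r in reports if r.successful]
    passes = [
        lambda r: not r.needs_new_knowledge and r.agrees_with_strategy,  # rules 3 and 4
        lambda r: not r.needs_new_knowledge,                             # rule 4 ignored
        lambda r: True,                                                  # rule 3 ignored as well
    ]
    for keep in passes:
        candidates = [r for r in successful if keep(r)]
        if candidates:
            # Assumption: ties are broken by the smallest expected workload.
            return min(candidates, key=lambda r: r.workload).name
    return None  # nobody is successful: the caller would release the demons


# Situation comparable to step [7]: backward chaining needs a new logical
# group, forward chaining can continue on the active knowledge.
reports = [
    Report("backward chaining", True, True, 5, True),
    Report("forward chaining", True, False, 2, False),
]
chosen = select_next(reports)
```

With these hypothetical reports no method satisfies both rules, so rule 4 is dropped and forward chaining is chosen, which keeps the focal point where it is.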
To get back to the example at hand, where backward chaining and the classification network agree on the logical group (quality-index) to be activated next (according to Figure 1), the backward chaining proposal is accepted. Using the now active rule 7, backward chaining is able to approximate the quality-index, since (house average-state) defaults to YES. Backward chaining cannot proceed without allocation of new knowledge, which would shift the focal point of PROLEXS. However, new facts can still be derived using the currently active knowledge. The agenda handler is aware of this since not all reasoning methods have indicated that no further progress can be made. Rather than shifting the focal point of the PROLEXS inference engine, which is probably not what a human expert would do and which is confusing to the user, the agenda handler will select forward chaining. Forward chaining will fire rule 1: rent reduction seems to be impossible along these lines.]

Joren, however, remarks that the house in question is exceptionally old and small. Charlotte therefore decides to calculate the exact quality-index by having the appropriate list, containing all the facets which determine the quality-index, filled in by her client. The maximum rent which is found now, fl. 320, does not justify an appeal for rent reduction either.

[8. Temporarily interrupting the reasoning process, the fact (house average-state YES) is removed from the blackboard by the operator (at the instigation of the
client). The operator is always allowed to manipulate the current fact situation to reconsider previous answers, add new facts, or create WHAT-IF situations. The truth-maintenance system will now remove the fact (quality-index) as well, since this fact was derived by rule 7 assuming that (house average-state) was true, which is no longer valid. In addition, rule 1, having used (quality-index) to deduce that (rent reduction) was not possible, will now be removed from a list of used rules, and be reinstated as a promising candidate to deduce the possibility of a rent reduction. It only needs to acquire
315.]

Another possibility to accomplish reduction of rent in Dutch law is by invoking maintenance deficiencies of the house. If there are certain severe problems with the house, the rent can be cut temporarily to the minimum rent limit. This minimum rent limit is determined by the quality-index, mentioned above, which apart from that is independent of maintenance problems. Severe maintenance deficiencies are enumerated by law, so Charlotte can easily check whether or not the listed criteria are met in this case. If some of these problem descriptions apply to Joren's house, reduction of the rent can be achieved on this basis.

[9. Backward chaining, using the active rent-reduction logical group, has another possibility to reduce the rent by using rule 2, initially rejected in favor of rule 1. To derive a useful value for (rent reduction) the fact (maintenance deficiencies severe YES) must be derived. Rule 9 can derive a value for this fact by asking more details about the (possible) maintenance problems. Backward chaining proposes the logical group maintenance deficiencies, again supported by the classification network, which links rent reduction to maintenance deficiencies.]

Charlotte, by questioning Joren, finds out that a number of deficiencies are
present, but none of those appears to be mentioned in the enumeration. A rent reduction therefore is not possible. Nevertheless, she advises Joren to notify the landlord of the deficiencies so that they can be repaired as soon as possible.

[10. The now active rule 9 cannot derive a positive answer for the fact (maintenance deficiencies severe) and as a consequence the goal (rent reduction) cannot be derived. On the basis of the answers to the questions asked by rule 9, however, forward chaining is able to fire the active rule 10 within the logical group maintenance-deficiencies, which triggers a procedure telling that maintenance problems of any kind justify notifying the landlord, who should take care of them. Procedures are one type of so-called events. Events are triggered by the appearance of certain facts on the blackboard. The instant the fact (landlord notification) is written on the blackboard, the corresponding event is executed. In the case of a procedure, only text is displayed, but another event-type might initiate a questionnaire or calculate the value of a very complex expression that would be too inefficient to express in rules.]
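The event mechanism of step [10] can be illustrated with a small Python sketch. It is a toy model under stated assumptions: the class `Blackboard`, its methods `on` and `write`, the helper `procedure`, and the displayed text are all hypothetical names chosen for this illustration, and "displaying" is modelled by appending to a list.

```python
from typing import Callable, Dict, List, Tuple

Fact = Tuple[str, ...]


class Blackboard:
    """Toy blackboard that runs registered events as soon as a fact appears."""

    def __init__(self) -> None:
        self.facts: Dict[Fact, str] = {}
        self.events: Dict[Fact, List[Callable[[str], None]]] = {}

    def on(self, fact: Fact, event: Callable[[str], None]) -> None:
        """Register an event that is triggered by the appearance of `fact`."""
        self.events.setdefault(fact, []).append(event)

    def write(self, fact: Fact, value: str) -> None:
        """Write a fact and execute every event associated with it."""
        self.facts[fact] = value
        for event in self.events.get(fact, []):
            event(value)


def procedure(text: str, display: List[str]) -> Callable[[str], None]:
    """A 'procedure' event only displays text (here: appends it to a list)."""
    return lambda _value: display.append(text)


display: List[str] = []
board = Blackboard()
board.on(("landlord", "notification"),
         procedure("Notify the landlord of the maintenance problems.", display))
board.write(("maintenance", "deficiencies", "severe"), "NO")  # no event registered
board.write(("landlord", "notification"), "YES")              # triggers the procedure
```

Other event types, such as a questionnaire or an expensive calculation, would simply be different callables registered in the same way.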
6. The knowledge model

Many expert-systems, dealing only with formal domain knowledge, probably accompanied by some rule-coded expert directives, cannot explore the more heuristic approaches human experts often take. We claim that one must add a vertical component to the knowledge model, allowing for meta-knowledge, to incorporate judgemental and strategic knowledge in an expert-system shell. To support this claim the PROLEXS knowledge model is discussed, in which knowledge is represented at different levels in a hierarchical order. Each level captures a different aspect of the domain knowledge. The first layer consists of so-called knowledge groups. Knowledge groups capture object-level knowledge. Normally, a distinguishable knowledge source as described in section 3.2 corresponds to a knowledge group in the knowledge model. This correspondence is not necessarily one to one. A knowledge source like legislation may use two knowledge groups: one filled with production rules, the other with legal definitions represented differently, for instance in frames. The current implementation, however, uses one knowledge group for each of the four knowledge sources. The second layer is formed by logical groups. The main purpose of logical groups is to establish a notion of relevancy. They structure the underlying knowledge group level. Finally, the strategic layer is added in the form of the classification network designed on top of the logical groups. The classification network represents meta-knowledge by supplying the shell with a kind of knowledge flow-chart.

In the subsequent sections the following abbreviations are frequently used:
RM Reasoning method. This term is used interchangeably with "reasoning mechanism" or "inference method". An example RM is backward chaining.
IE Inference engine: the pool of reasoning methods. This term is used if the article refers to all RMs available to the system. In reality, however, all RMs are kept strictly separate and are dedicated to their own knowledge group.
KRL Knowledge representation language.
KU Knowledge unit. Since PROLEXS uses several KRLs, it is sometimes convenient to refer to a unit in any KRL. A knowledge unit in the current landlord-tenant implementation can either be a rule or a frame. The range of possible KRLs is, however, about to be expanded (see section 7 about case-law representation using neural networks).
6.1. OBJECT-LEVEL KNOWLEDGE: KNOWLEDGE GROUPS
The PROLEXS shell treats knowledge groups in an object-oriented way. Each knowledge group has its own knowledge representation language (KRL), invisible to others, as well as its own dedicated reasoning techniques. In addition, but with respect to the subject of this paper less important, a knowledge group possesses its own editor (depending on representation) and knowledge acquisition and formalization methodology (Walker et al., 1989). At present four knowledge groups are distinguished in the legal domain: legislation, case-law, legal doctrine and expert knowledge. Together they constitute the formal domain knowledge. The expert knowledge group only contains the low-level rules-of-thumb. Expert-supplied meta-knowledge (e.g. a strategy) is not represented at this level, but higher up in the knowledge model hierarchy. The legislation knowledge group is a production system. It uses backward and forward chaining to operate the rules. Case-law, on the other hand, is found in the case-law knowledge group. This group employs frames and a dedicated reasoning mechanism called the abstraction finder to reason with cases. The abstraction finder works by detecting analogies between the case at hand and the cases stored in the case-law knowledge group. The reasons to use a number of distinct knowledge representation languages (KRLs) are manifold: a powerful reasoning method (RM) like for instance backward chaining can only operate on production systems. The choice of a particular RM implies choosing its associated KRL. Since one is not absolutely free to use any RM, because some RMs may be better at mimicking expert-like behavior in certain situations than others, this will result in multiple KRLs. It is also intuitively clear that some KRLs closely resemble the nature of a knowledge source (for instance a case description might use a frame-like structure).
This facilitates the knowledge acquisition since the KRL will be more familiar to the expert, whose knowledge must be transferred to the system.
6.1.1. Knowledge group communication

Since the knowledge groups are independent of one another, sharing information only indirectly using a blackboard (Figure 2), it is possible to tune a KRL or dedicated reasoning mechanism without affecting the other components of the shell. One can add object-level knowledge at will, possibly using new KRLs if the existing ones cannot properly represent the knowledge. The process of adding new knowledge groups is trivial because PROLEXS has expanded the notion of flexibility to the extent that the KRL and the reasoning techniques are both properties of the knowledge, rather than being part of the shell. Extended flexibility has been discussed in section 4 and Figure 3 illustrates this notion.
FIGURE 2. Combining knowledge across knowledge groups. (Diagram labels: Case-law, Legislation, RM 1, RM 3.)
At this point the question may arise as to what exactly is left for the shell itself. What is the function of the shell, if the KRL and the IE are part of the knowledge? First of all the task of the shell does not become easier if it cannot know the reasoning methods in advance. In PROLEXS, knowledge groups and their associated KRL and reasoning mechanisms can be attached to the shell without
FIGURE 3. Object-oriented knowledge groups. (Each knowledge group bundles its own editor, knowledge acquisition methodology, inference engine and knowledge base: a rule editor, backward and forward chaining and a rulebase for legislation, doctrine, and expertise; a case editor, the abstraction finder and a casebase for case-law.)
having to modify the shell. The PROLEXS shell therefore is independent of any KRL and IE provided that the following restrictions are met:
(1) any reasoning method (RM) within the IE must read from and write to a blackboard in a shell-specified format;
(2) any RM can only use active knowledge;
(3) any RM can only be operated by the agenda handler; and
(4) any RM must submit a uniformly formatted report to the agenda-handler, which includes an estimation of remaining work, a request for new knowledge, and a progress evaluation.
The first constraint takes care of interrelating the knowledge across knowledge groups. If a fact has been deduced by applying rules within the legislation knowledge group, it is written on the blackboard to be used by other RMs. If a rule exists in knowledge group 1 saying that IF A AND B THEN C, then it might very well have to wait for another knowledge group (more precisely, another RM) to derive A and B and to write them on the blackboard, before it can fire and write C on the blackboard. The general format not only allows for interrelating knowledge across knowledge groups, but makes it possible to have a KRL-independent truth-maintenance system (TMS) and explain facility. Each fact written on the blackboard is associated with a record indicating which KU sent this information and which KUs have used this information. The explain facility uses this information to answer questions like WHY X (in a way beyond the scope of this article) and the TMS will have no difficulty finding dependencies among the facts on the blackboard. These dependencies of course are crucial to truth-maintenance since the TMS must know which facts have relied on fact X if fact X is to be removed. Restriction 2 ensures that the shell can direct the reasoning process by "activating" chunks of knowledge (as was clearly demonstrated in section 5). This so-called activation process will be explained in more detail in section 6.2.
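How such per-fact records support truth maintenance can be sketched as follows. The Python below is a minimal illustration, not the PROLEXS data structure: `FactRecord`, `Blackboard.write` and `Blackboard.retract` are hypothetical names, facts are plain strings, and each KU is assumed to write a single conclusion. The scenario mirrors step [8] of the example dialogue.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class FactRecord:
    value: str
    parent: Optional[str] = None                       # KU that wrote the fact
    children: List[str] = field(default_factory=list)  # KUs that have used the fact


class Blackboard:
    def __init__(self) -> None:
        self.records: Dict[str, FactRecord] = {}

    def write(self, fact: str, value: str, parent: Optional[str] = None,
              used: Iterable[str] = ()) -> None:
        """Write a fact and note on every premise that `parent` has used it."""
        for premise in used:
            self.records[premise].children.append(parent)
        self.records[fact] = FactRecord(value, parent)

    def retract(self, fact: str) -> List[str]:
        """Remove a fact together with every fact that relied on it."""
        record = self.records.pop(fact, None)
        if record is None:
            return []
        removed = [fact]
        for ku in record.children:
            dependents = [f for f, r in self.records.items() if r.parent == ku]
            for dependent in dependents:
                removed += self.retract(dependent)
        return removed


board = Blackboard()
board.write("house average-state", "YES")  # default, no parent KU
board.write("quality-index", "approximate", parent="rule 7",
            used=["house average-state"])
board.write("rent reduction", "NO", parent="rule 1", used=["quality-index"])
removed = board.retract("house average-state")
```

Because the records are expressed in terms of facts and KU identifiers only, the retraction never needs to inspect a rule or a frame, which is what keeps this part of the shell KRL-independent.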
Restriction 3 eliminates anarchy. Not knowing about each other, all RMs would, once started, try to go on forever, searching for whatever goals they are after. The agenda-handler keeps track of the status of each RM and constantly switches from one RM to another in an attempt to get to the goals as efficiently and in as expert-like a manner (see section 6.3 about expert strategies) as possible. The goals are reached if the desired facts have been written on the blackboard by some RM. The final constraint ensures that the agenda-handler gets full control over the reasoning process without knowing anything about the various RMs. All the agenda-handler needs to know is in the report. A description of the way these reports are interpreted can be found in section 5 at [7]. It is certainly not irrelevant where the KRL and RMs reside. As has been shown above, the administration involved with not knowing the RMs in advance is significant. So what is the gain? In our view there are several:
(1) Knowledge groups can be developed independently from both the shell and from one another. This opens the way to an extensive library of knowledge groups (not necessarily filled with knowledge, but with their own particular KRL, IE, KRL
dependent editors, knowledge acquisition and formalization methodologies etc.), to choose from for any new legal domain. Even third parties, having no knowledge about the PROLEXS shell at all could develop knowledge groups just by keeping in line with the above guidelines. (2) Knowledge groups can be optimally suited to a certain knowledge source; adjusting KRL or RMs will not affect the shell so it can be dedicated to the nature of the knowledge itself. (3) The maintainability greatly improves if knowledge from one group can be updated independently from others. Of course, the knowledge engineer has to take care of the fact that updating certain knowledge may affect knowledge in other groups. In the legal domain the updating of, for instance, a statute will affect the case law related to that statute. (It falls beyond the scope of this article to go into details about distributed consistency enforcing, the technique which PROLEXS provides to partly overcome this problem. It suffices to say that each knowledge group contains the KRL-specific part of the consistency enforcer (CE) and is responsible for answering queries from the non-distributed CE, which is part of the shell and therefore KRL independent. These queries are used to check which facts can be derived from a given list of facts with respect to the particular KRL of the knowledge group. The main CE combines this information to find inconsistencies across knowledge groups. Distributing the CE is necessary since the independent knowledge groups may be internally consistent, only to become inconsistent when used in combination with other groups.) It is admitted, however, that these observations are primarily interesting from a design point of view and have no real consequences with respect to legal reasoning in general. 
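To make the guidelines for independently developed knowledge groups concrete, the four restrictions of this section could be captured in an interface such as the one sketched below. This is a hedged illustration in Python: `RMReport`, `ReasoningMethod`, `TinyRuleRM`, the identifier "rule-abc" and the group name are hypothetical, the blackboard is reduced to a dictionary, and the toy knowledge group holds only the rule IF A AND B THEN C mentioned earlier.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class RMReport:
    """Uniform report (restriction 4); the field names are assumptions."""
    remaining_work: int             # estimation of remaining work
    requested_group: Optional[str]  # request for new knowledge, if any
    progressing: bool               # progress evaluation


class ReasoningMethod(ABC):
    """What the shell needs from an RM; only the agenda handler calls step()."""

    @abstractmethod
    def step(self, blackboard: Dict[str, str], active_units: Set[str]) -> RMReport:
        """Read from and write to the blackboard, using active knowledge only."""


class TinyRuleRM(ReasoningMethod):
    """Toy knowledge group holding the single rule IF A AND B THEN C."""

    def step(self, blackboard: Dict[str, str], active_units: Set[str]) -> RMReport:
        if "rule-abc" not in active_units:  # restriction 2: rule is not active
            return RMReport(0, "group-with-rule-abc", False)
        if "C" in blackboard:               # nothing left to do
            return RMReport(0, None, False)
        if blackboard.get("A") == "YES" and blackboard.get("B") == "YES":
            blackboard["C"] = "YES"         # restriction 1: write the conclusion
            return RMReport(0, None, True)
        return RMReport(1, None, False)     # wait for another RM to derive A and B


blackboard = {"A": "YES"}
rm = TinyRuleRM()
first = rm.step(blackboard, {"rule-abc"})   # B is still unknown
blackboard["B"] = "YES"                     # written by some other RM
second = rm.step(blackboard, {"rule-abc"})  # the rule can now fire
```

Nothing in `ReasoningMethod` refers to rules or frames, so a third party could supply a differently represented knowledge group behind the same interface.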
Having discussed the way in which the knowledge groups at the lowest level of the knowledge model interact, it seems plausible to discuss the actual choices of KRL and RMs of the knowledge groups as they are currently implemented in the landlord-tenant law application. The first and most sizable knowledge group is a conventional rule-based system. To ensure a certain degree of completeness it will be briefly discussed, but more emphasis will be placed on the case-based knowledge group. Rule-based reasoning has been the topic of many papers in the field and the PROLEXS variant is more extensively treated in Walker et al. (1989) and Schrickx (1990).
6.1.2. Rule-based knowledge representation: legislation, doctrine, expertise

Part of the current PROLEXS domain, the Dutch landlord-tenant law, can be represented in rules elegantly. A PROLEXS rule looks like:

IF
    relop1 fact1 value/fact method1
    relop2 fact2 value/fact method2
THEN
    fact := EXPR

where the expression EXPR may contain boolean and relational operators, constants (strings, integers, floats, dates and periods), and those variables mentioned in the condition part (this ensures that the expression can be evaluated with respect to the currently available facts). An example of an actual PROLEXS rule is shown below. The same rule is used in the example dialogue described in section 5.

IF
    (house usage living) YES ASK
    (rent periodical) void ASK
    (applicability exception) void DERIVE
THEN
    (landlord-tenant-law applicable) := (house usage living YES) AND (rent periodical YES) AND (applicability exception NO)
FI

The condition consists of a relational operator followed by a value or another fact. Examples are (>5) or (=(contract validity date begin)). An exception is the VOID condition, which checks merely for the appearance of the fact on the blackboard. The method is either ASK, DERIVE, RESUME or LOOKUP. RESUME indicates the method where the user is prompted for an input, but given the opportunity to have the value derived by the system. LOOKUP will query an external database for a value, for instance the maximum rent given the quality of the apartment. A rule compiler is used to convert rules like the one above to "simple" rules without boolean expressions in the action part of the rule. For each possible combination of facts that produces a TRUE value for the expression, a separate rule is created. The rule above for instance uses two free boolean variables in the expression, since both (rent periodical) and (applicability exception) are declared VOID. This rule is therefore decomposed into four simple rules (one in case (rent periodical YES) AND (applicability exception YES), etc.), each assigning either YES or NO to (landlord-tenant-law applicable). The exact process is beyond the scope of this article, but it multiplies the number of "complex" rules by an observed average factor of three. These simple rules can be used much more efficiently.
The knowledge engineer and the user, however, are completely unaware of this process, since each time the system refers to a compiled rule (for instance while explaining its reasoning) the uncompiled source will be shown. Two reasoning methods use the above described KRL: forward chaining (FC) and backward chaining (BC). They do not know about each other and are independently controlled by the agenda handler (see sections 4 and 5). The facts referenced inside the rule above are in the same format as those that appear on the blackboard. This is not a prerequisite though; if the format of the facts as used internally by a knowledge group differs from the one on the blackboard, the knowledge group is required to include an interface. Using the same format eliminates that need though, and effort has been put into specifying a general fact format that can be used in numerous KRLs [such a general fact definition must for instance include a time stamp, probability factor, priority, parent (knowledge unit which has written the fact
on the blackboard), children (knowledge units that have used the fact), etc]. The above KRL is suitable to represent legislation and expert supplied rules of thumb (see rules in appendix). Some legal doctrine (opinions of leading legal authors, governmental directives) can also be translated to rules. This KRL fails however in the example-based case-law knowledge group.
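The decomposition performed by the rule compiler in this section can be illustrated with a short Python sketch. Since the exact process is not given in this article, the code only shows the enumeration idea for the example rule: `compile_rule` and its argument names are hypothetical, facts are plain strings, and one simple rule is emitted for every combination of the free boolean facts.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Facts = Dict[str, str]
SimpleRule = Tuple[Facts, Tuple[str, str]]  # (conditions, (fact, value))


def compile_rule(fixed: Facts, free: List[str], target: str,
                 expr: Callable[[Facts], bool]) -> List[SimpleRule]:
    """Expand a rule with a boolean expression into expression-free rules."""
    simple_rules = []
    for values in product(("YES", "NO"), repeat=len(free)):
        conditions = dict(fixed)
        conditions.update(zip(free, values))
        conclusion = "YES" if expr(conditions) else "NO"
        simple_rules.append((conditions, (target, conclusion)))
    return simple_rules


rules = compile_rule(
    fixed={"house usage living": "YES"},
    free=["rent periodical", "applicability exception"],  # declared VOID
    target="landlord-tenant-law applicable",
    expr=lambda f: (f["house usage living"] == "YES"
                    and f["rent periodical"] == "YES"
                    and f["applicability exception"] == "NO"),
)
conclusions = [value for _conditions, (_fact, value) in rules]
```

For the example rule this yields four simple rules, of which only the combination (rent periodical YES), (applicability exception NO) concludes YES.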
6.1.3. Frame-based knowledge representation: case-law

Case-based reasoning is used in PROLEXS to deal with open-textured concepts. Rules like "if maintenance deficiencies are present then the quality of the house should be considered poor" can only be used by experts (either human or artificial) if they have the experience to evaluate the applicability of open-textured concepts like maintenance deficiencies or poor quality with respect to a given case. Multi-interpretable rules like the above are frequently used in legislation. Sometimes words like "severe" or "significant" are added to make things worse. The PROLEXS abstraction finder (AF) uses hierarchically organized frame-like data structures to represent relations between facts and abstractions. A fact can be abstracted to another (more abstract) frame via a number of cases. Within the case-law knowledge group it is possible to represent that lack of heating constitutes a maintenance deficiency if the case at hand "looks like" at least one of the cases to which the term maintenance deficiency has been proven to apply. The heuristic that searches for look-alikes in the case base will be discussed later. Figure 4 shows part of this knowledge representation scheme. Not only can fact X be abstracted to the more generally applicable fact Y1 via cases C1 and C2, fact Y1 in its turn can be treated as a case of Z if the current fact situation matches C3, C4 or C5. The abstraction network in Figure 4 can also be used to abstract X to Y2, if C6 resembles C, the current case, closely enough. Currently, the abstraction mechanism is fact driven: if X appears on the blackboard together with facts that support both case C1 and C5, then eventually Z will be written on the blackboard. The reverse process, in which the AF is requested to look for cases that support Y1, has not been implemented.
The rationale behind this asymmetrical design is that for Y1 to be derived, the AF should check the resemblance of cases C1 and C2 to C, the case at hand. If C1 and C2 are not analogous to C at all, this implies a large number of abundantly irrelevant questions. The present solution to finding relevant cases is matching a number of conditions against the blackboard--for an example see section 5 at [5]. Each case in the
FIGURE 4. Hierarchical abstraction network. (Surviving diagram labels: C1, C2; C6.)

FIGURE 5. A simple PROLEXS case-frame. (Conditions shown: tenant age > 65; room temperature < 13°; season name = "winter"; etc. Weights shown: 10, 20, 15. Threshold: N.)
case-law knowledge group is stored with a set of such conditions, and each condition has a fixed weight, reflecting the degree to which satisfying the condition contributes to the relevancy of the stored case. The blackboard contains information about the current case: user submitted information, answers to questions, and derived facts. For each satisfied condition the case is awarded with the weight associated with the condition (which might be negative). If the sum of the weights of the satisfied conditions exceeds a case dependent threshold, the case is considered to match. Figure 5 shows an example of a PROLEXS frame, describing cases in which elderly tenants have trouble heating their place in the winter. A probable abstraction might be to YI -- maintenance deficiency. Z in Figure 4 might be severe maintenance deficiencies (which the landlord is obliged to eliminate instantly). The above approach however has serious drawbacks. For one thing, the expressive power of the representation of cases by weighted conditions and a threshold is limited. One can easily imagine that in a case like the above mentioned example, an age of 70 combined with a room temperature of 15~ should yield the same total weight as an age of 66 and a temperature of 12~ This could be solved by adding more conditions about age and temperature, but still the situation would be that the contribution of for example age to the relevancy of the case would remain constant as the age approaches some critical value, and then suddenly jump. An often better solution is to allow (weighted) scalar variables as well as boolean conditions in the representation of a case. In the example, age and temperature could be such scalars. Even so, only a linear combination of age and temperature can be expressed. It is conceivable that actually a very high age combined with an extremely low temperature should contribute more than the sum of the contributions of this extremities occurring separately. 
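To make the difference between boolean conditions and scalar variables concrete, the following Python sketch scores the two example situations from the text both ways. The conditions follow Figure 5, but every weight, the threshold and the function names are assumptions chosen for illustration; they are not taken from the PROLEXS knowledge base.

```python
from typing import Any, Callable, Dict, List, Tuple

Facts = Dict[str, Any]
Condition = Tuple[Callable[[Facts], bool], int]  # (test, weight)

# Boolean case-frame: conditions as in Figure 5, weights and threshold assumed.
BOOLEAN_FRAME: List[Condition] = [
    (lambda f: f["age"] > 65, 10),
    (lambda f: f["temperature"] < 13, 20),
    (lambda f: f["season"] == "winter", 15),
]
THRESHOLD = 30


def boolean_score(facts: Facts) -> int:
    """Sum the weights of all satisfied conditions."""
    return sum(weight for test, weight in BOOLEAN_FRAME if test(facts))


def matches(facts: Facts) -> bool:
    """A case matches when its score exceeds the threshold."""
    return boolean_score(facts) > THRESHOLD


def scalar_score(facts: Facts) -> int:
    """Scalar variant: age and temperature contribute in proportion to
    their value (assumed weights: +3 per year, -4 per degree)."""
    season_bonus = 15 if facts["season"] == "winter" else 0
    return 3 * facts["age"] - 4 * facts["temperature"] + season_bonus


case_a = {"age": 70, "temperature": 15, "season": "winter"}
case_b = {"age": 66, "temperature": 12, "season": "winter"}
```

With these assumed numbers the boolean frame scores the two cases differently (only `case_b` satisfies the temperature condition), while the scalar variant gives them the same score. The scalar score is still a linear combination, so it cannot express the mutual amplification of extreme values mentioned above.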
Another problem is that of credit assignment, which involves the difficulties of assigning weights to conditions (Rissland & Ashley, 1988). This poses a dilemma: the problem grows with the expressive power of the formalism. Enhancing the expressive power results in a more serious credit assignment problem, while attempting to keep credit assignment manageable leads to an expressively weaker formalism. The credit assignment problem arises from the fact that an expert cannot give quantitative formulas to decide when cases should match. The expert can only give examples of easy matches, hard matches, near misses and total failures, with qualitative explanations. Using this information the knowledge engineer has to tinker with weights and thresholds to arrive at a legally acceptable solution. To overcome the dilemma, a heuristic is needed for credit assignment in strong formalisms that can be carried out automatically. Since one can perceive PROLEXS case-frames as small feed-forward neural networks (perceptrons), the algorithm to train such perceptrons can be used. However, the reason why perceptrons were rejected as AI tools was their insufficient expressive power. The recent revival of interest in neural networks (Rumelhart & McClelland, 1986)† resulted from the discovery of good heuristic methods to train expressively strong networks, and these are the methods now investigated for use in PROLEXS. An introduction to the current PROLEXS approach to case-based reasoning can be found in section 7.
A final problem with the current case-based reasoner is that the system can either use very specific conditions (e.g. the number of rooms equals 6) that seldom apply, or very general conditions (e.g. the house is big) that unfortunately produce precisely those problems of open-texturedness they were supposed to solve. The PROLEXS methodology dictates that the expert should include instructive hypotheticals. Those hypothetical cases are general enough to apply to a large category of factual situations, yet specific enough to reduce the possible "interpretation space" to a minimum.

6.2. RELEVANCY: LOGICAL GROUPS
One level above the knowledge groups, PROLEXS uses logical groups. A logical group contains knowledge from various sources related to a specific topic (Figure 6). The different topics are distinguished by the domain expert. Since this knowledge can come from different knowledge groups, the term knowledge unit is used to refer to a piece of knowledge in any KRL. In the present knowledge group configuration a knowledge unit can either be a rule (e.g. legislation, legal doctrine) or a frame (case-law).
FIGURE 6. A logical group. [Diagram not reproduced; recoverable labels: Legislation knowledge group, Case-law knowledge group.]
† A thorough treatise on neural networks and their possibilities can be found here.
PROLEXS, after loading for example the landlord-tenant knowledge base, distinguishes logical groups like maintenance, rent reduction or service costs. A logical group is filled with knowledge units carefully extracted from the object-level knowledge groups. That is, the logical group maintenance might use a couple of law clauses from the legislation knowledge group, some cases from the case-law knowledge group, and some rules-of-thumb, distilled from experience and found in the expert knowledge group. A logical group is therefore representation independent. It uses an arbitrary number of knowledge units, and therefore possibly different knowledge representations, which in turn imply several dedicated reasoning methods. A logical group does not know about reasoning techniques. A reasoning mechanism is only associated with a certain knowledge representation. If a logical group activates a knowledge unit, the corresponding reasoning mechanism is automatically triggered.
A logical group can be in one of two states: active or inactive. If a logical group is active then all of its knowledge units are active. If the logical group maintenance is active, then a number of law clauses referring to maintenance from the legislation knowledge group are active, as well as some maintenance-related cases within the case-law knowledge group and a few rules-of-thumb mentioning maintenance from the expert knowledge group. Only active knowledge units can be used by a reasoning method. This mechanism enables PROLEXS to focus on a particular topic. This feature not only serves control purposes, however; it also plays a crucial role in simulating expert behavior, as can be seen in the session trace in section 5. The decision which logical group to activate is made at a level above the logical group layer; the precise process has been discussed earlier in sections 4 and 5.

6.3. STRATEGY: THE CLASSIFICATION NETWORK
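The relation between logical groups, knowledge units and reasoning mechanisms can be sketched as follows. This is a minimal Python illustration under assumptions: the class names, the example units and the label "rule interpreter" are hypothetical (only the abstraction finder is named in the text), and the real system may organize this quite differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Reasoning mechanisms are tied to representations, not to logical groups.
# "rule interpreter" is an assumed label for the rule-based reasoner.
REASONERS: Dict[str, str] = {
    "rule": "rule interpreter",
    "frame": "abstraction finder",
}


@dataclass
class KnowledgeUnit:
    name: str
    knowledge_group: str  # e.g. "legislation", "case-law", "expert"
    representation: str   # "rule" or "frame"
    active: bool = False


@dataclass
class LogicalGroup:
    topic: str
    units: List[KnowledgeUnit] = field(default_factory=list)

    def set_active(self, active: bool) -> None:
        """Activating a group activates all of its knowledge units."""
        for unit in self.units:
            unit.active = active


def usable(groups: List[LogicalGroup]) -> List[Tuple[str, str]]:
    """Return (unit, reasoning mechanism) pairs for active units only."""
    return [(u.name, REASONERS[u.representation])
            for g in groups for u in g.units if u.active]


maintenance = LogicalGroup("maintenance", [
    KnowledgeUnit("law clause on maintenance", "legislation", "rule"),
    KnowledgeUnit("heating case", "case-law", "frame"),
])
service_costs = LogicalGroup("service costs", [
    KnowledgeUnit("service cost rule-of-thumb", "expert", "rule"),
])

maintenance.set_active(True)
available = usable([maintenance, service_costs])
```

The logical group itself holds no reasoning code; the mechanism is looked up from the representation of each unit, which is one way to keep the group representation independent.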
The highest level in the knowledge representation hierarchy is formed by the classification network. For each domain a separate network is designed. The nodes within this network are formed by the logical groups. This network, part of which is shown in Figure 7, serves a number of purposes. First, it is used to classify a problem, hence the name classification network. Since each knowledge unit (a rule for instance) is indexed on the facts it contains, it is easy to find out which set of knowledge units can operate on a certain initial input (i.e. the description of the problem). This set of relevant knowledge units may reside within a number of different logical groups. The more relevant knowledge units a logical group contains, the more relevant the logical group is taken to be. In the classification phase, therefore, the activation flows upward from the initial facts, via the knowledge units, to a set of logical groups. The position of these logical groups within the classification network is the starting point of the inference process. From then on the activation process is reversed. Activation of a logical group is guided by a control mechanism (the agenda handler) that uses meta-knowledge from the classification network as one of its inputs. The activation subsequently reaches the knowledge units. This is the second and most important purpose of the network: supplying a knowledge flow-chart, which is used by the agenda handler to decide which topics to concentrate on. Section 5, at [7], demonstrates this function (using the much simpler network depicted in Figure 1, though).
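The classification phase described above, in which activation flows upward from the initial facts via the knowledge units to the logical groups, can be sketched as an index lookup followed by a count. The Python below is an illustration only: the function names are hypothetical, and the small unit table is a hand-made excerpt loosely based on rules 7, 9 and 10 of the appendix.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple

# unit name -> (logical group, facts the unit mentions); illustrative excerpt.
UNITS: Dict[str, Tuple[str, Set[str]]] = {
    "rule 7": ("quality-index", {"house average-state", "house zip-code"}),
    "rule 9": ("maintenance deficiencies",
               {"maintenance deficiency", "maintenance deficiency description"}),
    "rule 10": ("maintenance deficiencies", {"maintenance deficiency"}),
}


def build_index(units: Dict[str, Tuple[str, Set[str]]]) -> Dict[str, Set[str]]:
    """Index every knowledge unit on the facts it contains."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for name, (_group, facts) in units.items():
        for fact in facts:
            index[fact].add(name)
    return index


def classify(initial_facts: Set[str],
             units: Dict[str, Tuple[str, Set[str]]]) -> List[Tuple[str, int]]:
    """Rank logical groups by the number of relevant units they contain."""
    index = build_index(units)
    relevant: Set[str] = set()
    for fact in initial_facts:
        relevant |= index.get(fact, set())
    counts = Counter(units[name][0] for name in relevant)
    return counts.most_common()


ranking = classify({"maintenance deficiency", "house zip-code"}, UNITS)
```

Counting relevant units per group is the simplest reading of "the more relevant units, the more relevant the group"; how PROLEXS weighs or breaks ties between groups is not stated in the text.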
FIGURE 7. A classification network for the landlord-tenant law domain. [Diagram not reproduced; recoverable labels: Exceptions, Rent reduction; legend: Is-needed link, Is-obligatory link, Is-used link.]
The network topology is carefully designed in consultation with the domain expert. It is our experience that such a network is a convenient way to express both the classifying and the strategic knowledge of an expert. Since a connection between two logical groups implies a dependency, PROLEXS will probably first activate the logical group dealing with maintenance problems when confronted with a maintenance problem (e.g. a leaking roof). Once the problem has been satisfactorily classified as a maintenance problem, the logical group involved with contracts may be activated, since maintenance requirements are often specified in contracts; this is indicated by a link between the two groups in the network.
It is somewhat beyond the scope of this article to go into details about the different types of links that can be found in the classification network. Here it must suffice to mention that three types of links are distinguished, all of which appear in Figure 7. An Is-needed link from logical group X to Y means that (some of) the knowledge in group Y depends on (some of) the knowledge held in X (so that activation of Y should preferably be preceded by activation of X). An Is-used link, an instance of which connects rent reduction to rent raise, indicates that both groups are essentially independent but have a lot of facts in common (so that it may be efficient to activate both groups immediately after one another, since it is likely that much information can be deduced quickly because a lot of relevant data has been acquired earlier). The third type of link is called Is-obligatory and is used for instance between applicability of landlord-tenant law and service costs. It is the expert's way of telling PROLEXS that service costs calculations are only meaningful if the applicability of landlord-tenant law has been verified first.
The strategic layer in the form of the classification network models the expert's choice concerning the knowledge needed to deliver the strongest legal argument. Using this network PROLEXS has a fairly good understanding of its own knowledge organization. Dependencies among chunks of knowledge, high-level strategies and even lack of adequate knowledge can be properly modelled. Details about the various possibilities of creating links between logical groups can be found in (Schrickx, 1990).
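The three link types suggest different constraints on the order in which logical groups are activated. The Python sketch below shows one possible reading, not the agenda handler itself: the function names are hypothetical, the link type between maintenance and contracts is assumed (the text only says that a link exists), and a static topological order is a simplification of what is presumably a dynamic process. `graphlib` requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter
from typing import List, Set, Tuple

Link = Tuple[str, str, str]  # (from group, to group, link type)

LINKS: List[Link] = [
    ("applicability of landlord-tenant law", "service costs", "is-obligatory"),
    ("maintenance", "contracts", "is-needed"),  # link type assumed
    ("rent reduction", "rent raise", "is-used"),
]


def may_activate(group: str, verified: Set[str], links: List[Link]) -> bool:
    """Is-obligatory: a group is blocked until its source group is verified."""
    return all(src in verified
               for src, dst, kind in links
               if dst == group and kind == "is-obligatory")


def preferred_order(links: List[Link]) -> List[str]:
    """Is-needed and Is-obligatory: activate the source before the target."""
    sorter: TopologicalSorter = TopologicalSorter()
    for src, dst, kind in links:
        sorter.add(src)
        if kind in ("is-needed", "is-obligatory"):
            sorter.add(dst, src)  # dst depends on src
        else:
            sorter.add(dst)       # Is-used imposes no ordering
    return list(sorter.static_order())


def companions(group: str, links: List[Link]) -> Set[str]:
    """Is-used: groups worth activating right after `group`."""
    found: Set[str] = set()
    for src, dst, kind in links:
        if kind == "is-used":
            if src == group:
                found.add(dst)
            if dst == group:
                found.add(src)
    return found


order = preferred_order(LINKS)
```

Here Is-obligatory is treated as a hard precondition and Is-needed as a preference that merely shapes the order, which follows the wording "only meaningful if" versus "should preferably be preceded by".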
7. Current research
Current research focuses on the use of neural networks in case selection, case abstraction and credit assignment. Experimental results have been promising, and weight assignment in PROLEXS frames is currently being done by neural networks. As was described in section 6.1.3 and demonstrated in section 5, at [5], the PROLEXS approach has been to compare an expert-supplied set of representative cases with the fact situation on the blackboard. A case in the case-base is considered relevant if the sum of the weights associated with the satisfied conditions exceeds a certain threshold. Such a case-frame is shown in Figure 5. Figures 8 and 9 illustrate the straightforward mapping of the PROLEXS case-frame of Figure 5 to perceptrons: the conditions correspond to input nodes, and the weights assigned to the conditions become weights associated with the links from the input nodes to the output node. The output of a node is calculated by distributing its input over the outgoing links proportionally to their associated weights. If a node's input exceeds some threshold the node fires, thereby transferring its input to its output links in the case of a non-terminal node. The output node should fire if and only if the case represented by the network is relevant.
FIGURE 8. Perceptron (boolean input). [Diagram not reproduced; input nodes: Tenant age > 65 (weight w1), Room temperature < 13 °C (weight w2), Season name = "winter"; output node: Relevant?]
FIGURE 9. Perceptron (scalar input). [Diagram not reproduced; input nodes: Tenant age, Room temperature, Season name; output node: Relevant?]
The algorithms to adjust weights (e.g. the delta-rule) use a set of examples of desired output for given cases (training sets). They compare the desired output (is the case relevant or not?) with the real output as calculated by the network. If the output does not match the desired outcome, the weights are adjusted proportionally to the error (i.e. the difference between calculated and desired output). The network will converge to a stable form in which the weights remain constant and each relevant case will correctly fire the output node, while irrelevant cases will not. [If too many diverse cases are presented or the cases contain a lot of inconsistencies (i.e. there is no underlying hidden rule), the network might not converge but oscillate between several states, not unlike the behavior of an expert confronted with inconsistent examples.] There are two ways to set up this type of network. One can label the input nodes as shown in Figure 8: all inputs are boolean, so either the node is activated (if the condition to which the node corresponds is satisfied with respect to the current fact situation) or it is not. A more powerful approach is displayed in Figure 9, in which the input nodes range over a set of possible values (age from 16 to 114 years, temperature from -40 °C to 45 °C, and season over {"winter", "spring", "summer", "autumn"}). The output of a node is usually a nonlinear function of its input: large negative input results in zero output, large positive input yields maximum output (i.e. 1.0); somewhere in between lies the threshold where output rises (steeply) from zero to its maximum. This nonlinearity gives perceptrons a modest ability to reflect nonlinear combinations of input.
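The weight adjustment described above can be illustrated with the perceptron learning rule, the thresholded form of the delta rule, applied to a boolean-input network like the one in Figure 8. The Python sketch below is an assumption-laden illustration: the training set (a case is relevant when at least two of three conditions hold), the learning rate and all names are invented, and the code is not the algorithm used in PROLEXS.

```python
from itertools import product
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[int], int]  # (boolean inputs, desired output)


def fires(weights: Sequence[float], bias: float, inputs: Sequence[int]) -> int:
    """Output node: fire (1) when the weighted input exceeds zero."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train(examples: List[Example], rate: float = 1.0,
          max_epochs: int = 200) -> Tuple[List[float], float]:
    """Adjust weights in proportion to the error until an epoch is error-free."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0  # plays the role of a negated threshold
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, desired in examples:
            error = desired - fires(weights, bias, inputs)
            if error != 0:
                mistakes += 1
                weights = [w + rate * error * x for w, x in zip(weights, inputs)]
                bias += rate * error
        if mistakes == 0:
            break  # stable: weights no longer change
    return weights, bias


# Hypothetical training set over three boolean conditions:
# a stored case is relevant when at least two conditions are satisfied.
EXAMPLES: List[Example] = [
    (bits, int(sum(bits) >= 2)) for bits in product((0, 1), repeat=3)
]

weights, bias = train(EXAMPLES)
```

Because this training set is linearly separable, the rule settles on stable weights; with inconsistent examples the loop would simply run until `max_epochs`, which corresponds to the oscillation mentioned in the text.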
Even though this ability is very limited (a proof of the perceptron's inability to represent "exclusive OR" ousted neural networks from AI research for years), it is greatly amplified when perceptrons are combined to form larger neural networks that use so-called "hidden layers", additional stages between input and output nodes (Figure 10). For these types of networks heuristic algorithms that adjust the weights are also available (e.g. back propagation).

FIGURE 10. Multi-layered network. [Diagram not reproduced; input nodes: Tenant age, Room temperature, Season name; output node: Relevant?]

A result of an experiment with three different training sets is shown in Figure 11. After the network was trained with one of those sets, a map was drawn by varying the age and room temperature while keeping other input constant. The color of the point defined by the values of age and temperature was determined by the output of the network. One training set contained examples from which it could be deduced that for a successful match it was necessary and sufficient for both the age to be higher than 60 years and the temperature to be below 17 °C. The network responded correctly by firing everywhere in the rectangular area in Figure 11 and not firing in the white area. A second set implicitly defined a linear combination of age and temperature: the minimum age for a match was proportional to the temperature. The network made the proper generalization by firing only in the black area of the figure. Then there was a set where the influences of high age and low temperature amplified each other. The resulting behavior of the network is shown by the arc in Figure 11: the network fired only in the dark grey and black area. More complex patterns, not shown in the figure, were also trained successfully (e.g. elliptic shapes of the firing area totally surrounded by a non-firing area). It is noteworthy that the network sometimes makes unexpected generalizations. This means that after training, when the network gives all the right answers to the input patterns of the training set, it is necessary to let the expert test it to check that it does not come up with strange answers. If it does, one simply adds the misjudged case with the right answer to the training set and lets the network train again.
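The gain in expressive power from a hidden layer can be seen in the classic "exclusive OR" example mentioned above. The Python sketch below uses weights and thresholds chosen by hand, purely for illustration; nothing here is learned by back propagation, and the function names are hypothetical.

```python
def step(total: float, threshold: float) -> int:
    """Threshold unit: fire (1) when the input exceeds the threshold."""
    return 1 if total > threshold else 0


def xor_network(a: int, b: int) -> int:
    """Two hidden units (OR and AND) feeding one output unit."""
    hidden_or = step(a + b, 0.5)    # fires if at least one input is on
    hidden_and = step(a + b, 1.5)   # fires only if both inputs are on
    # Output: OR minus AND, i.e. exactly one input is on.
    return step(hidden_or - hidden_and, 0.5)


truth_table = {(a, b): xor_network(a, b) for a in (0, 1) for b in (0, 1)}
```

A single threshold unit cannot produce this truth table, because no straight line separates the two firing inputs from the two non-firing ones; the hidden units supply the intermediate features that make the final decision linear again.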
FIGURE 11. Neural network output characteristics. [Plot not reproduced; horizontal axis: room temperature (ticks at 17 and 20); vertical axis ticks at 0, 50, 60 and 80. Key: linear combination of temperature and age; nonlinear combination of temperature and age; logical combination (age > 60 ∧ temperature < 17).]
8. Conclusions and numbers
The results of installing PROLEXS at a legal clinic were somewhat disappointing. This was almost entirely caused, however, by severe shortcomings of the user-interface. (Due to a failed attempt to port PROLEXS to another operating system it became necessary to include the original user-interface, which was solely intended to be used by the developers and much too elaborate, in the prototype version at the clinic.) However, since all sessions conducted by the students were recorded, it was possible to run most cases through the system again, supervised this time by the developers themselves. It turned out that a substantial number of cases could have been solved by PROLEXS had the system been used correctly. Nevertheless some problems remain.
(1) It turned out that the PROLEXS case-base was too small. Not enough cases were present and many of those present were too specific. Including cases is not a trivial business, because not only did the expert have to be able to "list" relevant cases, but the weights also had to be tuned manually. This often led to cases that were either too specific or were thrown out of the case-base because the weights could not be properly set. The weighting process now proceeds automatically, though, and cases can be stored in the case-base in much larger numbers.
(2) Even the very extensive coverage of the domain could not avoid the fact that some cases could only be solved using additional knowledge, sometimes from apparently independent domains. This suggests that full coverage of a heterogeneous domain, in combination with all knowledge outside the domain that is liable to be consulted in hard cases, is necessary for expert systems to live up to their full potential in such domains.
As we hope this article has demonstrated, the PROLEXS knowledge model with its classification network and logical groups (sections 4, 6.2 and 6.3) is able to handle large heterogeneous domains without significant signs of degrading performance. The current knowledge base (distributed over four knowledge groups, 34 logical groups, and two classification networks), which contains 1127 rules† and 30 case-frames, might serve as evidence for that claim, but is not yet complete enough to really challenge expert performance. We believe, however, that including neural networks to overcome some of the problems with case-based reasoning and adding more knowledge from neighboring legal subdomains will ensure a fair competition in due time.
† This represents the number of compiled rules (see section 6.1.2). This number is more convenient to compare with other expert systems than the number of uncompiled rules, since uncompiled rules can be made arbitrarily complex.

References
ASHLEY, K. D. (1988). Modelling Legal Argument: Reasoning with Cases and Hypotheticals, PhD thesis. Department of Computer and Information Science, University of Massachusetts.
BERG, P. H. VAN DEN & OSKAMP, A. (1986). PROLEXS, a user oriented approach. In U. ERDMANN, H. FIEDLER, F. HAFT & R. TRAUNMÜLLER, Eds. Computergestützte Juristische Expertensysteme. Tübingen: Attempto Verlag.
GARDNER, A. V.D.L. (1987). An Artificial Intelligence Approach to Legal Reasoning. Cambridge, MA: MIT Press.
GREENLEAF, G., MOWBRAY, A. & TYREE, A. L. (1987). Expert systems in Law: the Datalex project. In Proceedings of the First International Conference on AI and Law, p. 12. New York: ACM Press.
HAYES-ROTH, F., WATERMAN, D. A. & LENAT, D. B. (Eds.) (1983). Building Expert Systems, p. 5. Reading, MA: Addison-Wesley.
LEITH, P. (1984). Cautionary notes on legal expert systems. Computers and Law, 40, 14.
OSKAMP, A. & VANDENBERGHE, G. P. V. (1986). Legal thinking and automation. In A. A. MARTINO & F. SOCCI NATALI, Eds. Automated Analysis of Legal Texts. Amsterdam: North Holland.
OSKAMP, A. (1986). Expertsystemen en hun toepassing in het recht. Ars Aequi, Special Rechtsinformatica, 35, 692-705.
OSKAMP, A. (1989). Knowledge representation and legal expert systems. In G. P. V. VANDENBERGHE, Ed. Advanced Topics of Law and Information Technology. Deventer: Kluwer.
OSKAMP, A. (1990). Het ontwikkelen van juridische expertsystemen. Deventer: Kluwer.
RISSLAND, E. L. & SKALAK, D. B. (1988). Case-Based Reasoning in a Rule-Governed Domain, p. 3. University of Massachusetts.
RISSLAND, E. L. & ASHLEY, K. D. (1988). Credit assignment and the problem of competing factors in case-based reasoning. In Proceedings of the Case-Based Reasoning Workshop, DARPA, Clearwater Beach.
RUMELHART, D. E. & McCLELLAND, J. L. (Eds.) (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Cambridge, MA: MIT Press.
SCHRICKX, J. A. (1990). PROLEXS-formalization, a way to enhance validation. In D. KRACHT, C. N. J. DE VEY MESTDAGH & J. S. SVENSSON, Eds. Legal Knowledge Based Systems, an Overview of Criteria for Validation and Practical Use. Lelystad: Koninklijke Vermande b.v.
SUSSKIND, R. E. (1987a). Expert systems in law: out of the research laboratory and into the marketplace. In Proceedings of the First International Conference on Artificial Intelligence and Law. New York: ACM Press.
SUSSKIND, R. E. (1987b). Expert Systems in Law. Oxford: Clarendon Press.
WALKER, R. F. & BERG, P. H. VAN DEN (1988). PROLEXS, an object oriented legal expert system. In H. HERRESTAD & D. MAESEL, Five Articles on AI and Law, Complex series 1988/5, Oslo.
WALKER, R. F., ZEINSTRA, P. G. M. & BERG, P. H. VAN DEN (1989). A model to model knowledge about knowledge. In G. P. V. VANDENBERGHE, Ed. Advanced Topics of Law and Information Technology. Deventer: Kluwer.
Appendix
The rules and the frame used in the example dialogue (section 5) are listed below. Each PROLEXS fact can be of type INTEGER, FLOAT, STRING, BOOLEAN, DATE or PERIOD. The preconditions consist of a fact, a condition and a method. A condition can either be an expression like (>80) or (() "leaking roof"), a constant like YES, or VOID. A VOID condition means that any value will suffice. A method can be ASK, DERIVE, RESUME or LOOKUP. The DERIVE method means that the fact in question should be derived from other knowledge units. RESUME will first ASK for a value, possibly followed by DERIVE if the answer is not known. A LOOKUP method will look in a table. The zip-code/quality-index table, for instance, lists a quality index for any zip-code. The action part of a rule consists of a fact which is assigned a constant or the result of an expression in terms of other facts. The keyword PROCEDURE in the action of a rule is used to indicate that the procedure associated with the fact will be fired. In reality, however, events (one of which is PROCEDURE) are not included in rules but are attached to facts. The moment a fact is written on the blackboard the associated event is triggered.

RULES
Logical group: rent reduction

rule 1
If   (landlord-tenant-law applicable)    YES   DERIVE
     (house apartment)                   YES   DERIVE
     (requirements formal satisfied)     YES   DERIVE
     (quality-index)                     void  DERIVE
then (rent reduction) := ((rent) > (quality-index) * 5.02)
fi

rule 2
If   (landlord-tenant-law applicable)    YES   DERIVE
     (house apartment)                   YES   DERIVE
     (requirements formal satisfied)     YES   DERIVE
     (maintenance deficiency severe)     void  DERIVE
then (rent reduction) := (maintenance deficiency severe)
fi

rule 3
If   (date current)                      void  ASK
     (contract date)                     void  ASK
then (requirements formal satisfied) :=
         ((date current) date- (contract date)) smaller_than 00/03/00
     (requirements formal satisfied) PROCEDURE
fi

Logical group: applicability-land-lord-tenant-law

rule 4
If   (house usage living)                YES   ASK
     (rent periodical)                   void  ASK
     (applicability exception)           void  DERIVE
then (landlord-tenant-law applicable) :=
         (house usage living YES) AND (rent periodical YES) AND (applicability exception NO)
fi

rule 5
If   (entrance exclusive)                void  ASK
     (shower shared)                     void  ASK
     (kitchen shared)                    void  ASK
then (house apartment) :=
         (entrance exclusive YES) AND (shower shared NO) AND (kitchen shared NO)
else (house room) := YES
fi

rule 6
If   (house usage short-termed-by-nature) void ASK
then (applicability exception) := (house usage short-termed-by-nature)
fi

Logical group: quality-index

rule 7
If   (house average-state)               YES   ASK
     (house zip-code)                    void  ASK
then (quality-index) := LOOKUP (house zip-code)
fi

Logical group: maintenance deficiencies

rule 9
If   (maintenance deficiency)             YES   ASK
     (maintenance deficiency description) void  ASK
then (maintenance deficiency severe) :=
         (maintenance deficiency description "drainage absence")
      OR (maintenance deficiency description "stairs dangerous")
      OR etc.
fi

rule 10
If   (maintenance deficiency)            YES   ASK
then (landlord notification) PROCEDURE
fi
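How the ASK and DERIVE methods interact can be sketched with rules 4 and 6. The Python below is a simplified illustration and not the PROLEXS rule interpreter: the `Blackboard` class, the dictionary of simulated user answers and the function names are all hypothetical, YES and NO are represented as booleans, and the precondition check and the action expression of rule 4 are folded into one expression.

```python
from typing import Callable, Dict


class Blackboard:
    """Holds known facts; ASK reads a simulated user answer,
    DERIVE calls the rule that concludes the fact."""

    def __init__(self, answers: Dict[str, bool]) -> None:
        self.answers = answers          # stands in for the user dialogue
        self.facts: Dict[str, bool] = {}

    def ask(self, fact: str) -> bool:
        if fact not in self.facts:
            self.facts[fact] = self.answers[fact]
        return self.facts[fact]

    def derive(self, fact: str) -> bool:
        if fact not in self.facts:
            self.facts[fact] = DERIVERS[fact](self)
        return self.facts[fact]


def rule_6(bb: Blackboard) -> bool:
    # (applicability exception) := (house usage short-termed-by-nature)
    return bb.ask("house usage short-termed-by-nature")


def rule_4(bb: Blackboard) -> bool:
    # (landlord-tenant-law applicable) := living AND periodical AND NOT exception
    return (bb.ask("house usage living")
            and bb.ask("rent periodical")
            and not bb.derive("applicability exception"))


# Which rule concludes which fact (only the two rules sketched here).
DERIVERS: Dict[str, Callable[[Blackboard], bool]] = {
    "applicability exception": rule_6,
    "landlord-tenant-law applicable": rule_4,
}

session = Blackboard({
    "house usage living": True,
    "rent periodical": True,
    "house usage short-termed-by-nature": False,
})
applicable = session.derive("landlord-tenant-law applicable")
```

Deriving the applicability fact triggers rule 4, which asks two questions and then derives the exception through rule 6; every answer ends up on the blackboard, so no question is asked twice.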
Frames

Logical group: applicability-landl~

Frame 1
Facets                                                Weights
(landlord residence distance-from-tenant > 3000)       60
(landlord residence distance-from-tenant > 300)        10
(landlord residence distance-from-tenant < 301)       -10
(contract length longer_than 00/00/04)                 60
(contract length longer_than 00/00/02)                 40
(contract length longer_than 00/00/01)                 20
(contract length smaller_than = 00/00/01)             -20

Threshold: 105
Abstract (contract status fixed-period YES) to (house usage short-termed-by-nature NO)
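As a worked example, the matching of Frame 1 can be computed by summing the weights of the satisfied facets and comparing the sum with the threshold of 105. The Python sketch below rests on assumptions: facets and weights are paired in listing order, a period such as 00/00/04 is read as the plain number 4 in its last field (the unit is not stated here), distances are treated as plain numbers, and the function names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Facet = Tuple[str, Callable[[float], bool], int]  # (fact, test, weight)

# Facets of Frame 1 paired with the weights in listing order (assumed pairing).
FRAME_1: List[Facet] = [
    ("distance", lambda v: v > 3000, 60),
    ("distance", lambda v: v > 300, 10),
    ("distance", lambda v: v < 301, -10),
    ("length", lambda v: v > 4, 60),    # longer_than 00/00/04
    ("length", lambda v: v > 2, 40),    # longer_than 00/00/02
    ("length", lambda v: v > 1, 20),    # longer_than 00/00/01
    ("length", lambda v: v <= 1, -20),  # smaller_than = 00/00/01
]
THRESHOLD = 105


def frame_score(facts: Dict[str, float]) -> int:
    """Sum the weights of all facets satisfied by the current facts."""
    return sum(weight for key, test, weight in FRAME_1 if test(facts[key]))


def frame_matches(facts: Dict[str, float]) -> bool:
    """The frame matches when the score exceeds the threshold."""
    return frame_score(facts) > THRESHOLD


far_and_long = {"distance": 3500, "length": 5}   # 60 + 10 + 60 + 40 + 20
near_and_long = {"distance": 100, "length": 5}   # -10 + 60 + 40 + 20
far_and_short = {"distance": 3500, "length": 1}  # 60 + 10 - 20
```

Under these assumptions a long contract is enough for a match even when the landlord lives nearby (score 110), whereas a large distance alone is not (score 50). When the frame matches, the abstraction listed with Frame 1 would be applied.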