ARTIFICIAL INTELLIGENCE
Book Review

Hayes-Roth, R. and Waterman, D. A., Pattern-Directed Inference Systems. Academic Press, New York, $33.00, £21.45.

In May 1977 a week-long workshop was held which gathered together most of the researchers concerned with what the organisers called Pattern-Directed Inference Systems. This book contains a selection of 28 papers from the workshop, bracketed by an Introduction and a Conclusion written by the editors with the assistance of Doug Lenat. (The papers presented at the workshop which are not included in this volume have already appeared elsewhere [2], as has Lenat's incisive and spirited report of the workshop itself [3].)

In practice most of the papers turn out to be concerned with a particular species of pattern-directed systems known as Production Systems. A Production System (PS) is essentially a program expressed as a collection of antecedent → consequent rules. It differs from other rule-based systems, such as those derived from the predicate calculus, primarily in that an interpreter (or executive) is explicitly included as part of the system, making the operation of the PS deterministic. One could say that whereas the rules of predicate logic list permissible inferences, those of PSs are obligatory.

A strong flavour of applied work permeates the book, and its effects would appear to be wholly beneficial. Those papers which describe particular projects tend to come across better than those which discourse about PSs-in-general. Most of the tasks tackled--geological prospecting, speech understanding, dialogue handling--come complete with real-world constraints. The result is that the occasional paper dealing with an abstract or toy problem jars one by its apparent triviality.

In addition this orientation toward application gives the papers a degree of coherence. Despite the divergences which Lenat [3] reports, there are wide areas of agreement.
One example is the shared opinion that the strength of PSs lies in the modularity of their rules, and that this makes them well suited to tasks in which the knowledge embedded in the rules is acquired piecemeal, either from a human expert or automatically from another program. Thus PSs are not generally intended for "implementing algorithms" and are not to be judged by the criteria normally appropriate for evaluating programming languages. This agreement is illustrated by the fact that the epithet "hand-crafted", bandied about at the workshop to refer to programs constructed by dint of much human ingenuity, was unanimously adopted as a term of abuse!

Artificial Intelligence 12 (1979), 197-202. Copyright © 1979 by North-Holland Publishing Company

The one Bad Thing about the book is the grouping of the chapters. The papers have been sorted by inconsistent criteria, resulting in non-exclusive categories such as Deductive Inference, Natural Language Understanding, Multilevel Systems, and so on. The effect of 28 highly technical chapters arranged in more-or-less arbitrary order is somewhat bewildering. Although the editors' Introduction and Conclusion are workmanlike pieces which achieve what they set out to do, the job of unearthing the main strands of the book and linking the papers together is left entirely to the reader. In this review I will try to assist him in the task by offering some guidance on what I see as the main themes and mentioning a few of the connections between them. Individual chapters will be discussed under the headings of Expertise, Learning, Purity, Theory, Technology and Suppleness, in that order.

A striking feature of some of the programs described is that they exhibit (or are designed to exhibit) a high level of performance on "real" tasks. Most of the credit for this Expertise must belong to the group of workers at Stanford centered round Feigenbaum. Their DENDRAL program for interpreting mass spectrographs and MYCIN for advising on the treatment of bacterial infection are both well known. In this volume the tradition is continued in the papers by Duda, Hart, Nilsson and Sutherland on an "expert" program for geological prospecting, and by Nii and Feigenbaum on a pair of related programs for interpreting protein X-ray crystallographic data and what for "security reasons" are referred to as "continuous signals produced by objects". Both are appropriately impressive.

It is pointless to try and distinguish between Expertise and Learning, since the Stanford approach to building an expert program is to view the task as one of transferring knowledge from human experts into a growing program (a theme expounded by Feigenbaum in [1]).
Davis makes the connection explicit in his discussion of TEIRESIAS, a program which "facilitates the interactive transfer of expertise from a human expert to the knowledge base of the system", though in the opinion of this reviewer the topic of "knowledge acquisition" is too much dominated by the discussion of data management techniques similar to the work one associates with Sandewall. We also find Buchanan and Mitchell explaining meta-DENDRAL, the program which has the task of improving DENDRAL's performance, as a heuristic search among sets of rules guided by a strong theory of the chemistry of the domain. Waterman tackles the problem of having a program acquire the ability to serve as a computer user's "agent" by generalising appropriately from simple examples of routine operating procedures. The task is a modest one and allows Waterman to utilise effectively the clear and elegant techniques developed in his own earlier work on "adaptive" PSs.

A paper by Vere on inductive learning is the solo representative here of a highly respected tradition, that of seeking to understand a technique by subjecting a simplified version to formal analysis. Vere's technique manages to recover the three blocks world operations (moving a block from on top of another block to the table, from the table to a
block, or from one block to another) from a sequence of snapshots of 1S consecutive situations. Vere notes and bemoans the fact that the majority of PSs employed by other workers do not possess mathematical tractability. But the contrast between the constraints Vere places on his rules and the devices that other workers find to be necessary suggests that a long time will be needed before the gap between the practical and the analysable can be bridged. One reservation about this work: Vere's technique can generalise only over operations which result in exactly the same number of additions and deletions of terms in the predicate-logic description of the situation. It is not obvious that this simple constraint alone would not be adequate as a basis for recovering the three classes of operations.

The next three themes are closely inter-related and need to be considered together.

Purity was one of the themes dominating the workshop. Roughly speaking, a PS is "pure" to the extent that its characteristics accord with a notional "spirit of PSs", meaning such things as having short simple conjunctive antecedents, having rules being opaque to other rules, using PSs for the whole of a program rather than just a component, and so on. Only two papers deal with Purity directly, and as an actual description of anyone's position, it is a barely recognisable caricature. Lenat and Harris define the issue and urge that in writing a high-performance program, PSs should be used only for those parts of the program for which they are best suited, an argument which few would seriously disagree with. On the other side, the paper most committed to Purity is the one by Rychener and Newell describing their plans for an instructable [sic] PS. By this they mean a system which learns from a teacher who knows nothing of its internal workings, though this is a little hard to square with their comment about relying "on the closeness of external (language) expressions to internal forms".
Without disputing the achievements of the Stanford approach, the authors believe that the way to build a system which can be instructed to pull itself up by its own bootstraps is to work with highly pure PSs. The paper is rather puzzling in this respect. No reasons are offered for the belief (though such reasons exist and derive from essentially psychological considerations), so that Rychener and Newell effectively put themselves out on a limb, gambling on the eventual success of their program to vindicate their approach to an extent that is unusual even in such a suck-it-and-see discipline as AI.

The true significance of Purity lies in its close relation to the individual researcher's view of what his work with PSs is really about. The key question seems to be whether the working program, containing or consisting of the PS, itself has status as an object of interest, or whether it is seen as a theory or model of some other process. This other process may be psychological, but is not necessarily so. And here, in the opinion of this reviewer, is where the Great Divide really lies.

On the side of Theory, we find Schank and Wilensky admitting that for understanding stories "knowledge in a form other than a script is necessary", and that
this other form must include information about the characters' goals. How this chapter strikes the reader will probably be a direct function of how he feels about the rest of Schank's work.

Faught attempts to model the patterns of dialogue found in psychiatric interviews. His paper is especially interesting for its analysis of multiple co-occurring strands of behaviour with different patterns overlapping and co-existing at different levels, and for its attempt to grapple with the multiple causation of behaviour. His interpreter tries to find an action which simultaneously satisfies all the active patterns. The outcome can be the modulation of one aspect of behaviour by another (a topic touched on also in Joshi's paper, discussed below) as when one answers a question, but angrily.

The subject of Mostow and Hayes-Roth's model is itself an artificial system, the HEARSAY-II speech recognition program (a program incidentally whose influence, both acknowledged and unacknowledged, is felt throughout the book). What these authors are telling us, in effect, is that the appropriate way to understand the syntax and semantics component of HEARSAY is in terms of PSs, even though for technical reasons it is implemented differently. Like natural systems, HEARSAY is sufficiently complex that Theory is needed in order to explain it.

Not surprisingly, several of the papers deal with what one might call the Technology of PSs, such as techniques for getting them to run faster. A paper by McDermott and Forgy analyses the advantages and disadvantages of various methods for conflict resolution, i.e. for selecting which rule(s) to fire from among the many which may be applicable at any instant. McDermott and Forgy recommend mixtures of methods to yield better characteristics than can be provided by any one method alone. This paper implicitly takes a strong Purist position, and was criticised by people from the Stanford group on the grounds that its techniques are too "syntactic".
What is needed, they argue, are "semantic" methods in which choices between conflicting rules are resolved by meta-rules which embody (meta-) knowledge about strategies, and so on, as outlined in Davis's paper. Another paper from the same stable, by McDermott, Newell and Moore, analyses the performance of various simple indexing schemes for speeding up a class of PS interpreters. There is nothing particularly subtle about it, but the clear statement of assumptions, the explicit development of the mathematical model, the comparison with empirical data and the exploration of the discrepancies, all combine to make this an outstanding paper. Together these two papers indicate a considerable degree of maturity in the Technology of this class of PSs.

The question of efficiency highlights the distinction between the Theory position and its opposite--what one might call self-sufficiency. For those with a "theoretical" position the issue of run-time efficiency, while an interesting and possibly important practical matter, is a question of implementation and is quite distinct from questions concerning the inherent suitability of PSs as a representation. On the other side of the fence, however, this distinction tends to get blurred, and the effect in some cases is a curious situation where the author of a paper on implementation
efficiency seems to be unaware that that is what his paper is about. Needless to say, such papers are less successful than those which recognise their topic for what it is. Thus we have the paper by Rieger, which appears to be about semantically based inference but is in fact concerned with improving efficiency by the use of indexing structures similar to those discussed by McDermott et al. Similar comments apply to the paper by Zisman, who proposes the ingenious idea of using a Petri net to select out just a subset of the rules to be potentially active at any one time. And again with Riesbeck's paper on expectation-driven understanding, the intended status of the proposed indexing mechanisms is unclear. Are they to save time on the PDP-10, or are they actually meant as part of the conceptual theory?

Suppleness is not a good term, but I cannot find a more satisfactory one to express the belief that there is something too rigid, too cut-and-dried, about our present way of using PSs. Two papers focused on the all-or-nothing nature of the antecedent as the culprit. Joshi, in his paper on inference from partial information, allows rules to apply whose antecedents are only partly satisfied. His interpreter finds a minimal set of such rules which together "cover" the data base, and then infers the union of their consequents. Joshi stresses two effects of doing this. First, which rules fire on a given cycle depends on the entire set of rules and their implicit relationships, rather than being a property of individual rules. Second, the multiple responses can be interpreted in terms of one response being modulated by another, as in the paper by Faught discussed above. Hayes-Roth, in a paper which is a notable exception to our earlier generalisation that the essay-like papers are relatively unsuccessful, discusses with admirable clarity the importance and difficulty of the problems of partial matching and best matching. There is, however, a flaw in the logic of his paper.
He sets out by confronting directly the question of Suppleness: How can action "be influenced by many sources of knowledge without being dominated by any"? But this issue is lost sight of in the subsequent discussion. A "best match" commits one particular rule to firing just as much as does all-or-none matching.

One notable omission from the book, surprising in view of the historical origins of PSs in AI, is the lack of any serious use of rule systems for psychological modelling. There is indeed a section on Cognitive Modelling, but of the three papers in it the first two, by Barbara Hayes-Roth and Perry Thorndyke respectively, are of at best tangential relevance to the topic of the book, while the third is simply another partial report of the Rutgers BELIEVER project to add to the several already available. There are also plenty of allusions to human behaviour, cognitive theory, and the like scattered throughout the book, but these are invariably in the form of offhand, throwaway remarks added to the text like so much seasoning to the meat. Buchanan and Mitchell for example, in their (otherwise excellent) paper, describe the division of their program into five subtasks, and then add: "which we feel are closely analogous to this aspect of human problem solving". This assertion is made without evidence or argument, and is not referred to again.

But this objection need not worry the majority of the people for whom the volume is intended. For a collective work of this kind an unusually large proportion of the papers are of high quality. At least 14 of the chapters I have discussed individually are well worth reading. Most of the vices which plague AI writing are successfully avoided. No one waffles about Fuzzy Reasoning or Catastrophe Theory. There is, inevitably, the usual mixture of reportage of working programs and wishful thinking about systems yet to be implemented, but the fantasy is kept well under control and the differentiation is sharper than is often the case. Altogether the book depicts well the State of the Art in an exciting and rapidly growing area of Artificial Intelligence.

RICHARD YOUNG
MRC Applied Psychology Unit
15 Chaucer Road
Cambridge CB2 2EF, U.K.

REFERENCES

1. Feigenbaum, E. A., The art of artificial intelligence: I. Themes and case studies of knowledge engineering, Proceedings of the Fifth International Joint Conference on Artificial Intelligence, Cambridge, MA. (1977) 1014-1029.
2. Hayes-Roth, R. and Waterman, D. A., Proceedings of the Workshop on Pattern-Directed Inference Systems, SIGART Newsletter (Special issue) 63 (June 1977).
3. Lenat, D. B., Pattern directed inference rules the waves, AISB Quarterly 28 (October 1977) 8-13. (Reprinted in SIGART Newsletter 65 (April 1978) 1-5.)