The desire for easier interaction with computers means that they will model human language and use of knowledge
Natural language and knowledge-based systems
by JEREMY CLARE and NICK OSTLER
The idea of an intelligent computer system has been around for a long time. Until recently it has seemed to be a gleam in the eyes of academics and science fiction writers. However, recent advances in both hardware and software development mean that it is now possible to develop computers which include aspects of intelligence. This article looks at two key areas of advance in terms of what has been achieved and how we can go forward to real applications. The two areas are natural language and intelligent knowledge-based systems.
Abstract: Two new areas of computer application, natural language processing and intelligent knowledge-based systems, are discussed. Future applications and areas of development are highlighted.
Keywords: natural language processing, intelligent knowledge-based systems, expert systems, computer software.
Jeremy Clare is a psychologist and Nick Ostler is a linguist in the human factors group, Scicon Ltd.
0011-684x/84/02004603$03.00
Natural language
The phrase ‘natural language’ has been in widespread use by specialists in linguistics, who use the term to mean the languages that humans use to talk and write to each other (Chinese, English or one of the many minority options), rather than the formal languages that have been invented to reason in, or to program machines (mathematical formulae, COBOL, LISP etc.). Yet now we are told there are machines that can handle natural language. What is the truth behind this claim?
To save disappointment later, two points must be mentioned at the outset. First, recognizing spoken input of anything beyond isolated words is still a fundamental research problem; for the next few years the ‘natural language processor’ will ignore the problem and take its input from written sources: a keyboard or a text. Second, machines can at best simulate a limited repertoire of what people do with language. At the higher levels, it is very difficult to separate natural language abilities from the general human capacity for thought. Interpreting what someone must have meant, and constructing an appropriate reply, involves a very extensive mental model of the world, the other speaker and the conversation, and a quantity of commonsense about what to do with them. These are not seriously implementable.
© 1984 Butterworth & Co (Publishers) Ltd.
To be worthy of the name, though, natural language processing must go beyond using English words in a programming language, or making sure that the order of arguments in an editor command does not conflict with the order of words in English. Users must be free to type in questions very much as they come to mind, and be free from any worries about how the computer has stored and processes the material it uses for its answers. The computer, like a good friend, should at least warn users when they seem to be straying from its realm of knowledge. All this is now possible.
Computing doyens, such as Edsger Dijkstra, have pointed out that this implies a relaxation of discipline on the part of the user, and warned of its effects; some have done experiments to show these sad effects in reality. But natural language has two great advantages over formal languages, and these provide the key to its promising use in computing.
First, every user knows a natural language. Casual use of information resources stored on computer systems will not come about until they are as usable as one’s local library, and much more usable than the technical libraries typically available to industry. Arbitrary decisions, necessary to the design of any formal system, mean that even those who are ‘computerate’ will never be able to avoid some learning frustrations when they try to adapt themselves to a system new to them. This is why natural language front-ends to databases and information systems hold out such promise.
Second, natural language, for all its weaknesses, is unequalled in picking out what is important in the complexity of the human environment; it does not need all the rigorous assumptions and boundary conditions needed to run formal models or apply formal languages. Hence, progress towards making machines capable of interpreting this economical expression of general thinking is ipso facto progress towards applying the power of computers in symbol manipulation to the problems of the real world, as they appear directly to its human actors.
This is most directly seen in attempts to automate the higher levels of computer applications design. Software engineers developing requirements specifications have already noticed a major convergence of their interests with those of knowledge engineers working in artificial intelligence, and the first joint conference was held in 1982. Medical applications are also prominent, whether to allow patients direct input to a diagnostic expert system (as in MYCIN) or to allow automatic abstracting of medical research papers (a speciality of New York University). Inevitably, there is a more sinister side to this progress; security agencies will find it easier to integrate human intelligence data with that gathered from electronic or signal sources, thus enhancing their power, for better or worse.
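The behaviour asked of a natural language front-end above (accept a question typed much as it comes to mind, hide how the data is stored, and warn the user on straying from the system's realm of knowledge) can be illustrated with a deliberately tiny sketch in Python. Everything in it is an assumption made for illustration: the `BOOKS` table, the `KNOWN` vocabulary and the `answer` function are hypothetical and do not describe any of the systems named in this article, which rely on real grammars rather than keyword matching.

```python
import re

# Hypothetical toy database; the records and field names are invented.
BOOKS = [
    {"title": "Compilers", "author": "Aho", "year": 1977},
    {"title": "Lisp", "author": "Winston", "year": 1981},
    {"title": "Structured Programming", "author": "Dijkstra", "year": 1972},
]

# The system's "realm of knowledge": author names plus a small fixed vocabulary.
AUTHORS = {row["author"].lower() for row in BOOKS}
KNOWN = {
    "which", "what", "is", "in", "the", "show", "me", "list", "all",
    "book", "books", "title", "titles", "were", "was", "written",
    "published", "by", "after", "before",
}


def answer(question):
    """Return ("ok", titles) or ("out_of_domain", unknown_words)."""
    words = re.findall(r"[a-z0-9]+", question.lower())

    # Warn the user instead of guessing when a word is outside the vocabulary.
    unknown = [w for w in words
               if w not in KNOWN and w not in AUTHORS and not w.isdigit()]
    if unknown:
        return ("out_of_domain", unknown)

    # Translate simple phrases ("by X", "after N", "before N") into filters.
    rows = BOOKS
    for word, nxt in zip(words, words[1:]):
        if word == "by" and nxt in AUTHORS:
            rows = [r for r in rows if r["author"].lower() == nxt]
        elif word == "after" and nxt.isdigit():
            rows = [r for r in rows if r["year"] > int(nxt)]
        elif word == "before" and nxt.isdigit():
            rows = [r for r in rows if r["year"] < int(nxt)]
    return ("ok", [r["title"] for r in rows])
```

Here `answer("Which books were written by Dijkstra?")` returns `("ok", ["Structured Programming"])`, while `answer("What is the weather in London?")` returns `("out_of_domain", ["weather", "london"])` rather than a misleading result. The user never sees how the records are stored; the hard part, which this sketch sidesteps, is covering the variety of real English.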
Current research
In Japan, natural language systems have been marked out as a theme for the second phase (1985-1989) of the ten-year fifth generation project. Research work currently centres on a few universities and research institutes, and the emphasis, unlike elsewhere in the world, is on text processing rather than interaction, specifically for translation between Japanese and English.
vol 26 no 2 march 1984
The USA is the major site for advanced natural language systems. Here, some institutes (most impressively, Bolt Beranek and Newman of Cambridge, MA and SRI of Palo Alto, CA) have over a decade of continuous development experience, resulting in general-purpose systems (RUS, Dialogic) which can be adapted to various subject domains and database systems. However, dialogue skills remain a weakness. Besides these ambitious programmes, there are dozens of other projects to develop special-purpose natural language interfaces. While these may produce impressive results in a small compass, they are not a likely basis for assaults on the central problems in the field.
General-purpose database access systems are already beginning to be available commercially; the first are the closely-related Intellect of the Artificial Intelligence Corp., Waltham, MA, and On Line English from the Cullinane Corp. At the other end of the price scale, it is expected that Symantec of Sunnyvale, CA will soon announce a cheap (under $500), if limited, natural language interface to run on a micro, which will allow users to add their own custom vocabulary.
In Europe, there are a number of significant centres of research, though few of these can rival the best achievements in the USA. Among the most interesting is the research group at the University of Hamburg. One of their systems (HAM-ANS) uses a single German language component to place hotel reservations, comment on the visual data from a traffic scene and access a fisheries database.
In the UK there are pilot systems running at a few universities, notably the University of Cambridge for database access. With such a small community, the pressing need is in fact for cooperation to produce service subcomponents, e.g. a large English grammar lexicon, which can be used as tools for the construction of systems with a more interesting scope. As a result, natural language research looms large in the centrally funded research programmes both in Europe (ESPRIT) and the UK (the Alvey initiative).
Intelligent knowledge-based systems
Intelligent knowledge-based systems (IKBS) is the new buzz word for expert systems. The need for a new name has grown out of the realization that the idea of replicating an expert in a computer system has, at present, major limitations. However, a less ambitious goal of replicating the expert’s knowledge base and using it intelligently is achievable.
Before we become embroiled in the differences between expert skills and knowledge, it is necessary to consider why we should want to replicate an expert in a computer system. When computers became a reality, much was claimed for their ability to solve problems and generally ease the lot of humankind. However, as computers developed it was realized that although they could manipulate large amounts of data in a variety of ways, they did not seem able to solve problems in an intellectually elegant manner. The trouble was that a problem could only be solved if it was expressed unambiguously in a mathematical form, and then it required a large amount of processing. Although, for many problems, this is a satisfactory state of affairs, doubts still exist since some problems cannot be resolved this way. The question is to resolve the difference between the mathematically exact, computationally expensive algorithmic approach and that of the expert, which seems to be based on intuition, an elegant use of limited data and effective but inexact rules. The expert’s approach is seen as being based on rules of thumb, termed heuristics, and the manipulation of knowledge rather than data.
Even if we can build computer-based systems to replicate the expert, are they so necessary, given the enormous strides forward in increased processing power relative to both size and cost? There are four areas which can be identified where such systems would be needed:
• Remote systems, e.g. space probes, undersea inspection, where size and power constraints are a major factor.
• Quick response systems, e.g. process control, military command and control, where time does not allow for exhaustive processing.
• Algorithmically intractable problems, e.g. legal analysis, medical diagnosis, where the problem cannot be reduced to a mathematical expression.
• User comprehension, e.g. fault finding, planning, where it is necessary for a user to make decisions on incomplete data.
So far there have been a number of IKBS implementations. However, all of them have been done from a purist point of view, i.e. a problem has been defined that exemplifies some theoretical issue of knowledge representation. The most practical applications have concentrated on diagnosis in either the medical or the geological field, e.g. MYCIN, PROSPECTOR, because they exercise these theoretical issues without requiring other complex processing.
In dealing with the wider application of IKBS solutions, it is necessary to consider the requirements that will be imposed in various situations. From a consideration of the example situations used above it is clear that there is no unique set of requirements. Thus, it is necessary to examine the purpose of each application and the constraints on its operation, and to reach a definition of the specific requirement. It is most likely that an IKBS solution will be considered only in complex applications where purposes are multiple, and the user interaction extensive (even in the remote autonomous system where the interaction will be permission). An analysis of the various applications shows that some of the fundamental problems of wide usage include the following:
• Real-time reasoning - in process control or military command and control, it is important that decisions are made in a timely fashion. Thus, it is necessary that the reasoning process is controlled by a knowledge of how much time is available.
• It is important to be able to interrupt the reasoning process to get a quick answer as a result of a change of circumstances.
• It will be important to mix reasoning using knowledge with algorithms which produce precise solutions; there is no need to throw away precision in the cause of intellectual elegance.
• There will need to be direct interaction with the real world. This may not seem difficult, but we need to deal with the fact that the importance and relevance of information changes as time passes.
• There need to be significant advances in how we describe and formalize expertise. The ability to design an IKBS does not mean that we will necessarily include useful or valuable expertise. In many cases, the expertise contains an element of intuition. By their very nature such intuitions cannot be made explicit by simply asking about them. It is of note that if expertise could be formalized simply, all we would have to do is to read the relevant text books.
Given this set of problems, what can we expect from the real application of IKBS solutions? None of the problem areas is insurmountable, and neither is their full solution necessary for all applications. What we can expect is that there will be steady progress rather than a major step forward. For that steady progress to occur, the potential users of such systems must understand what they can expect, what the benefits may be and, thus, create a demand.
Scicon Ltd, 49 Berners Street, London W1P 4AQ, UK. Tel: 01-580 5599.
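As an illustration of two ideas from the discussion of IKBS above, rules of thumb applied to knowledge rather than data and a reasoning process controlled by the time available, the following Python sketch chains forward over a handful of heuristic rules and stops when its time budget runs out, returning the best conclusion reached so far. It is a minimal sketch under stated assumptions: the fault-finding rules, their confidence figures and the names `RULES` and `reason` are all hypothetical, and nothing here describes how MYCIN, PROSPECTOR or any other real system works.

```python
import time

# Hypothetical rules of thumb for fault finding:
# (conditions that must all be known, conclusion, confidence in it).
RULES = [
    ({"no_power_light"}, "fuse_suspect", 0.6),
    ({"no_power_light", "fuse_ok"}, "supply_suspect", 0.8),
    ({"supply_suspect", "cable_ok"}, "power_unit_faulty", 0.9),
]


def reason(facts, rules, budget_seconds, clock=time.monotonic):
    """Forward-chain until nothing new follows or the time budget is spent.

    Returns ((conclusion, confidence), finished). `finished` is False when
    the deadline cut the reasoning short and the answer is only the best
    one found so far.
    """
    deadline = clock() + budget_seconds
    known = set(facts)
    best = (None, 0.0)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion, confidence in rules:
            # Check the clock before each rule so a caller gets a quick answer.
            if clock() >= deadline:
                return best, False
            if conditions <= known and conclusion not in known:
                known.add(conclusion)  # a conclusion becomes a new fact
                changed = True
                if confidence > best[1]:
                    best = (conclusion, confidence)
    return best, True
```

Given the facts `{"no_power_light", "fuse_ok", "cable_ok"}` and a generous budget, the sketch works through all three rules and settles on `power_unit_faulty`; with a very tight budget it returns an earlier, less confident conclusion and reports that it did not finish. The `clock` parameter is there only so that the deadline behaviour can be exercised with a simulated clock.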
data processing