Response to comments by Buchanan, Feigenbaum and Lederberg


Chemometrics and Intelligent Laboratory Systems, 5 (1988) 37-38
Elsevier Science Publishers B.V., Amsterdam - Printed in The Netherlands

N.A.B. GRAY

Department of Computing Science, University of Wollongong, P.O. Box 1144, Wollongong, N.S.W. 2500 (Australia)

THE DENDRAL STRUCTURE GENERATOR CONGEN AS AN “AI” PROGRAM

As a discipline in computing science, artificial intelligence (AI) is concerned with the explicit representation and use of “knowledge” for problem solving. The knowledge needed for any particular problem is obviously domain specific; AI researchers are concerned with general problems of knowledge representation and use. For some researchers the issue is simple - the only representation is logic, and all use must be through the inference procedures defined in the predicate calculus. Others, of a more empirical, practical persuasion, utilise rules with confidence factors that are applied through production rule interpreters. The purpose of a particular AI program may be to configure a computer system, generate a precis of a newspaper story, or diagnose an infection; the common element among such programs is the use of some inference system that can exploit extensive domain knowledge.

How can one assess the AI content of a program? One can examine its use of domain knowledge. The knowledge needed for chemical structure elucidation is varied - one requires an understanding of chemical stability, information on the extraction and work-up procedures through which the compound was isolated, and information on how to interpret spectral and chemical evidence. I have looked in Congen to find how it uses such domain knowledge (the versions I chose to examine were the 1978 BCPL implementation based on Carhart’s AMGEN algorithm and the 1980 version based on Carhart’s constructive substructure search algorithm - code that should be familiar to Feigenbaum and Buchanan). The program does indeed contain domain knowledge - there is a table of nine entries of the form “all graph nodes labelled ‘C’ have valence 4”, “all graph nodes labelled ‘O’ have valence 2”, and so forth. Apart from this use of domain knowledge, the code implements combinatorial algorithms. There are routines that compare matrices to determine which scores lowest under a given ordering criterion, and there are routines that handle the problem of choosing all unique combinations of data elements from some larger set.

Some may consider such a program, based purely on combinatorial algorithms, to be an AI program and to possess knowledge (“it knows how to label a graph”). But on that criterion, a program implementing Hoare’s Quicksort could be an AI program (“it knows how to order data elements”). The notion of what is an AI program then becomes fatuous; the AI name can be accorded on the whim of the implementor.
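The distinction drawn above, between a small table of chemical facts and general-purpose combinatorial routines, can be made concrete with a short sketch. This is an illustration in Python, not a transcription of the BCPL code: only the carbon and oxygen entries come from the text, and the function names `valences_ok` and `unique_selections`, the graph encoding, and the sample inputs are all assumptions made for the example.

```python
from itertools import combinations

# Valence table in the style described in the text. Only the carbon and
# oxygen entries are taken from the text; the remaining entries of the
# nine-entry table are not reproduced here.
VALENCE = {"C": 4, "O": 2}


def valences_ok(labels, bonds):
    """Return True if no node's total bond order exceeds its tabulated valence.

    labels: list of element symbols, one per graph node (hydrogens implicit).
    bonds:  list of (i, j, order) tuples joining node indices i and j.
    """
    used = [0] * len(labels)
    for i, j, order in bonds:
        used[i] += order
        used[j] += order
    return all(used[k] <= VALENCE[labels[k]] for k in range(len(labels)))


def unique_selections(items, k):
    """List the distinct ways of choosing k elements from a collection
    that may contain repeated elements (a purely combinatorial routine)."""
    return sorted(set(combinations(sorted(items), k)))


# A C-C-O chain with single bonds respects the table; a triple bond to O does not.
print(valences_ok(["C", "C", "O"], [(0, 1, 1), (1, 2, 1)]))  # True
print(valences_ok(["C", "O"], [(0, 1, 3)]))                  # False
print(unique_selections(["C", "C", "O"], 2))                 # [('C', 'C'), ('C', 'O')]
```

In this sketch the only chemistry lies in the `VALENCE` dictionary; `unique_selections` would work unchanged on any sortable data, which is the sense in which such routines are combinatorial rather than knowledge-based.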

USE OF KNOWLEDGE IN THE DENDRAL PROGRAMS

In addition to the core structure generator program, the various systems developed for the Dendral project possessed separate routines for spectral prediction and, to a much more limited extent, spectral interpretation. These spectral interpretation and prediction systems are empirical, and consequently have only limited value in a structure elucidation system. Spectrum prediction systems can at best help identify the most plausible candidate structures and so assist the chemist in the choice of further experiments that might discriminate among the remaining candidates. These prediction routines use one-step inference - “if the molecule possesses this substructure, it will have a C-13 resonance at . . . (or show ions that result from the cleavage of bonds A and B)”. There are no long logical arguments to be pursued; one simply matches and asserts results. Performance depends solely on the size of the rule-library; implementation issues involve finding unique representations of the substructures, and providing fast look-up into the file of rules. Expertise becomes a glorified table-lookup. This form of expertise is closer to the model espoused by Dreyfus than to the model favoured by AI researchers.

Of the various Dendral systems, only one made a systematic attempt to exploit chemical knowledge thoroughly at all stages. As illustrated in my Fig. 1, Heuristic Dendral employed a decision tree classifier in its planning stage and a simple rule-based spectrum filter in its test phase. It also possessed a heuristic filter in its generator. The paper “Heuristic Dendral: a program for generating explanatory hypotheses in organic chemistry” (B. Buchanan, G. Sutherland, and E.A. Feigenbaum, Machine Intelligence 4, Edinburgh University Press, Edinburgh, 1969, pp. 209-254) describes the heuristic extensions to Lederberg’s Dendral algorithm. These extensions implemented a “zeroth order mass spectral theory” that constrained the output of the generator. This paper states that “As each partition is generated it can be checked for plausibility before any attempt is made to generate the corresponding radicals. Each sub-composition is checked against the spectrum. If its weight is not present, the whole partition can be bypassed.” Lederberg’s Dendral algorithm can be used as an exhaustive generator of acyclic structures and, as described in the comments by Buchanan, Feigenbaum and Lederberg, does systematically enumerate (acyclic) structures; but the published description of the algorithm used in Heuristic Dendral is clear as to its use of heuristic filters based on mass spectra.
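The partition check quoted from the 1969 paper can be sketched in a few lines. This is a minimal illustration of the quoted idea only, not the Heuristic Dendral code: the representation of a sub-composition as an element-count dictionary, the use of integer nominal masses, the function names, and the peak list are all assumptions made for the example.

```python
# Nominal (integer) atomic masses for a few common elements.
NOMINAL_MASS = {"C": 12, "H": 1, "N": 14, "O": 16}


def weight(sub_composition):
    """Nominal mass of a sub-composition given as {element: count}."""
    return sum(NOMINAL_MASS[el] * n for el, n in sub_composition.items())


def plausible(partition, peaks):
    """A partition passes only if every sub-composition's weight
    appears among the observed spectrum peaks."""
    return all(weight(sub) in peaks for sub in partition)


def filter_partitions(partitions, peaks):
    """Bypass whole partitions that fail the check, before any
    structures would be generated from them."""
    return [p for p in partitions if plausible(p, peaks)]


# Hypothetical peak masses, chosen only to exercise the filter.
peaks = {15, 29, 43}
partitions = [
    [{"C": 1, "H": 3}, {"C": 2, "H": 5}],  # weights 15 and 29: kept
    [{"C": 3, "H": 7}, {"O": 1, "H": 1}],  # weights 43 and 17: bypassed
]
print(len(filter_partitions(partitions, peaks)))  # 1
```

Because the test is applied before any radicals are built, a failed partition removes every structure that would have been generated from it, so the generator's output depends on the spectrum and is no longer an exhaustive enumeration.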

THE PRESENTATION OF THE DENDRAL PROJECT

In the original scientific literature, the presentation of Dendral is correct. The Brown & Masinter papers describe the mathematical basis of graph labelling algorithms. Nourse’s papers discuss the data structures and algorithms required for the computerized representation and manipulation of molecular stereochemistry. The Carhart & Smith papers on Congen and Genoa describe the algorithms, and provide worked examples that illustrate how such programs might be used to assist in the solution of a structural problem. The various papers on spectral prediction note any inherent biases of a prediction model, and review problems associated with the employment of empirically predicted spectra as aids for focussing on a likely candidate structure. This scientific work has great merit, irrespective of any implementation of the algorithms and of the use (or neglect) of the programs. The work can be assessed by others in the appropriate peer-group of computational chemistry.

However, it is not the scientific papers that are read. Most people learn of Dendral through the AI literature. A standard text on building expert systems informs its readers that Dendral has caused a redefinition of the roles of humans and machines in chemical research; in the IEEE Expert journal, a member of the Stanford laboratory reports that Dendral (“based on a core AI concept, the use of heuristics”) is one of the most successful AI programs ever and can be found in many organic chemistry laboratories. In their textbooks, computing science undergraduates learn of Dendral as a practical AI tool in use in chemistry.

I am concerned by these presentations of the Dendral project. Professors Buchanan, Feigenbaum and Lederberg do not share my concern. It appears that they cannot perceive any grounds for concern.