Reviews

Artificial Intelligence and Statistics. Edited by William A. Gale. Reading, MA: Addison-Wesley, 1986. $39.95, 418 pages.

Of the several dozen books on expert systems on my bookshelves, this is one that I would stop to rescue during an emergency building evacuation. Reading it altered many of my initial perceptions about the possibilities of expert systems applications in planning. It is, as the title implies, a book about artificial intelligence and statistics. It is also a richer work than the title indicates, because it addresses two themes: (1) how statistics has influenced, and could further influence, developments in artificial intelligence; and (2) how artificial intelligence applications (particularly expert systems) have influenced, and could further influence, statistical analysis. The book is structured around these two themes. It provides a clear example of two professions, developing relatively independently, with many common concerns; a sharing of information may benefit both disciplines.

This book is not designed for a reader with little knowledge of expert systems. Readers unfamiliar with the large amount of jargon associated with AI should first read one of the introductory texts in that area. I highly recommend this book for anyone with an interest in applied statistical analysis. I also recommend it to the community of expert systems developers, who may find some very different ways of thinking expressed about some familiar problems.

Artificial Intelligence and Statistics is a collection of 17 papers and an introduction. The papers were prepared for a workshop held in April 1985, convened by editor William A. Gale. Gale provides a thoughtful review and summary in the first chapter. The next six chapters carry through the first theme of the book, statistics in AI, and the last 11 chapters carry through the second theme, AI in statistics.

STATISTICS IN AI

The first six chapters in the book look at ways that statistics may help AI. They fall into two themes: (1) ways that uncertainty may be better incorporated into expert systems; and (2) issues of learning in expert systems. Statisticians spend a considerable amount of their time dealing with uncertainty, and there is widespread consensus in the profession that probabilistic methods are best able to handle it. Expert systems developers spend a considerable amount of time designing systems that make decisions, often under uncertainty. In two chapters (Spiegelhalter; Fox), the authors look at how uncertainty has been handled in expert systems and suggest some ways that statistical concepts and research might be better used in these AI applications. Spiegelhalter reviews the procedures currently used in expert systems to handle uncertainty: (1) the theory of “endorsements”; (2) fuzzy reasoning; and (3) belief functions. He notes that many statisticians have considered the current treatment of uncertainty in expert systems primitive at best; A. F. M. Smith is quoted as calling the selected procedures “currently fashionable ad-hoc quantitative mumbo-jumbo.” The statisticians, not surprisingly, come down strongly on the side of more probabilistic approaches. Countering this, the AI community questions the appropriateness, necessity, and practicality of probability approaches in actual applications. Spiegelhalter carefully argues that the probability approach does indeed provide a suitable representation of uncertainty and, in fact, that this framework “seems the only means of answering these problems.”

In the next chapter, Fox goes a step further with this argument by proposing a knowledge-based scheme for reasoning about uncertainty. Fox argues that his proposed scheme is consistent with other work in expert systems, but that it extends the traditional framework. He argues for a system that relies on a formal logical system rather than forcing the problem into a formal numerical solution. He develops a new vocabulary that allows a range of words (like “possibility”) to be used, and argues that the movement toward quantification of uncertainty should perhaps be countered by a return to procedures structured around logic and opinion.

In the next four chapters (Fisher and Langley; Hora; Salzberg; Phelps and Musgrove), the authors focus on what expert systems developers call “learning” issues and what statisticians generally call “clustering” problems. The most interesting point in these chapters is that the two groups of researchers have very similar problems, called by very different names. As the reader goes through these chapters, it becomes increasingly clear that the two fields really do have common interests. The most striking examples are those related to the interpretation of graphics versus the interpretation of numerical results from statistical tests: humans are very good at seeing patterns in graphic displays of data (e.g., clusters of points are identifiable, the need for a transformation is clear from a scatterplot); machines, to date, have had much more difficulty with this task. Fisher and Langley argue that many of the AI methods for machine learning are closely related to exploratory data analysis. They suggest that the main difference is that the AI problems generally deal with categorical (i.e., symbolic) data instead of numerical data. They conclude by suggesting several approaches to “conceptual clustering” systems which are extensions of those of numerical taxonomy.
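As a concrete illustration of what clustering over categorical (symbolic) rather than numerical data involves, here is a minimal k-modes-style assignment pass; the mismatch distance and the toy records are my own assumptions for illustration, not a method given by Fisher and Langley.

```python
# Minimal sketch: one assignment pass of k-modes-style clustering over
# symbolic records. The mismatch distance and the toy data are
# illustrative assumptions, not a method from the book.

def mismatch(a, b):
    """Count the attributes on which two symbolic records disagree."""
    return sum(1 for x, y in zip(a, b) if x != y)

def assign_to_modes(records, modes):
    """Assign each record to its nearest mode by mismatch distance."""
    clusters = {i: [] for i in range(len(modes))}
    for rec in records:
        nearest = min(range(len(modes)),
                      key=lambda i: mismatch(rec, modes[i]))
        clusters[nearest].append(rec)
    return clusters

records = [("red", "round", "small"),
           ("red", "round", "large"),
           ("blue", "square", "small")]
modes = [("red", "round", "small"), ("blue", "square", "large")]
print(assign_to_modes(records, modes))
# -> {0: [('red', 'round', 'small'), ('red', 'round', 'large')],
#     1: [('blue', 'square', 'small')]}
```

The point of the sketch is only that the distance is a count of disagreements over symbols rather than a Euclidean distance over numbers, which is the gap Fisher and Langley see between machine learning and numerical taxonomy.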

Hora examines how statistical procedures and careful experimental design may be useful to AI researchers in evaluating the performance of learning machines. Salzberg describes the development of an expert system, called HANDICAPPER, which “learns” through nine heuristics; the specific problem is predicting the winner of a horse race. An example of the nine heuristics is “unusualness.” Records from various races are used to examine features of winning horses. As more and more race results are entered into HANDICAPPER, the system keeps track of the predictions it makes that prove to be incorrect. The “unusualness” heuristic says that if a prediction fails, the feature of the predicted winner that is most unusual (occurs least frequently) is posited as the reason for the failed prediction. (The system, by the way, performs well, consistently better than real experts at the track!)
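To make the heuristic concrete, here is a minimal sketch of a blame-assignment rule of this kind, assuming a simple attribute-value encoding of each horse; the class name, feature names, and toy records are my own illustrative assumptions, since the review does not describe HANDICAPPER at this level of detail.

```python
# Hypothetical sketch of an "unusualness" blame-assignment heuristic.
# The feature encoding, class name, and toy records are illustrative
# assumptions; the review does not specify HANDICAPPER at this level.
from collections import Counter

class UnusualnessHeuristic:
    def __init__(self):
        # How often each (feature, value) pair has appeared in past records.
        self.feature_counts = Counter()

    def observe(self, horse):
        """Record the features of a horse from a completed race."""
        for feature_value in horse.items():
            self.feature_counts[feature_value] += 1

    def blame(self, predicted_winner):
        """After a failed prediction, posit the least frequently seen
        (most unusual) feature of the predicted winner as the reason."""
        return min(predicted_winner.items(),
                   key=lambda fv: self.feature_counts[fv])

h = UnusualnessHeuristic()
h.observe({"post_position": 3, "track": "muddy"})
h.observe({"post_position": 3, "track": "fast"})
# The predicted winner lost; which of its features was most unusual?
print(h.blame({"post_position": 5, "track": "fast"}))
# -> ('post_position', 5): never seen before, so it takes the blame
```

The design point the review emphasizes is that learning is driven by failed predictions: frequency counts accumulate passively, and only a wrong guess triggers the search for the most unusual feature.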

Phelps and Musgrove compare statistical methods of clustering with the human ability to see clustering patterns graphically. They conclude that statistics should adopt more “non-standard” procedures, recognizing explicitly the human ability to detect patterns.

AI IN STATISTICS

In the second portion of the book the authors examine how AI techniques, particularly expert systems, have been applied in statistics. The central theme in many of these chapters is the need for statisticians to be more explicit about the “statistical strategy” they use in dealing with applied problems that have a variety of possible solutions. Developing computer-based expert systems is an excellent way of forcing what has been an implicit analysis strategy into a more explicit framework.
The chapters in the latter part of the book describe: (1) prototype expert systems applied to data analysis problems; (2) broader parallels between the knowledge acquisition procedures used in expert system development and the procedures an applied statistician uses in solving a client’s data analysis problems; and (3) the acquisition of explicit “statistical strategies” for data analysis.

In three of the chapters (Gale; Ellman; Gale), the authors discuss expert systems which have been and are being developed at Bell Labs. One of the systems, REX, is an operational prototype expert system for improving the ability of novices to do regression analysis. The system provides a “front-end” to the statistical package S. The front-end gives the user recommendations on transformations of the data that may be required: it distinguishes relationships in the data that are “OK” (no transformation needed), “Mild” (a transformation may be needed), or “Critical” (a severe violation of linearity). REX aids the user through multiple windows: one displays graphics that may be helpful, another monitors the session and offers advice, and another shows the interface to S. The system was developed under UNIX, making the multiple-window environment central to its design. Further work is being done to improve the ability of REX to provide “why” explanations to the user; Gale suggests that a weakness of many expert systems today is the quality and readability of such explanations.
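As a rough illustration of this kind of three-way advisory, here is a minimal sketch that rates a bivariate relationship as “OK,” “Mild,” or “Critical” by how much a quadratic fit improves on a straight line; the diagnostic and the cutoffs are my own assumptions, not REX’s actual rules.

```python
# Hypothetical sketch of a three-way transformation advisory in the
# spirit of REX's "OK" / "Mild" / "Critical" ratings. The diagnostic
# (gain in R^2 from adding a quadratic term) and the cutoffs are
# illustrative assumptions; REX's actual rules are not given in the review.
import numpy as np

def r_squared(y, fitted):
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

def rate_linearity(x, y, mild=0.02, critical=0.10):
    """Rate a bivariate relationship by how much a quadratic fit
    improves on a straight-line fit."""
    lin = np.polyval(np.polyfit(x, y, 1), x)
    quad = np.polyval(np.polyfit(x, y, 2), x)
    gain = r_squared(y, quad) - r_squared(y, lin)
    if gain < mild:
        return "OK"        # no transformation suggested
    if gain < critical:
        return "Mild"      # a transformation may help
    return "Critical"      # severe violation of linearity

x = np.linspace(1.0, 10.0, 50)
print(rate_linearity(x, 2.0 * x + 1.0))    # straight line -> "OK"
print(rate_linearity(x, (x - 5.5) ** 2))   # U-shaped -> "Critical"
```

The three labels are the only part taken from the review; everything else is one plausible way a front-end could turn a regression diagnostic into advice for a novice.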

A second system, Student, is being designed to improve on REX in the areas of explanation and generality. Student is intended as a system to be used and developed by a statistician without the need to program; the statistician’s role is to provide a statistical “strategy” for the solution of a particular data analysis problem.

In a later chapter, Hand examines the possibilities for “higher-level” expert systems for more complicated analysis problems, using multivariate analysis of variance and discriminant analysis as examples. Transcripts of sessions with an applied statistician working with clients serve as the basis for discussing such a system. Thisted systematically develops the areas of overlap between the practice of data analysts and knowledge engineers. He concludes that “data analysts are already in part knowledge engineers; what they do on a daily basis is to elicit and to apply private expertise from experts in a ground domain, using a collection of techniques, strategies, heuristics, and tools for doing so.” Huber provides a more critical note, suggesting that there really may not be such a thing as an “expert” in exploratory data analysis. He suggests that exploratory data analysis is improvisation, and that the statistician may perhaps be best aided by a suitable programming environment which supports exploration.

In two chapters (Butler and Corter; Brooking), the authors focus on tools from statistics (including psychometrics) that may prove useful for knowledge acquisition in expert systems. Oldford and Peters further develop the theme of the acquisition of explicit statistical strategies. They conclude: “The idea, then, is to extract and record in software those ‘strategies’ that practicing statisticians employ in their analyses. If this can be done at all successfully, then the recorded strategies may provide input for the development of theory closer to actual practice. Minimally, the process should clarify the strategies themselves.” In the following chapter, Pregibon provides further guidance for statisticians who are attempting to develop explicit statistical strategies.

The book ends with a chapter by John Tukey. It is, in typical Tukey fashion, dense with ideas about possible futures.
One of his thoughts on the value of developing computer-aided analysis systems: “It will indeed soon be cheaper to do a million arithmetic operations than to pay for one second of an investigator’s or statistician’s time. So if a million operations will save a second’s worry, we should run them.” He suggests that such systems will not come up with single recommended actions, but instead will support our ability to provide multiple answers and, for each of them, a variety of reasons for what has been seen.

LYNA L. WIGGINS

Massachusetts Institute of Technology