Book Reviews
T. Kohonen, Associative Memory: A System-Theoretical Approach, Springer, New York, 1977, 163 pp. + indexes; $21.20.
This book is a significant step in the right direction, in a field where such steps are not easy. This field, which sits firmly between neural modeling and artificial intelligence (AI), attempts to abstract general organizational principles from the structure of the brain. Such an inquiry is in the AI direction from neural modeling because it does not model neural structure specifically, but rather the general principles underlying neural structure. Yet while not specifically neural, the field still models information processing as done by physical network structures. It is thus in the modeling direction from the purely computational approach of AI.

The material presented is not an isolated contribution. Rather, it brings together and builds upon the approaches that different disciplines have taken to associative memory. Each of these approaches (computational structures, algebraic mappings, and neurophysiology) is developed from the ground up, yet in a concise, understandable manner. The secret of the conciseness is that each discipline is developed specifically as it contributes to associative memory. The book is thus truly interdisciplinary, as distinguished from the more usual pedagogical method of books that present each discipline conventionally and leave the reader to do the real work of ferreting out each discipline’s contribution to the interdisciplinary topic.

Those tutorial sections of the book which develop background material from each discipline should be particularly useful in graduate course work. Selections from the book could fit nicely into a neural-models course, making connections to a more abstract information-processing approach. Similarly, in an AI or pattern-recognition course, the book could make connections to the way biological systems perform pattern recognition.

Two of the underlying disciplines, computational structures and algebraic mappings, are introduced in Chapter 1 and developed in Chapter 2. The first chapter introduces the crucial distinction between local and distributed memories, and pages 11-12, 17-19, 51, and 69-71 deserve special note for communicating an excellent intuitive feel for distributed processing. Altogether, four computational structures are discussed: local computer memories using conventional addressing or using content addressing, and distributed memories using holography or using physical networks. The author shows how associative memory is implemented in each case. In conventional computer addressing the implementation relies on the software technique of hash coding; in the latter three cases the implementation is in the hardware structure itself. The strengths and weaknesses of each of the four computational structures are carefully evaluated, so that one understands why the network structure may best meet the requirements of human pattern recognition.
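The hash-coding technique just mentioned lends itself to a brief sketch. The following Python fragment is this reviewer's illustration, not code from the book; the class name, the table size, and the open-addressing scheme with linear probing are all assumed details, chosen only to show how content addressing can be implemented in software on a conventional, location-addressed memory.

```python
# A minimal sketch of hash coding for content addressing in a
# conventional (location-addressed) memory. Open addressing with
# linear probing is an assumed detail of this illustration.

class HashAssociativeMemory:
    """Store (key, value) pairs in a fixed array of slots; retrieval
    is by content (the key), not by numeric address."""

    def __init__(self, n_slots=101):
        self.slots = [None] * n_slots   # each slot: (key, value) or None

    def _probe(self, key):
        # Start at the key's hashed address and scan forward,
        # wrapping around, until a matching or empty slot appears.
        start = hash(key) % len(self.slots)
        for i in range(len(self.slots)):
            yield (start + i) % len(self.slots)

    def store(self, key, value):
        for i in self._probe(key):
            if self.slots[i] is None or self.slots[i][0] == key:
                self.slots[i] = (key, value)
                return
        raise MemoryError("table full")

    def recall(self, key):
        for i in self._probe(key):
            if self.slots[i] is None:
                return None              # key was never stored
            if self.slots[i][0] == key:
                return self.slots[i][1]
        return None

mem = HashAssociativeMemory()
mem.store("stimulus", "response")
print(mem.recall("stimulus"))            # -> response
```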
All of the linear algebra used in the book is developed in one formidable section at the end of Chapter 1. A basic understanding of linear algebra is essential for understanding how things work in the rest of the book, but the intuitive reader might get away with skimming pages 41-50 and taking subsequent derivations on faith. (Equations throughout the book are usually surrounded by enough explanatory text to allow the reader to understand their general form and meaning without picking them apart in detail.)

The last half of Chapter 2 develops the algebraic mappings which describe the type of information processing done by physical network structures. One problem to be solved in the design of a distributed network memory is the requirement that each learned input lead to its appropriate output with minimum crosstalk to other outputs: the paired-associate problem. Another problem (and a big advantage for distributed systems when it is solved) is the equivalence-classing of any input with previously known inputs, with error-correcting properties that are preserved in the output space. The author defines an optimal linear associative mapping as one (realizable in a physical network) in which the solution to each of these problems is either exact or optimal in the sense of least squares. Section 2.3.8 is an excellent presentation of the relationship between this type of mapping and the statistical techniques of linear regression and linear estimation.

Near the end of Chapter 2, conditions are presented under which the optimal linear associative mapping is simply the observed correlation matrix between the input and output components. This is the first clue that one might obtain the coefficients of the optimal network without having to solve its defining equations explicitly. The alternative idea, that each element of the network might automatically adapt to new observations in such a way that optimality is preserved, is demonstrated in Chapter 3. The author begins with some easily understood special cases, then derives the adaptive result in general. He also demonstrates a very fast adaptive process for novelty filter networks, a special case (introduced in Chapter 2) which may have practical applications. In the first two sections of Chapter 3, the text is quite helpful in translating mathematical results into a real understanding of how the adaptive network works. The final section is not concerned with understanding a particular adaptive system, but instead derives a theoretical upper limit for the speed of adaptation to arbitrary new observations.
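The correlation-matrix result can be made concrete with a short numerical sketch. The Python/NumPy fragment below is this reviewer's construction under assumed dimensions, not an example from the book: it shows exact recall of paired associates when the key vectors are orthonormal, the error-correcting behavior for a noisy key, and the pseudoinverse form of the least-squares-optimal mapping for general keys.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 32, 8, 5            # input dim., output dim., stored pairs

# Orthonormal key vectors as columns (via QR), with arbitrary outputs.
X = np.linalg.qr(rng.standard_normal((n, k)))[0]   # n x k, X^T X = I
Y = rng.standard_normal((m, k))                    # m x k

# Correlation-matrix memory: M = sum_k y_k x_k^T = Y X^T.
M = Y @ X.T

# Paired-associate recall is exact for orthonormal keys: M x_k = y_k.
print(np.allclose(M @ X, Y))                       # True

# Error correction: a noisy key still evokes nearly the right output,
# since noise components outside the stored key subspace are ignored.
noisy_key = X[:, 0] + 0.1 * rng.standard_normal(n)
print(np.linalg.norm(M @ noisy_key - Y[:, 0]))     # small residual

# For general (non-orthonormal) keys, the optimal linear associative
# mapping in the least-squares sense uses the pseudoinverse: M = Y X^+.
X_gen = rng.standard_normal((n, k))
M_opt = Y @ np.linalg.pinv(X_gen)
print(np.allclose(M_opt @ X_gen, Y))               # True for independent keys
```

With fewer stored pairs than input dimensions and linearly independent keys, the pseudoinverse solution reproduces every stored pair exactly; crosstalk appears only as the keys approach linear dependence.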
In the fourth and final chapter, we see how the principles developed in the first three chapters may be implemented in the structure of the brain. At this point, some formal approaches turn into stretched analogies that don’t really explain how things work; but the formal network model seems to explain distributed neural processing quite naturally. Of course, the author is careful to point out that his model is only a hypothesis depending upon some unverified assumptions. Nevertheless, in this field it is an important step to show a mechanism which can at least account for the known capabilities and limitations of human information processing. The author has carefully compared his mechanism against others (molecular coding, mental holography) in these terms. He presents a short but convincing argument that the network mechanism best accounts for the known properties of human pattern recognition. (Readers desiring a more complete analysis of the psychological properties explained by network structures might consult Anderson’s¹ analysis. Anderson makes significant connections from a similar algebraic network model to the psychological literature, showing how the theory can account for specific psychological findings and predict new experimental results.)

The chapter contains tutorial sections which clearly explain the essentials of neural anatomy and physiology. In a conventional approach to this material, the cortical-damage studies by Lashley and others² and the implied concept of distributed memory usually seem a bit mysterious. Kohonen, on the other hand, has only to translate the biological components into the general network structure analyzed in earlier chapters. If the reader has followed those chapters, the operation of distributed memory in such a structure is not so mysterious after all.

BRUCE WHITEHEAD
Systems Science Institute
University of Louisville
Louisville, Kentucky
¹J. A. Anderson, Neural models with cognitive implications, in D. LaBerge and S. J. Samuels, Eds., Basic Processes in Reading: Perception and Comprehension, Erlbaum Associates, Potomac, Md., 1977.

²Reviewed in R. F. Thompson, Introduction to Physiological Psychology, Harper & Row, 1975, Chapter 11.