BioSystems, 13 (1981) 203--209 © Elsevier/North-Holland Scientific Publishers Ltd.
AN ARTIFICIAL COGNITIVE MAP SYSTEM
O.E. RÖSSLER
Institute for Physical and Theoretical Chemistry, University of Tübingen, F.R.G.

(Received May 20th, 1980)

A blueprint for a geometric information processor is described. The system essentially combines a digital scan converter with a digital flight simulator. The latter's 'local' (Poincaréan) rather than standing (Helmholtzian) display may have advantages in 3-dimensional diagnostic imaging. At the same time, the system provides a technologically realizable abstract model in terms of which to express (and perhaps eventually explain) the experimental results of O'Keefe and Nadel on the functioning of the hippocampus in the mammalian brain.
1. Introduction

Helmholtz (1866) pondered the apparent stability of the visual world under voluntary eye movements; in 1878, he remarked on the fact that in dreams, the perceived landscape may change in a realistic fashion during locomotion. Poincaré (1905) implicitly stated that 'expected' changes of the visual scene during actual locomotion and 'imagined' changes of the visual scene during merely internally represented locomotion (as in geometric thinking) might be based on the same group-theoretic mechanism. Later, Tolman (1948) presented observational evidence of what he called a 'cognitive map' in the behavior of rats running in a maze. More recently, O'Keefe and Nadel (1978) put forward the specific hypothesis that in rats and other higher animals like primates, the hippocampus is an organ that has something to do with the formation, or at least the subsequent storing/retrieving, of an objective, map-like representation of the environment.

In biology, such ideas are exceedingly difficult to test and prove directly. The formation of artificial models therefore constitutes a necessary step in the analysis. Fortunately, in the age of computers, the setting up of model systems which are compatible with the Helmholtz-Poincaré-Tolman hypothesis is not too difficult.
In the following, one possible such model is described. Some suggestions on how to make it more realistic are also provided.
2. A first, Helmholtzian ingredient

The first ingredient to be used is a digital scan converter (see, for example, Ophir and Madlak, 1979). This is an electronic device which allows one to scan a certain three-dimensional object (for example, a part of the human body) in a random fashion with a hand-held ultrasonic transducer possessing a narrow sensory angle. The digital image processor behind it then produces, in spite of the random movements of the hand, a standing picture on a T.V. monitor. Usually, the motions of the hand are confined to a plane, so that the picture represents a standing two-dimensional cross-section. A generalization to three dimensions is possible.

The device is based on the principle that incoming information is stored according to the -- simultaneously incoming -- information about the momentary position and angle of the transducer. The spatial information thus is being stored according to spatial coordinates. In principle, the temporal coordinate is also available. Both an 'averaged' image (namely, averaged over several sweeping movements of the hand) and a temporally parameterized
image can be put on screen. That is, both spatial and temporal 'zooms' are possible (Ophir and Madlak, 1979). Recently, an attempt at building a very fast, real time, three-dimensional system of this kind has been reported to be underway (Wood et al., 1979).

In the present, abstract context, it suffices to point out that the very straightforward way in which such a technological system is realized -- storage of image information in a pixel-space (that is, in a memory array in which neighboring addresses correspond to neighboring positions in space) -- yields one possible implementation of Helmholtz's (1866) principle. Of course, it is not necessary to represent the sequentially acquired spatial information in a spatial manner also 'hardwarewise'. Only its retrievability (as if it had been stored in this way) is of interest in the present context.

3. A second, Poincaréan ingredient

The second ingredient to choose is a digital flight simulator. Flight simulators are being used in the training of pilots. Their modern digital versions usually cost more than the real airplanes they simulate; however, a simple implementation on a home computer is also available (Sublogic, Inc., Champaign, IL). This time, one already has a certain spatial environment represented in a computer memory. The pilot can 'fly' any course he wishes, and the pictures presented to him through the cabin's windows, via T.V. monitor, must be realistic. Evans and Sutherland were among the first to provide digital systems with this capability (cited after Nelson, 1975). The internal organization is just the converse of that of a digital scan converter. Therefore, the same transformation matrices can be used, for example (see Newman and Sproull, 1979).
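The remark that one set of transformation matrices can serve both devices may be illustrated by a minimal sketch in Python. It assumes a two-dimensional world, a dictionary as the pixel-space memory, and a beam running along the transducer's own x-axis; the names `pose_matrix`, `invert_pose`, `store_beam` and `local_view` are hypothetical and are not taken from any of the cited systems.

```python
import math

def pose_matrix(x, y, theta):
    """Homogeneous 3x3 matrix mapping transducer-frame points to world-frame points."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, x],
            [s,  c, y],
            [0.0, 0.0, 1.0]]

def invert_pose(m):
    """Inverse of a rigid 2-D transform: world-frame points back into the sensor frame."""
    c, s, x, y = m[0][0], m[1][0], m[0][2], m[1][2]
    return [[c,  s, -(c * x + s * y)],
            [-s, c,  (s * x - c * y)],
            [0.0, 0.0, 1.0]]

def apply(m, px, py):
    """Apply a homogeneous transform to the point (px, py)."""
    return (m[0][0] * px + m[0][1] * py + m[0][2],
            m[1][0] * px + m[1][1] * py + m[1][2])

def store_beam(memory, pose, echoes, step=1.0):
    """Scan-converter direction: file each echo sample under its world coordinates."""
    for i, value in enumerate(echoes):
        wx, wy = apply(pose, i * step, 0.0)
        memory[(round(wx), round(wy))] = value

def local_view(memory, pose, max_range):
    """Flight-simulator direction: express stored cells in the viewer's own frame."""
    inverse = invert_pose(pose)
    view = {}
    for (wx, wy), value in memory.items():
        vx, vy = apply(inverse, wx, wy)
        rx, ry = round(vx), round(vy)
        if 0 <= rx <= max_range:          # keep only what lies ahead of the viewer
            view[(rx, ry)] = value
    return view
```

Storing uses the pose matrix itself, retrieval uses its inverse; in this sketch that is the whole of the 'converse' organization. A simulated viewer placed at a pose that the transducer never occupied obtains its view from the same memory without any change to the stored data.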
4. 'Digital scan converter + digital flight simulator = cognitive map system'

Combining the two technological principles is straightforward. The same control knob can be used both for the control of the actively mobile transducer, collecting data into the digital memory such that they become addressable according to space and time, and for the retrieval of data from this memory under a merely simulated motion of the same transducer.

At first sight, this seems to be an odd combination. Why should one combine simulational retrieval with active gathering? There is at least one practical reason, however. Since the gathering is always only local and 'narrow-beamed', it does make sense to complement this narrow actual picture with all there is in the store, toward a full local view within which the presently looked at details are embedded. This makes the local scrutiny more effective.

The combined mode of action just described is of course only one possibility. Doing the two things in succession -- collecting first, scrutinizing the obtained image later -- is also possible. However, if one has such a device available, there is actually no reason not to switch on the simulational mode simultaneously with the gathering mode. In this way, Poincaré's (1905) implicit idea to use the same machinery both for the generation of 'expected' environmental changes during actual motion (which may then be compared with the actual changes that are being perceived) and for the generation of merely 'imagined' changes under an internally simulated motion, makes technological sense also. In technology, the simulational machinery has interestingly been provided first. The idea to use it also in the real situation came second.

The proposed name 'cognitive map system' stresses the central role played by the objective (map-like) central memory in the hybrid system.

5. A blueprint for the combined system
A functional block diagram showing the mode of action of the combined system is provided in Fig. 1. The organization is easy to understand. There is a central active memory (AM) into which the environmental information is being fed along with the positional and temporal data of the transducer. AM and DSC (the loading-in software/hardware that is also used in digital scan converters) together comprise the 'first component' of the system. The pair AM and DFS together realize the 'second component'. DFS here comprises the same software (and, perhaps, hardware) packages that are being used in digital flight simulators. The monitoring screen S is being fed both types of information: the one retrieved from AM via DFS, and that coming in directly. The control knob C controls both DSC and DFS simultaneously. There are two appendages to the control knob. The switch s cuts off the motors so that the transducer becomes immobile and inoperative. If that is the case, the system acts as a pure flight simulator, allowing for both continuous and saccadic motions. The other knob, c, can be used only in this case. It allows for 'temporal flights' within the stored data. In combination with C, combined 'rehearsal' and 'pure fancy' flights are possible.

Fig. 1. Blueprint of a combined collector and retriever of environmental information under arbitrary motion ('Helmholtz-Poincaré Simulator'). Double arrows = input and output to transducer; S = screen with realistic perspective display; C = hand-held control knob for motion in space (main control) and in time (c); s = motor switch (controlled by C); AM = active memory device; DSC = digital scan converter (using AM); DFS = digital flight simulator (using AM).

Unfortunately, the implementation of such a system will be a bit costly if really all the comfort of a three-dimensional scanning and flying unit is desired. For example, one then needs a 360° panorama screen, plus two matching screens in floor and ceiling. And the display must, of course, be stereoscopic. (There are several ways to realize this in a more or less satisfying manner.) On the other hand, simpler versions functioning in simplified environments (with the objects to be scrutinized being outline-drawings, for example) will be much easier to implement and test.
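The data flow of Fig. 1 can be sketched in a few lines of Python, under strong simplifying assumptions: the environment is one-dimensional, a dictionary stands in for the real world, and 'retrieval' simply returns the most recent entries near the current position. The class name `CombinedSystem` and all of its attributes are hypothetical; only the roles AM, DSC, DFS, C, c and s come from the text.

```python
class CombinedSystem:
    """Toy version of Fig. 1: one memory (AM), loaded by DSC and read by DFS."""

    def __init__(self, environment):
        self.environment = environment  # assumed stand-in for the world: position -> feature
        self.am = {}                    # AM: position -> list of (time, value), oldest first
        self.clock = 0                  # advances only while data are actually gathered
        self.motors_on = True           # switch s: False makes the transducer inoperative

    def dsc_store(self, position, value):
        """DSC: file an incoming sample under its spatial and temporal coordinates."""
        self.am.setdefault(position, []).append((self.clock, value))

    def dfs_retrieve(self, position, radius, at_time=None):
        """DFS: local view around `position`; `at_time` plays the role of knob c."""
        view = {}
        for p in range(position - radius, position + radius + 1):
            records = [v for (t, v) in self.am.get(p, [])
                       if at_time is None or t <= at_time]
            if records:
                view[p] = records[-1]   # latest admissible record at this place
        return view

    def move(self, position, radius=2, at_time=None):
        """Turn control knob C; returns what the screen S would be fed."""
        direct = None
        if self.motors_on:
            self.clock += 1
            direct = self.environment.get(position)
            if direct is not None:
                self.dsc_store(position, direct)
            at_time = None              # knob c is usable only with the motors cut off
        stored = self.dfs_retrieve(position, radius, at_time)
        return {"direct": direct, "stored": stored}
```

With the motors on, each call to `move` both gathers and retrieves, so the narrow direct sample arrives embedded in the stored local view. With `motors_on` set to `False`, the same call acts as a pure simulator, and `at_time` permits a 'temporal flight' through earlier states of the memory.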
6. Some proposed generalizations

The system proposed so far differs not much from an ordinary digital scan converter. The main difference is that the display is not of the invariant (Helmholtzian) type, but rather of the local, both polar and panoramic (Poincaréan) type. Whether or not the opportunity to make such 'realistic zooms' too will make much of a difference in applied body scanning is hard to tell. Local scrutiny can then be applied both under the simulation and under the subsequent real scanning. So the maximum scrutiny can be applied (and the maximum resolution gained) where it is needed the most.

However, the proposed device is of a broader interest too. It can also be used as a 'thinking aid' as far as spatial information processing is concerned. It is this more 'cognitive' function which calls for some minor modifications. They are all incorporated in Fig. 2.

Fig. 2. Continuous mode Helmholtz-Poincaré Simulator. 1--5 = modifications with respect to Fig. 1; PM = a library of passive memory devices, coupled to AM via a 'turntable device' (T); OB = overlap buffer; AO = autonomous optimizer (replacing both S and C). See text for further explanations.

The first modification is a passive memory (PM) connected to AM. It allows the content of AM to be swapped into a passive storage device as soon as AM is 'full'. In this way the system can be used in many environments successively. Also, a prolonged use in the same environment, with scrutiny of different portions in succession, becomes possible. The second modification has a related aim:
to allow for continual operation. This is, in the simplest case, achieved by rendering AM a buffer, that is, a first-in-first-out memory (FIFO). If the buffer spills over continually into PM, the discontinuous operation no longer involves AM itself, but only the changing of the 'buckets'.

On the other hand, everything older than the lifetime of information in AM will still be cut off suddenly from immediate accessibility. Hence modification 3. Line 3 is a 'short-circuit bus' between screen and input. It makes sure that nothing which was on screen recently (due to the action of DFS) will be lost from AM and hence from immediate access. This option corresponds to a 'reverberation mode' of the whole system.

The fourth modification is the least trivial. It consists in the introduction of an 'overlap buffer' (OB) that can be switched on and off. It automatically reproduces the respective last sequence of controlling motions applied via C. At the same time, it plays back the corresponding outputs of DFS. This makes it possible to carry out automatically a formerly simulated
sequence of actions. This automatic execution has the asset that it permits, while going on, the simulation of something else (for example, the next segment of motions). In this way, two different sequences of anticipated views of the environment can be present on screen simultaneously.

The last-mentioned option can, of course, also be used to 'look ahead' a little during the automatic execution of a sequence of movements which had appeared optimal during the preparatory simulation. This corresponds to a 'double checking' mode in the simulation. If one so wishes, one may call this option a recursive option, because the number of levels which can be superimposed on screen using this option is in principle unlimited.

The final option (number 5) consists in replacing both the screen S and the control knob C by an autonomous optimizer (AO). Here, for example, the simple 'autonomous direction optimizer' described in Rössler (1974) may be used.
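The first three modifications (passive memory, FIFO operation of AM, and the reverberation line) can be illustrated together by a minimal Python sketch. It assumes that AM holds a fixed number of entries and that PM is an unbounded list; the class name `ActiveMemory` and its methods are hypothetical and serve illustration only.

```python
from collections import OrderedDict

class ActiveMemory:
    """AM as a first-in-first-out buffer that spills its oldest entries into PM."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cells = OrderedDict()   # AM contents, oldest entry first
        self.passive = []            # PM: entries that dropped out of AM, in order

    def store(self, key, value):
        """Input side: enter (or re-enter) an item as the newest one."""
        if key in self.cells:
            del self.cells[key]      # a re-entered item moves to the young end
        self.cells[key] = value
        while len(self.cells) > self.capacity:
            # Modifications 1 and 2: the oldest entry spills over into PM.
            self.passive.append(self.cells.popitem(last=False))

    def display(self, key):
        """Screen side: show an item and feed it back into the input (line 3)."""
        value = self.cells.get(key)
        if value is not None:
            self.store(key, value)   # reverberation: whatever was shown counts as fresh
        return value
```

In this sketch, reverberation turns the plain FIFO into what is in effect a least-recently-used policy: an item that keeps appearing on screen never reaches the old end of the buffer, while unattended items drift into PM. Whether the feedback line of Fig. 2 was meant to behave exactly like this is an assumption of the sketch.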
7. Discussion

An artificial system has been described which makes exclusive use of hardware and software elements that are currently available and in use. The way they were put together was ultimately suggested by Poincaré's hypothesis of how spatial imagination functions. This hypothesis was more recently reinforced by Tolman (1948) and O'Keefe and Nadel (1978) and -- in a similar vein -- Olton (1977) in biological experiments on the spatial orientation of mammals.

The blueprints of Fig. 1 and, with more detail, Fig. 2 can be read in two ways. Firstly, they can be interpreted as a technological blueprint ready for implementation. A possible use -- diagnostic imaging -- has been suggested. Secondly, they can be looked at as a blueprint for an abstract cognitive system. Viewed as such, the present approach falls within the realm of artificial intelligence.
There is a difference of orientation, however. Current artificial intelligence research is not too closely interested in providing biological models. Minsky's (1975) approach to the problem of imagining -- one of several notable exceptions -- provides a kind of countermodel to the present approach. It deemphasizes, so to speak, the importance of a continuous spatial representation, in favor of a discrete sequence of 'frames' (that is, fixed spatial scenes and subscenes). This feature is, on the other hand, incorporated implicitly in the present system also. PM in Fig. 2 can be looked at as a library of frames, loaded by and loading into AM, which itself contains a standing (Helmholtzian) 'frame' (or superframe). Conversely, the discrete theory of frames can, perhaps, be complemented by a 'small scale' continuous simulator. In that case, a model much like that of Figs. 1 and 2 might be reobtained.

A model of Albus (1979), related to one of Marr (1969), is also compatible with the present approach. This model is hierarchical, stressing the importance of motor learning at several levels. All the lower-level units of Albus' model can be incorporated into the present model (in place of some of its 'prewired' connections). Conversely, the present model is appropriate to be incorporated into Albus' model on the highest hierarchical level.

Valach (1980) recently described a number of principles that have to be observed in the design of a useful artificial eye-substitute for the blind. The main constant is fast ('involuntary') controllability -- such that a rigid anticipation structure analogous to that implicit in Fig. 1 can develop in the long run.

The single main reason for presenting the above model has, however, been O'Keefe and Nadel's (1978) book. It provides a fresh, learned look at the function and structure of a certain part of the brain. Several of the functional observations these authors describe could as well have been obtained on the system of Figs.
1 and 2 (for example, by inserting electrodes into the region around T in a highly parallel implementation of Fig. 2):
Again, 'place neurons', representing the environment in an invariant manner and being activated either by 'actually moving' toward that place or by a merely 'programmed' motion with blocked execution (to use O'Keefe and Nadel's terms), would be found. On the other hand, the present blueprints seem to be about as complicated as any artificial system reproducing O'Keefe and Nadel's observations must be.

Of course, many more problems are being opened up than solved with the system of Fig. 2, say. A major design question is how to realize an optimum retrieval of information out of PM into AM. Another major problem, not touched upon in the above discussion, is that of tolerance (Poincaré, 1905; Zeeman, 1965). In order to keep AM small, most of the information contained in it should be represented at high tolerance (low spatial resolution), thereby requiring little memory space. Reverberation (line 3 in Fig. 2) then leads to the mathematical problem of cyclic information reproduction under a condition of varying tolerance. This problem defines a new class of generalized near-to-the-identity diffeomorphisms with interesting asymptotic properties (work in preparation).

One special aspect of the system of Fig. 2 has to be mentioned finally: working with it will be fun. For example, one can 'load in' an arbitrary environment within which it is then possible to move about freely. (This makes the system qualify as a new kind of T.V. game.) If it is true that much creative and playful activity consists of thinking in pictures, then the best test of whether the present model system comes close to the way in which these cognitive processes are organized in the nervous system will be whether people will like to work with it and improve it.

Added in Proof.
(1) There is a strong selection pressure on mobile organisms with nonpanoramic sensors, to develop an internal transformation mechanism which brings earlier-spotted environmental features into an updated relationship to the animal's actual position (M. Conrad, personal communication). If such a machinery is then used without (much) overt locomotion, the result is 'vicarious trial and error behavior' (see Tolman, 1948).
(2) MacKay and Mittelstaedt (1974) review physiological control theories in the spirit of Helmholtz.
(3) Kosslyn and Shwartz (1977) give a computer model of long-term-memory-based visual imagery.
(4) Olton et al. (1979) stress temporal over spatial simulation in the hippocampus.
Acknowledgement

I thank Tommy Poggio, Art Winfree, and H. Emde for discussions.
References

Albus, J., 1979, A model of the brain for robot control, Part 4, Mechanisms of choice. Byte 4 (9), 130--148.
Helmholtz, H., 1866, Handbuch der Physiologischen Optik, Transl. in: Helmholtz's Treatise on Physiological Optics, J.P.C. Southall (ed.), Vol. 3, Sections 26, 27 and 29 (New York: Optical Society of America 1926). (Also: Dover edition in 2 vols., 1962).
Helmholtz, H., 1878, The facts in perception, in: Helmholtz on Perception, R.M. Warren and R.P. Warren (eds.) (Wiley, New York, 1968) pp. 205--246, p. 224.
Kosslyn, S.M. and S.P. Shwartz, 1977, A simulation of visual imagery. Cognit. Sci. 1, 265--295.
MacKay, D.M. and H. Mittelstaedt, 1974, Visual stability and motor control (reafference revisited), in: Cybernetics and Bionics, W.D. Keidel, W. Händler and Spreng (eds.) (Oldenbourg, Munich-Vienna) pp. 71--80.
Marr, D., 1969, A theory of cerebellar cortex. J. Physiol. 202, 437--470.
Minsky, M., 1975, A framework for representing knowledge, in: The Psychology of Computer Vision, P.H. Winston (ed.) (McGraw-Hill, New York) pp. 211--280.
Muenzinger, K.F., 1938, Vicarious trial and error at a point of choice, I: A general survey of its relation to learning efficiency. J. Genet. Psychol. 53, 75--86.
Nelson, T., 1975, Computer Lib (Nelson, Ted, Publisher, Schooleys Mountain, N.J.).
Newman, W.M. and R.F. Sproull, 1979, Principles of Interactive Computer Graphics, 2nd edn. (McGraw-Hill, New York).
O'Keefe, J. and L. Nadel, 1978, The Hippocampus as a Cognitive Map (Oxford University Press, Oxford-New York).
Olton, D.S., 1977, Spatial memory. Sci. Am. 236 (6), 82--98 (June Issue).
Olton, D.S., J.T. Becker and G.E. Handelmann, 1979, Hippocampus, space and memory. The Behav. Brain Sci. 2, 313--365.
Ophir, J. and N.F. Madlak, 1979, Digital scan converters in diagnostic ultrasound imaging. Proc. IEEE 67, 654--664.
Poincaré, H., 1905, La Valeur de la Science (Flammarion, Paris, 1970).
Rössler, O.E., 1974, Adequate locomotion strategies for an abstract organism in an abstract environment: A relational approach to brain function, in: Physics and Mathematics of the Nervous System, M. Conrad, W. Güttinger and M. Dal Cin (eds.), Springer Lecture Notes in Biomathematics 4, 342--396.
Rössler, O.E., 1978, Deductive biology -- some cautious steps. Bull. Math. Biol. 40, 45--58.
Tolman, E.C., 1948, Cognitive maps in rats and men. Psychol. Rev. 55, 189--208. Reprinted in: Image and Environment, R.M. Downs and D. Stea (eds.) (Aldine, Chicago, 1973) pp. 27--50.
Valach, M., 1980, Artificial means for experiencing sight in the blind and its relation to the principle of awareness in living systems, in: Systems Science and Science, Proc. 24th Annual North American Meeting, Soc. General Systems Research with the AAAS, B.H. Banathy (ed.) (Louisville, Ky.: Soc. for Gen. Syst. Res., University of Louisville) pp. 439--443.
Wood, E.H., J.H. Kinsey, R.A. Robb, B.K. Gilbert, L.D. Harris and E.L. Ritman, 1979, Applications of high temporal resolution, computerized tomography to physiology and medicine, in: Image Reconstruction from Projections, G.T. Herman (ed.) (Springer-Verlag, Berlin) pp. 247--279.
Zeeman, E.C., 1965, Topology of the brain, in: Mathematics and Computer Science in Biology and Medicine, Medical Research Council Publication (HMSO, London) pp. 277--292.