Neural Networks, Vol. 7, No. 9, pp. 1351-1378, 1994 Copyright © 1994 Elsevier Science Ltd Printed in the USA. All rights reserved 0893-6080/94 $6.00 + .00
Pergamon 0893-6080(94)E0044-L
CONTRIBUTED ARTICLE
New Neural Multiprocess Memory Model for Adaptively Regulating Associative Learning
BRENDAN L. ROGERS
Swinburne University of Technology
(Received 22 March 1993; revised and accepted 23 February 1994)
Abstract--This article describes the development, operation, and behaviour of the neural multiprocess memory model (NMMM). The NMMM is a new artificial synaptic multistage memory system that replaces the single associative long-term memory that normally modulates the efficacy of artificial neural network (ANN) element inputs. The NMMM enhances learning in an ANN element by supporting multiple time scales in the acquisition, retention, and recall of encoded associative memory traces, and by appropriately modulating the learning rate based upon previous learning experience over many trials. Presented computer simulation results indicate that the NMMM on its own supports spontaneous regression and recovery behaviour and U-shaped memory retention, and that when functioning as part of the associative conditioning element (ACE), the NMMM's new adaptive associability mechanism is capable of supporting both negatively accelerated and sigmoidal acquisition curves, latent inhibition, learned irrelevance, the partial reinforcement effect, and accelerated learning following alternating acquisition-extinction training sessions.
Keywords--Artificial neural networks, Short-term memory, Medium-term memory, Long-term memory, Associability, Consolidation, Associative conditioning element, Classical conditioning.

1. INTRODUCTION
A feature of many artificial neural networks (ANNs) is their ability to learn the strength of associative relationships that they experience. These ANNs support associative learning using some set of learning rules that modify the individual synaptic connection weightings between ANN elements in response to detection of the events to be associated. This article demonstrates how a new artificial neural multiprocess memory model (NMMM) may be employed in place of the single long-term memory conventionally used in each synaptic link of an ANN element to supplement the learning process generated by an ANN element's learning rules. The NMMM is able to supplement the associative learning process in two ways. First, the NMMM temporally buffers the effect of an ANN element's learning rules so that subsequently recalled performance exhibits a relatively extensive conformation to the current situation in the local temporal context of this situation, and then later a more balanced alteration in recalled performance that places a higher weighting on all prior accumulated experience. Second, the NMMM derives from experience over many successive trials an individual learning rate that is likely to be appropriate in future trials for each associative link.

Learning and memory tend to be addressed as conceptually discrete processes, with learning pertaining to the acquisition of associations, and memory responsible for the retention of them. However, because memory is a necessary component of learning, the retention and transfer characteristics of the memory employed may affect the learning process, and the temporal characteristics of transfer from acquisition to deployment. It is suggested here that many apparently nonideal information processing and integration characteristics associated with biological memory, such as multistage memory transfer, consolidation, and spontaneous changes in memory retention, may actually complement and enhance the learning and deployment processes. The observation that memory in biological systems appears to be distributed among many adaptive synapses between neurons in a fine-grained manner (Eccles & McIntyre, 1953; Eccles, 1964; Ungar, 1970;
Acknowledgements: This article was based upon parts of the PhD thesis by Rogers (1991). Computer facilities used in the preparation of this article were provided by Learning Services and The Department of Mathematics, Swinburne University of Technology (Eastern Campus). The author gratefully acknowledges the assistance of John Wallace, James Lambert, and Richard Dluzniak for comments on drafts culminating in this paper. Requests for reprints should be sent to Brendan L. Rogers, Information Technology Institute, Swinburne University of Technology, P.O. Box 218, Hawthorn, Victoria 3122, Australia.
John, 1972), and that this provides the opportunity for the mechanisms underlying some types of acquisition, retention, and deployment to be tightly integrated and interdependent, also lends some support to the symbiotic relationship that is advanced here to exist between biological memory and acquisition and recalled performance.

Ideas relating to multistage memory, albeit at a higher cognitive level than that which is normally addressed by ANN researchers, have a considerable history in the field of psychology. William James (1890/1981) made a distinction between primary memory, which he equated with consciousness, and secondary memory, which he equated with the stored knowledge of previous experience. Later, such a distinction was incorporated into information-processing models being explored by research psychologists, such as the filter theory of attention by Broadbent (1958), and then more explicitly expressed in the Waugh-Norman two-component model of memory (Waugh & Norman, 1965) and the more detailed and influential Atkinson-Shiffrin multiprocess memory model (Atkinson & Shiffrin, 1968, 1971). A more recent elaboration is Broadbent's Maltese cross multistore model (Broadbent, 1984). Memory has been empirically observed to be a multiphasic process in even relatively humble vertebrates and invertebrates, with evidence of distinct short-term memory (STM), long-term memory (LTM), and in some cases also medium-term memory (MTM) (e.g., McGaugh, 1966; Halstead & Rucker, 1968; McGaugh, 1969; Chen, Aranda, & Luco, 1970; Young, 1970; Messenger, 1971; Riege & Cherkin, 1971; Sanders & Barlow, 1971; Menzel, Erber, & Masuhr, 1974; Menzel, 1984).
The methodology underlying this article involves deriving a novel characterisation of multiple time-scale learning modulation techniques employed by biological organisms in a diverse range of (primarily) classical associative conditioning experiments, and the development of a new concise ANN element mechanism (the NMMM) that supports this new characterisation of these techniques so that they may be employed in biologically relevant applications. No attempt has been made here to produce or to identify any specific correspondence between aspects of the internal mechanisms in the NMMM and known aspects of synaptic plasticity in individual biological neurons, and no claims that any such specific correspondence exists are made. Also, the fact that the NMMM as a synaptic mechanism in an individual ANN element supports the range of behaviour presented below should not be taken to necessarily mean that such behaviour is actually supported by individual biological neurons, or their synapses. In other words, a "black-box" design approach has been adopted here. The NMMM was developed by drawing primarily upon
a combination of behavioural animal experiments using a variety of species, and a consideration of what constitutes useful and appropriate adaptive behaviour. No aspect of the NMMM's internal design, other than the use of simple analog quantities and types of interaction that may feasibly be produced by simple chemical processes, has been based upon neurophysiological data. However, the decision to adopt the general approach here of exploring the behavioural capabilities of a relatively complex synaptic ANN element mechanism was influenced by the observations that learning and memory appear to be supported by a highly localised and relatively complicated synaptic mechanism in the marine mollusc Aplysia (Pinsker et al., 1970; Carew, Pinsker, & Kandel, 1972; Carew, Walters, & Kandel, 1981; Hawkins, 1981; Hawkins, Castellucci, & Kandel, 1981; Kandel & Schwartz, 1982; Carew, Hawkins, & Kandel, 1983; Hawkins et al., 1983; Walters & Byrne, 1983; Carew et al., 1984; Hawkins & Kandel, 1984), and that several relatively complex ANN neuronal models of classical conditioning have already been developed with varying degrees of correspondence to the known mechanisms in Aplysia (Sutton & Barto, 1981; Gluck & Thompson, 1987; Desmond & Moore, 1988; Buonomano, Baxter, & Byrne, 1990; Card & Moore, 1990).

The NMMM is designed as a module for implementation within biologically relevant ANN elements that also incorporate learning rules, temporal eligibility stimulus traces, and element output activity equations (e.g., Sutton & Barto, 1981; Gluck & Thompson, 1987; Grossberg & Schmajuk, 1989; Rogers, 1991, 1993b). Consequently, computer simulation results illustrating the NMMM's more sophisticated capabilities may only be provided when the NMMM is functioning within a complete ANN element. However, understanding how the NMMM functions is not dependent upon a thorough understanding of the operation of the ANN element in which it is embedded.
Although computer simulation results illustrating the NMMM's spontaneous memory retention characteristics are presented here with the NMMM functioning in isolation, those results presented, indicating how the NMMM affects learning and performance during particular training schedules, were generated with the NMMM functioning as an integral part of the associative conditioning element (ACE) (Rogers, 1991, 1993b). ACE was developed to further explore the extent to which a relatively complicated and self-contained ANN element could account for a broad range of empirically observed adaptive behaviour that also has readily apparent adaptive value. ACE includes a new conditioned stimulus trace circuit (CSTC) (Rogers, 1993a), a new set of learning rules (Rogers, 1991, Chap. 7), and the NMMM. The NMMM is a new nonlinear system of interacting synaptic memory types, and it is the development and behaviour of the NMMM with which this paper is primarily concerned.
This article is divided into two main parts. The first part progressively develops the NMMM in a step-by-step manner, and uses as its starting point the single associative LTM routinely employed in ANN elements. This first part is organised into sections as each new memory type is added to enhance the adaptive behaviour of the model. After the NMMM has been fully developed in isolation, it is presented in the second part of this article as an integral part of ACE, and computer simulation results are presented that illustrate the contribution of the NMMM's new adaptive associability mechanism to the learning and recall process. This second part is organised into sections according to what in the experimental psychology literature have become standard classifications of types of associative learning behaviour.

2. CONVENTIONAL ANN ASSOCIATIVE MEMORY

Conventional ANNs normally employ a single type of associative memory to retain the weighting for each synaptic interconnection between ANN elements, as illustrated in Figure 1. The connection weighting memory is usually long term in nature, typically exhibiting indefinite perfect retention, or in some cases gradual passive decay (e.g., Anderson et al., 1977). In biologically relevant ANNs that learn, this LTM reacts immediately to local conditions using some set of learning rules, usually at a rate of change that is limited to reduce the impact of recent events and facilitate the integration of experience over many trials. In the context of classical conditioning, the synaptic connections between ANN elements are considered to acquire, store, and deploy some component of the associative strength or predictive value of an initially ineffective conditioned stimulus (CS) that is presented shortly before an unconditioned stimulus (US).
Although the US automatically generates its own unconditioned response (UR), the CS begins to generate a conditioned response (CR) of its own only after repeated paired presentations of the CS and then the US. The value of a synaptic connection weighting therefore determines, or contributes to, the strength of the CR able to be produced when the CS is presented alone. As considerable reference to empirical data from classical conditioning
FIGURE 1. Standard configuration of synaptic memory for ANNs. The operator immediately below LTM is analog multiplication. Individual CS output signals are typically summed at each ANN element. The CS input and CS output signals correspond to synaptic input and synaptic output signals, respectively.
experiments is made in this article, the artificial synaptic input signal illustrated in Figure 1 is labelled as the CS input, and the synaptic output signal that is able to contribute to the production of the CR is labelled the CS output.

When STM is considered in an ANN it is usually implemented by temporarily sustaining ANN element output activity so that it slowly, passively decays after being stimulated. This is sometimes supplemented by reentrant positive feedback connections from the ANN element's output to its inputs (e.g., Grossberg, 1988). Such element-based CS-nonspecific STM may be employed to help support cooperative-competitive interelement interactions to process parallel streams of input activation. However, although this type of STM responds to associative relationships, it only directly stores nonassociative information. Even though an associative relationship between two directly connected ANN elements may be indirectly encoded and temporarily retained in the simultaneously active nonassociative output signals of the two elements, this storage is very sensitive to disruption from the simultaneous short-term storage of other associative relationships that include one of these two ANN elements. Furthermore, because the element activity itself corresponds to this type of STM, it becomes a nontrivial problem for the network if it is required that it differentiate between stored STM contents and retrieved STM contents in the process of producing a response.

These difficulties may be overcome by considering what is in the standard ANN literature an uncommon concept: STM that is an integral part of synaptic processing. The type of STM advanced here is directly associative, and temporally retains experience-induced changes in synaptic efficacy rather than transmitted signal activity. This type of synaptic STM, like standard synaptic LTM, is distinct from ANN element output activation or CS input activation.
Once this concept is established, it is then quite natural to consider how other synaptic memories may also be incorporated to progressively develop the NMMM and to produce a successive and cumulative extension in learning capability. Hence, this article will illustrate how it is possible to employ associative STM (and later below other synaptic memories) to substantially enhance the learning capability of an ANN element. Very few ANN researchers have explored the behavioural contribution that this type of synaptic memory system may make. The starting point here for considering how addition of associative STM may enhance learning is to consider the experimentally observed phenomena of spontaneous regression and spontaneous recovery. Spontaneous regression describes a partial decline in recalled performance that spontaneously occurs after acquisition training during a period of rest, as shown in Figure 2. In contrast to the simple passive decay
[Figure 2: two panels plotting response amplitude against time. The first panel shows acquisition training followed by a rest period; the second shows extinction training followed by a rest period.]
FIGURE 2. Qualitative illustration of spontaneous regression behavior following acquisition training, and spontaneous recovery following subsequent extinction training.
behaviour that synaptic LTM may support on its own, in which responding (as revealed by CS-alone presentations) typically declines slowly towards zero, spontaneous regression is characterised by a (usually) more rapid decline to an intermediate level of responding. The counterpart to spontaneous regression is spontaneous recovery, which describes a partial restoration of recalled performance towards that achieved in previous acquisition training that spontaneously occurs after extinction training during a period of rest (Figure 2). Spontaneous recovery has been observed within both classical (e.g., Pavlov, 1927, p. 58; Schneiderman, Fuentes, & Gormezano, 1962) and operant (e.g., Ellson, 1938) conditioning experiments. Although given less attention, and tending to be less pronounced than spontaneous recovery, spontaneous regression has also been observed in classical (e.g., Konorski, 1948, p. 83) and operant (e.g., Mote & Finger, 1943; Kamin, 1963; Spear, Hill, & O'Sullivan, 1965) conditioning experiments. The extent of the spontaneous regression or recovery behaviour empirically observed is typically less than that depicted in Figure 2.

Spontaneous regression and spontaneous recovery behaviour have received relatively little attention from ANN researchers, even though they may reveal an advantageous means of temporally combining slow and fast learning rates. When only synaptic LTM is used (Figure 1), as is usually the case in ANN elements, an appropriate LTM learning rate needs to be determined. A fast rate of learning will enable a system to rapidly adjust its behaviour to suit novel or changing environmentally imposed contingencies, and may be appropriate when these contingencies are of a consistent nonstatistical nature. However, if the experienced contingencies are of a statistical nature, as may be common in a complex environment, then a slower learning rate has the advantage of being able to capture the mean long-term contingencies from many individual trials. Spontaneous regression and recovery behaviour seem to incorporate both slow and fast learning rates, in a manner that is dependent upon the time that has elapsed since a trial. On the time scale of individual trials, learning may be able to occur at a rate faster than that which is optimal for integration of long-term experience, so that temporarily contiguous performance may more rapidly adjust to recent contingencies. However, as time elapses, long-term experience is permitted to partially reassert itself, with the net steady-state result (reached sometime after the last trial) being a blend between short- and long-term experience more like that which would have been produced using a single, slower learning rate.

This concept of multiple time scales in the acquisition, retention, and recall of encoded memory traces is an important feature of biological adaptive behaviour, and is explored further below when experience over many successive trials is employed to adaptively modulate the learning rate. However, the next step here is to present a simple type of underlying mechanism that supports spontaneous regression and recovery behaviour, and to demonstrate that it supports the temporal combination of multiple time scales in learning and recalled performance associated with this spontaneous behaviour.

3. ASSOCIATIVE STM AND LTM

Since Pavlov (1927, p. 58) observed spontaneous recovery behaviour, the pursuit of a theoretical explanation has gained the attention of many notable researchers (Pavlov, 1927, p. 60; Skinner, 1938; Hull, 1943, pp. 285-286; Skinner, 1950; Estes, 1955). A new synaptic multiprocess memory configuration integrating associative STM and LTM, and that represents the first stage in the formation of the NMMM, is presented in Figure 3.
This configuration is intended to account for both spontaneous recovery and spontaneous regression. That spontaneous regression and recovery may result from the operation of a common mechanism was suggested some time ago by Estes (1955). However, the NMMM provides a new and relatively simple account for these spontaneous phenomena, and does so in the form of a functioning model rather than just a theoretical construct. It will be shown later that this new memory system is also easily extended to support other important memory-related phenomena of substantial behavioural utility. Figure 3 illustrates that within the proposed NMMM a synaptic (CS-specific) form of STM displaces the synaptic LTM (Figure 1) normally used to modulate the efficacy of interelement pathways, whereas LTM is relegated to a background role in which it is not directly
FIGURE 3. The NMMM, integrating associative STM and LTM. STM and LTM may attain positive (excitatory) or negative (inhibitory) values, and all signals are bipolar except for the synaptic input, which is restricted to nonnegative values. +/- indicates that the effect of the signal matches its polarity, and -/+ indicates a converse relationship. The operator immediately below STM is analog multiplication, and that immediately above STM is analog summation.
accessible. The impact of LTM upon performance is now only indirectly apparent through its influence upon STM. In turn, LTM value is only indirectly affected by experience through STM.

Two distinct time scales are supported by the two-memory system illustrated in Figure 3. Relatively rapid changes in STM value are able to result from the effect of whatever STM learning rules are employed in the ANN element in which the NMMM is embedded. These rapid changes occur during and possibly shortly after a trial, over a period that is in the order of seconds. Relatively slow spontaneous changes in both STM and LTM that occur over a period of minutes also occur, but these result purely from the configuration of STM and LTM, and their interaction, in the NMMM. These two time scales are intended to correspond to those apparent in spontaneous regression and recovery behaviour.

To demonstrate the spontaneous retention behaviour of this configuration, the effect of an experienced trial upon STM (through whatever system learning rules are employed in the ANN element in which the NMMM is embedded) is simulated here by altering the initial value of STM. It is assumed that the learning rules will decrease (extinguish) STM value towards zero if the CS is presented alone, and that they will increase STM towards some positive nonzero value if the CS is reinforced by a subsequent US presentation. [An example of a set of learning rules that have this effect is the first two terms on the right-hand side of eqn (A.15) in the Appendix.] Because the experience-induced changes in STM will occur at a much faster rate and act over a briefer period than spontaneous changes, the technique employed here of initially setting STM value to zero to simulate a nonreinforced trial, and to a value of 1 to simulate a reinforced trial, approximately reproduces the nature of the subsequent spontaneous response of the NMMM's STM to recently experienced trials of each type.
The preliminary version of the NMMM illustrated in Figure 3 is defined by differential eqns (1) and (2), which were integrated using a modified version of Euler's method in which a unity interval between successive time states was assumed, so that the values of the equation constants correspond to those employed when the differential equations are converted to difference equations and simulated on a digital computer. The actual interval between successive system states corresponds to 10 ms in real time. The decision to set this to 10 ms is somewhat arbitrary, the main consideration being that it be brief enough to prevent time quantisation effects from becoming apparent. The 10-ms interval is smaller than necessary to simulate the relatively slow spontaneous interaction behaviour between memories in the NMMM. However, as alteration of the interval between time states will affect the absolute values of the constants in eqns (1) and (2), and because the NMMM is designed to function as a component of a complete ANN element, the interval between time states should also be brief enough to support the more rapid internal changes within the ANN element in which the NMMM is embedded. Because an interval of 10 ms is used in the ANN models of classical conditioning of Rogers (1991, 1993b) (and also Desmond and Moore, 1988), 10 ms is also used here.
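For orientation, a plain forward-Euler discretisation with a unit time step would turn eqns (1) and (2) of Section 3.1 into the difference equations below. This is an assumption made for illustration only: the particular modification to Euler's method used in the simulations is not specified here. The subscript $t$ counts 10-ms steps, and $[x]^+$ denotes the positive part of $x$.

$$
\begin{aligned}
\mathrm{STM}_{t+1} &= \mathrm{STM}_t + \mathrm{STMchg}\,(\mathrm{LTM}_t - \mathrm{STM}_t),\\
\mathrm{LTM}_{t+1} &= \mathrm{LTM}_t + \mathrm{LTMacc}\,[\mathrm{STM}_t - \mathrm{LTM}_t]^+ - \mathrm{LTMdep}\,[\mathrm{LTM}_t - \mathrm{STM}_t]^+ .
\end{aligned}
$$

Under this reading each rate constant is a change per 10-ms step, which is why altering the step interval would require rescaling the constants to preserve the same real-time behaviour.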
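3.1. Basic NMMM Equations Using Only STM and LTM

$$\frac{d}{dt}\mathrm{STM} = \mathrm{STMchg}\,(\mathrm{LTM} - \mathrm{STM}) \tag{1}$$

$$\frac{d}{dt}\mathrm{LTM} = \mathrm{LTMacc}\,[\mathrm{STM} - \mathrm{LTM}]^+ - \mathrm{LTMdep}\,[\mathrm{LTM} - \mathrm{STM}]^+ \tag{2}$$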
where STM = synaptic short-term memory at time t, STMchg = rate of change of STM, LTM = synaptic long-term memory at time t, LTMacc = LTM accumulation rate, and LTMdep = LTM depletion rate. $[x]^+ = x$ if $x > 0$, and $[x]^+ = 0$ if $x \le 0$. Synaptic STM and LTM may be regarded as cumulative quantities that take a finite time to change value. Equations (1) and (2) relate directly to the NMMM schematic diagram (Figure 3), with the addition that a rate constant is associated with every signal that affects the value of a cumulative quantity. STMchg is a constant that determines the rate at which STM changes, controlling the rate of both accumulation (when LTM > STM) and depletion (when LTM < STM). LTMacc determines the rate at which LTM accumulates, and LTMdep the rate at which LTM depletes. This type of nomenclature is standard throughout this paper, with three or four upper case characters mnemonically denoting system variables. Symbols for
constants are formed by appending three lower case characters to the end of the upper case mnemonic symbol of the variable with which the constant is associated. The standard suffixes acc, dep, and chg denote that the constant affects the accumulation, depletion, and change (both accumulation and depletion) rates of the associated variable, respectively.

The NMMM illustrated in Figure 3 behaves as follows: after STM is disturbed by a recent experience through an ANN element's learning rules, STM and LTM spontaneously interact in a manner that gradually makes them attain a common value somewhere in between each of their initial values. The relative rate of spontaneous change of each memory determines the extent to which each influences their final value. If separate accumulation and depletion terms are employed [as in eqn (2) for LTM], and they are assigned different values, then an increase in STM can lead to a greater or lesser change in final value than a decrease. This is the case in the computer results presented in Figure 4, in which the accumulation rate of LTM is twice that of its depletion rate, and simulated learning rule-induced changes in STM result in a spontaneous regression of 33% but a more complete spontaneous recovery of 50%.

FIGURE 4. Memory retention behavior exhibited by the NMMM, incorporating STM and LTM only. (a) Spontaneous regression behavior. Initially STM = 1.0 and LTM = 0.0, to simulate a relatively rapid experience-induced increase in STM from 0.0. (b) Spontaneous recovery behavior. Initially STM = 0.0 and LTM = 1.0, to simulate a rapid experience-induced decline in STM from 1.0. STMchg = 0.0001, LTMacc = 0.0002, and LTMdep = 0.0001, from eqns (1) and (2).

The way in which STM and LTM are organised within the NMMM (Figure 3) is superficially similar to that within the highly influential multiprocess model of human memory by Atkinson and Shiffrin (1968, 1971). Atkinson and Shiffrin (1968) suggested how very short-term sensory memory, a general purpose working STM, and a LTM may be integrated. Sophisticated control structures, most of which are inadequately understood and specified from a mechanistic viewpoint, were posited to enable both selective transfer of information into their STM and extended retention of this information by rehearsal. In their model, working STM does not comprise simple associative strengths, but symbolic abstracted "chunks" of information, such as words or numbers. Although the capacity of this type of STM is somewhat disputed, early empirical tests with humans indicated an STM capacity of between five and nine chunks (Miller, 1956). STM contents were thought to be automatically transferred into LTM, with effectiveness of transfer increasing with strength and duration of STM activity. In their model LTM was assumed to be permanent, and therefore free of passive decay.

The background role of LTM and the exclusive use of STM in actual performance is common to both the NMMM and Atkinson and Shiffrin's (1968, 1971) multiprocess memory model. However, the NMMM is intended for very distributed and fine-grained operation, the mechanism of interaction that it employs between LTM and STM is specifically defined and different in nature, and the STM is as widespread as LTM, with each having the same capacity. In other words, the two memory models operate at distinctly different levels, with the Atkinson and Shiffrin multiprocess memory model being essentially high level, and the NMMM being distinctly low level.

The initial NMMM version depicted in Figure 3 requires a substantial interval of time (in the order of minutes) to enable transfer of information from STM to LTM, or, in other words, for "consolidation" to occur, as illustrated in Figure 4. This aspect of the NMMM resembles Muller and Pilzecker's (1900) consolidation theory, and its attempt to explain the phenomenon of retrograde amnesia, in which a traumatic event can prevent memory of immediately preceding events. Muller and Pilzecker (1900) suggested that neural pathways must temporarily remain active after associations are initially formed for them to consolidate and become permanent.
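The retention behaviour reported for Figure 4 can be checked numerically. The sketch below integrates eqns (1) and (2) with a plain forward-Euler step of one time unit (10 ms), using the constants quoted in the Figure 4 caption. It is an illustration only: the function and variable names are hypothetical, and the paper's own modified Euler scheme is not reproduced.

```python
def pos(x):
    """Positive-part operator [x]+ used in eqn (2)."""
    return x if x > 0.0 else 0.0


def simulate_stm_ltm(stm, ltm, steps,
                     stm_chg=0.0001, ltm_acc=0.0002, ltm_dep=0.0001):
    """Forward-Euler integration of eqns (1) and (2) with a unit time step.

    Names are illustrative; the default constants are those given in the
    Figure 4 caption. Returns the final (STM, LTM) pair.
    """
    for _ in range(steps):
        d_stm = stm_chg * (ltm - stm)                                # eqn (1)
        d_ltm = ltm_acc * pos(stm - ltm) - ltm_dep * pos(ltm - stm)  # eqn (2)
        stm, ltm = stm + d_stm, ltm + d_ltm
    return stm, ltm


STEPS = 24_000  # 4 min of simulated time at 10 ms per step

# Simulated reinforced trial: STM jumps to 1 while LTM is still 0.
regression = simulate_stm_ltm(stm=1.0, ltm=0.0, steps=STEPS)
# Simulated nonreinforced trial: STM drops to 0 while LTM is still 1.
recovery = simulate_stm_ltm(stm=0.0, ltm=1.0, steps=STEPS)

print("regression (STM, LTM):", regression)
print("recovery   (STM, LTM):", recovery)
```

With these constants, 2 STM + LTM is conserved while STM > LTM and STM + LTM is conserved while LTM > STM, so the common value approached is 2/3 after the simulated reinforced trial (a regression of about 33%) and 1/2 after the simulated nonreinforced trial (a recovery of 50%), in line with the percentages quoted for Figure 4.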
Support for the biological existence of a consolidation process comes from more recent results from experiments on more humble vertebrates and invertebrates, in which a clear distinction between STM and LTM is based upon differences in their time course and sensitivity to disruption (e.g., McGaugh, 1966, 1969; Chen et al., 1970; Young, 1970; Messenger, 1971; Riege & Cherkin, 1971; Sanders & Barlow, 1971; Menzel et al., 1974; Menzel, 1984).

4. ASSOCIATIVE MTM

The basic NMMM version shown in Figure 3 may be considerably enhanced by incorporating the new adaptive associability mechanism that is described later below. This new mechanism obtains appropriate controlling signals from an associative medium-term memory (MTM), and it is primarily for this reason that MTM will now be considered. However, addition of MTM to the NMMM in itself also improves the extent to which the NMMM's spontaneous memory retention behaviour correlates with several additional types of empirical results, as discussed below. Figure 5 illustrates how MTM is integrated within the NMMM.
FIGURE 5. MTM-enhanced NMMM, in which changes in LTM are now mediated by changes in MTM.
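4.1. MTM-Enhanced NMMM Equations

$$\frac{d}{dt}\mathrm{STM} = \mathrm{STMchg}\,(\mathrm{LTM} - \mathrm{STM}) \tag{3}$$

$$\frac{d}{dt}\mathrm{MTM} = \mathrm{MTMacc}\,[\mathrm{STM} - \mathrm{LTM} - \mathrm{MTM}]^+ - \mathrm{MTMdep}\,[\mathrm{MTM} + \mathrm{LTM} - \mathrm{STM}]^+ \tag{4}$$

$$\frac{d}{dt}\mathrm{LTM} = \mathrm{LTMacc}\,[\mathrm{MTM}]^+ - \mathrm{LTMdep}\,[-\mathrm{MTM}]^+ \tag{5}$$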
Comparison of Figure 5 with Figure 3 indicates that the difference between STM and LTM value that was previously employed to directly modify LTM now modifies MTM instead, and MTM now drives changes in LTM. Although the basic roles of STM and LTM remain unaltered, MTM now requires time to accumulate towards the difference between the levels of STM and LTM. Because the rate of change of MTM is comparable to that of LTM, the effect of relatively rapid learning rule-induced change in STM upon LTM is delayed by the introduction of MTM. MTM only plays an active role during the transfer of information between STM and LTM. Unlike both STM and LTM, MTM has a steady-state value of zero. Figure 6 illustrates how simulated learning rule-induced changes in STM value produce an essentially inverted U-shaped deviation in the amplitude of MTM, whereas STM and LTM attempt to attain the same intermediate value. Because MTM is capable of neither the potentially indefinite retention of LTM nor the rapid learning rule-induced changes in STM, referring to it as "medium-term memory" seems appropriate--even though STM, MTM, and LTM all have comparable rates of spontaneous change. The addition of MTM to the NMMM significantly alters the course and extent of STM spontaneous regression and recovery, and provides a further opportunity to accentuate the asymmetrical response of the NMMM to changes in STM value by making MTMacc > MTMdep. The existence of such an asymmetry is suggested by the tendency for spontaneous recovery to be more pronounced than spontaneous regression (Mackintosh, 1974, p. 471 ).
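The qualitative behaviour described above (a transient deviation in MTM, a U-shaped STM response, and LTM overshooting STM) can be explored with a small simulation of eqns (3) to (5). The sketch below again assumes a plain forward-Euler step of one time unit. All five rate constants are set to the same hypothetical value of 0.0001 purely for illustration; they are not the constants used for Figure 6, so the regression and recovery percentages reported for that figure are not expected to be reproduced. The function and variable names are likewise illustrative.

```python
def pos(x):
    """Positive-part operator [x]+ used in eqns (4) and (5)."""
    return x if x > 0.0 else 0.0


def simulate_nmmm_mtm(stm, ltm, steps,
                      stm_chg=1e-4, mtm_acc=1e-4, mtm_dep=1e-4,
                      ltm_acc=1e-4, ltm_dep=1e-4):
    """Forward-Euler integration of eqns (3)-(5) with a unit time step.

    The default constants are hypothetical placeholders, not values taken
    from the paper. MTM starts at its steady-state value of zero.
    Returns a list of (STM, MTM, LTM) tuples, one per step.
    """
    mtm = 0.0
    history = []
    for _ in range(steps):
        d_stm = stm_chg * (ltm - stm)                          # eqn (3)
        d_mtm = (mtm_acc * pos(stm - ltm - mtm)
                 - mtm_dep * pos(mtm + ltm - stm))             # eqn (4)
        d_ltm = ltm_acc * pos(mtm) - ltm_dep * pos(-mtm)       # eqn (5)
        stm, mtm, ltm = stm + d_stm, mtm + d_mtm, ltm + d_ltm
        history.append((stm, mtm, ltm))
    return history


# Simulated reinforced trial: STM jumps to 1 while LTM and MTM are still 0.
history = simulate_nmmm_mtm(stm=1.0, ltm=0.0, steps=200_000)

stm_trace = [s for s, _, _ in history]
mtm_trace = [m for _, m, _ in history]
gap_trace = [s - l for s, _, l in history]  # STM minus LTM

final_stm, final_mtm, final_ltm = history[-1]
print("lowest STM reached:", min(stm_trace))
print("final STM, MTM, LTM:", final_stm, final_mtm, final_ltm)
print("peak MTM:", max(mtm_trace))
print("most negative STM - LTM:", min(gap_trace))
```

In this equal-constant case STM dips below its eventual resting value before climbing back, MTM rises and then returns to zero, and STM minus LTM briefly becomes negative, which corresponds to the overshoot of LTM past STM. Passing mtm_acc greater than mtm_dep to the same function is one way to experiment with the asymmetry discussed in the text.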
Figure 6a illustrates how spontaneous regression of STM contents now exhibits a pronounced U-shaped response, and ultimately regresses only 10% (compared to 33% without MTM in Fig. 4a). In contrast, the spontaneous recovery illustrated in Figure 6b exhibits only a slight inverted U-shaped response more closely resembling the exponential shape of its counterpart without MTM (Fig. 4b), and recovers 62% (compared to 50% in Fig. 4b).
The U-shaped STM retention illustrated in Figure 6a results from the following sequence of events. Immediately after the artificially induced transition in STM level from zero to 1, LTM is still at its original level of zero. Similarly, MTM has not had sufficient time to react to the difference between the values of STM and LTM, and therefore remains close to its steady-state value of zero. STM then starts to gradually decline towards the value of LTM, creating the left half of the U-shaped STM response. However, MTM and then LTM also begin to accumulate in this period, so that after approximately 2 min, LTM and STM values cross due to MTM's limited rate of change and its role in driving change in LTM value. This corresponds to the instant at which STM value reaches its minimum at the bottom of the U, and with LTM = STM, would constitute their new steady-state values were it not for the fact that MTM now requires time for its now significantly nonzero value to deplete back to zero. It is this nonzero MTM value that causes LTM value to continue to increase and overshoot STM value (i.e., make the LTM value cross that of STM). The larger LTM value then causes STM value to increase and in so doing to form the start of the right half of the U-shaped STM response. The LTM and STM values cross again as a result of the slow rate of change of MTM, producing the damped oscillatory STM and LTM behaviour evident upon close inspection of Figure 6a.
FIGURE 6. Memory retention behavior exhibited by the MTM-enhanced NMMM, incorporating STM, MTM, and LTM (amplitude plotted against time). (a) U-shaped STM spontaneous regression. Initially STM = 1.0 and LTM = 0.0, to simulate a rapid increase in STM. (b) STM spontaneous recovery. Initially STM = 0.0 and LTM = 1.0, to simulate a rapid decline in STM. STMchg = 0.0001, MTMacc = 0.0001, MTMdep = 0.00005, LTMacc = 0.0002, and LTMdep = 0.0001, from eqns (3)-(5).
The U-shaped STM behaviour of Figure 6a compares favourably with empirical results (Mercer & Menzel, 1982; Menzel, 1984) depicting the retention in honeybees of excitatory conditioning of colour and odour using food reinforcement (Figure 7). As previously described, STM level in the NMMM directly determines the current associative strength of a CS, with respect to the performance of a CR. Although caution should be exercised in comparing response measurements of a different type, a direct comparison of Figure 6a with Figure 7 indicates that the STM behaviour correlates with Menzel's (1984) data in the following respects: (a) the position of the bottom of the U, both in time and magnitude; (b) the maximum amplitude attained in the right half of the U; and possibly (c) the apparent presence of damped oscillations in the right half of the U. The very gradual loss of retention in honeybees, which is almost total after 10 days, may reflect a separate active extinction process, because free-flying bees produced this result. Alternatively, a gradual passive decay of LTM contents may be responsible, and this could easily be incorporated within the NMMM by adding an exponential decay term into eqn (5) with a relatively small valued rate constant.
FIGURE 7. Retention of excitatory conditioning of free-flying bees trained to a colour (long dashes and closed circles) and fixed bees conditioned to an odour (short dashes and open circles, from Mercer & Menzel, 1982). Note the log scale time axis. The outer ordinate is the percentage of correct choices for free-flying bees, the inner ordinate the responses (proboscis extension) of fixed bees after one conditioning trial. From Menzel (1984). Reprinted by permission.
U-shaped retention has been observed in the avoidance responding of rats (Kamin, 1957, 1963). Kamin (1957) trained rats in one session of 25 learning trials then tested retention during a subsequent session of 25 relearning trials, and found that as the interval between sessions was increased from 0 min to 19 days, the number of avoidance responses made in the second session exhibited a pronounced deficit at an interval of 1 h, a result known as the Kamin effect. Kamin (1963) confirmed this result, and found that the deficit at the intersession interval of 1 h only occurs when learning is incomplete, as would be expected with the NMMM as no substantial change in STM would then occur. Although the time scale of Kamin's empirically observed U-shaped retention curve is approximately 30 times greater, its general shape also resembles that obtained from the MTM-enhanced NMMM (Figure 6a). The difference in time scale for these results may be attributable to the fact that different types of response systems were employed, leaving open the possibility that a common mechanism functioning at different rates may underlie both results. Also, Halstead and Rucker (1968, p. 40) cite data indicating that paramecia, which have electrical properties similar to those of neurons and are able to learn associative relationships, exhibit a U-shaped retention curve that reaches a local minimum four hours after training. As paramecia are single-celled creatures, this data also suggests that (at least) a three-stage multiprocess memory system may be incorporated within a single cell, and so provides additional support for the development of the NMMM within a single ANN element such as ACE.
Addition of MTM into the NMMM may also produce new characteristics that accord with empirical results that were not initially considered. The minor STM overshoot depicted in Figure 6a for a single excitatory acquisition trial may be substantially accentuated by massing many excitatory acquisition trials [i.e., by using a short intertrial interval (ITI)] to produce a significant overshoot. The type of effect upon performance that such an overshoot in STM level would produce as acquisition approaches asymptote has indeed been observed when massed trials are used and rapid acquisition occurs (e.g., Lubow, Markman, & Allan, 1968). Furthermore, when MTM is used to drive changes in associability (as described below), this overshoot may
also be responsible for the overtraining reversal effect (Mackintosh, 1974, pp. 602-607). The extent of these effects depends upon both the particular values of the constants employed in the NMMM equations, and the specific learning rules implemented within the ANN element in which the NMMM is deployed. Menzel (1984) also notes how such U-shaped retention resembles the "so-called primacy and recency effects in human verbal learning, where it is found that items encountered first and last are better recalled than middle items (Weiskrantz 1970)" (p. 265). Perhaps such U-shaped retention in itself contributes to advantageous adaptive behaviour in a manner that is not yet fully appreciated.
5. ASSOCIABILITY LONG TERM MEMORY
"Associability" is an established term that is used to refer to some empirically observed experience-dependent variations in the ease with which the effect of a CS may be altered by training (Mackintosh, 1973, 1974, 1975, 1983; Pearce & Hall, 1980). Organisms not only seem to learn how to respond to a CS, they also seem to learn how fast they should learn how to respond, or, in other words, how associable a CS is with a US. Note that associability, as referred to here, directly affects learning rate but not performance level. For example, some acquisition training schedules, such as partial reinforcement, may produce a CS with strong associative strength that elicits the performance of a strong CR, but which, with what is here interpreted to be a very low level of associability, is relatively unresponsive to subsequent extinction training. In this section, the NMMM is further extended to support a range of useful adaptive associability behaviour that correlates well with a wide body of empirical data. The NMMM's adaptive associability mechanism, its behaviour, and the set of empirical results that it addresses all differ markedly from previous adaptive associability models (Mackintosh, 1975; Pearce & Hall, 1980; Wagner, 1978).
Rescorla and Wagner (1972) made allowance for accommodation of changes in the effectiveness of a CS due to CS-alone pretraining presentations, which have been shown to retard subsequent acquisition (i.e., latent inhibition, which is discussed below). However, Rescorla and Wagner (1972) did not formally specify how these changes would be supported, nor did they address other possible adaptive associability changes resulting from procedures other than CS-alone pretraining. Wagner (1978) later suggested that during acquisition training an association also develops between the experimental context and the CS, and that the discrepancy between this association and the asymptotic strength supported by the CS determines the associability of the CS, thus providing a formal account of latent inhibition effects.
Mackintosh (1975) suggested that the associability of a stimulus increases if it predicts reinforcement more accurately than all other experienced stimuli, and decreases if it predicts reinforcement less accurately. Pearce and Hall (1980) identify the suggestion by Mackintosh (1975) that the outcome of one trial may be used to determine the associability of a stimulus on the next trial as "the model's most important innovation" (p. 537), but reject Mackintosh's (1975) rules for changes in associability, and employ changes in associability to limit the acquisition of associative strength. In contrast to Mackintosh (1975), Pearce and Hall (1980) suggest that the associability of a CS declines when its consequences are accurately predicted. It should be pointed out that none of the above adaptive associability models operate in real time over intratrial intervals. In other words, unlike ACE's NMMM, they only seek to account for trial-to-trial variations, and do not take into account temporal characteristics of the CS and US that are also critical to conditioning (e.g., Mackintosh, 1974). Changes in the associability of a CS have been inferred from the results of latent inhibition and learned irrelevance procedures (Pearce & Hall, 1980; Mackintosh, 1983, pp. 222-236). It is also claimed here that the partial reinforcement effect (PRE), and the increased effectiveness with which subjects adjust to alternate sessions of fully reinforced and then fully nonreinforced massed trials (Mackintosh, 1974, p. 441), are also able to be mediated by a single adaptive associability mechanism. Furthermore, as shown below, this adaptive associability mechanism is capable of producing sigmoidal (S-shaped) or negatively accelerating acquisition curves, depending upon the associability value at the beginning of acquisition training in a manner consistent with empirical results (Spence, 1956; Kremer, 1971).
The NMMM, with all the basic elements of this adaptive associability mechanism now integrated within it, is depicted in Figure 8. The apparently long retention capability of altered associability levels exhibited in the above-mentioned empirical behaviour suggests that some form of LTM storage is utilised. Therefore, within the NMMM a single associability long-term memory (ALTM) is used to retain the current level of associability between an individual CS input and the US associated with the particular ANN element in which the NMMM is deployed. ALTM controls the level of associability within the NMMM by modulating the rate at which the learning rules modify STM contents (Figure 8).
FIGURE 8. The NMMM now utilising MTM to produce behaviourally appropriate adaptation of ALTM. The bipolar STM learning rules signal is now separated into two nonnegative signals, with only that which increases STM (corresponding to the effect of reinforcement) gating changes in ALTM. The two new operators towards the upper right corner split the bipolar MTM signal into two separate nonnegative signals. Icons in the form of miniature input-output graphs indicate that the lower left-most operator passes only nonnegative input values, whereas the upper right one passes only nonpositive values which are then converted into nonnegative output values of the same magnitude. The effect of positive MTM values upon ALTM is further regulated to limit the maximum attainable ALTM value.
In Figure 8, a new memory-gating (short-term) memory (MGM) has been introduced to make MTM a more suitable source with which to drive changes in adaptive associability. As will be discussed below, when employed for this purpose, MTM contents should reflect the type of trial(s) recently experienced (reinforced or nonreinforced), without being substantially affected by the trial currently being experienced. The slow rate of change of MTM tends to satisfy the latter requirement, and the intermediary role MTM already plays in transferring STM changes to LTM means that it is already almost, but not quite, suitable for retaining an appropriate history of trials experienced in the medium term. As indicated in Figure 6a, a record of a previous increment in STM is only retained in MTM for a few minutes, during which MTM actually goes temporarily negative. MGM is introduced in such a manner to sustain and stabilise MTM contents, without substantially affecting the spontaneous retention behaviour of LTM and STM. This is achieved by gating changes in MTM value with the value of MGM, which is increased by CS input activity and then passively decays to prevent MTM from so rapidly depleting back towards zero and then overshooting. To prevent the now more sustained nonzero MTM values from producing excessively large changes in LTM value, changes in LTM are now also gated by MGM. Also, to prevent MTM from remaining permanently at a nonzero value as a result of the previous trial, a simple passive decay mechanism is also implemented [shown above MTM in Figure 8, and expressed as the last term in eqn (7) below] to enable residual MTM levels to gradually dissipate. The rate of this MTM decay is determined by the constant MTMchg.
Equations (6)-(9) describe the spontaneous interaction between the NMMM's internal memories, with change in MTM and LTM now gated by MGM. It is sufficient for current purposes to assume that CS input activation initially increases MGM contents from zero to 1, so that modification of MTM and LTM is enabled (or gated) following CS input activity. (A specific example of one suitable means by which the CS input drives MGM is presented below.)
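5.1. MGM-Enhanced NMMM Equations
$$\frac{d}{dt}\,\mathrm{STM} = \mathrm{STMchg}\,(\mathrm{LTM} - \mathrm{STM}) \tag{6}$$
$$\frac{d}{dt}\,\mathrm{MTM} = \mathrm{MTMacc}\,[\mathrm{STM} - \mathrm{LTM} - \mathrm{MTM}]^{+}\,\mathrm{MGM} - \mathrm{MTMdep}\,[\mathrm{MTM} + \mathrm{LTM} - \mathrm{STM}]^{+}\,\mathrm{MGM} - \mathrm{MTMchg}\,\mathrm{MTM} \tag{7}$$
$$\frac{d}{dt}\,\mathrm{LTM} = \mathrm{LTMacc}\,[\mathrm{MTM}]^{+}\,\mathrm{MGM} - \mathrm{LTMdep}\,[-\mathrm{MTM}]^{+}\,\mathrm{MGM} \tag{8}$$
$$\frac{d}{dt}\,\mathrm{MGM} = -\mathrm{MGMdep}\,\mathrm{MGM} \tag{9}$$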
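The gating role of MGM in eqns (6)-(9) can be made concrete with a single update step. This is a sketch under stated assumptions, not the article's implementation: [x]+ is again read as rectification, the constants are taken as per-10-ms-step values from the Figure 9 caption, and the value of MTMchg is hypothetical because the caption does not list it. All function and dictionary names are illustrative.

```python
def pos(x):
    """Assumed meaning of [x]+ : pass positive values, clamp the rest to zero."""
    return x if x > 0.0 else 0.0


def mgm_nmmm_step(state, c):
    """One forward-Euler update of eqns (6)-(9); state is (STM, MTM, LTM, MGM)."""
    stm, mtm, ltm, mgm = state
    d_stm = c["STMchg"] * (ltm - stm)                                  # eqn (6)
    d_mtm = (c["MTMacc"] * pos(stm - ltm - mtm) * mgm
             - c["MTMdep"] * pos(mtm + ltm - stm) * mgm
             - c["MTMchg"] * mtm)                                      # eqn (7)
    d_ltm = (c["LTMacc"] * pos(mtm) - c["LTMdep"] * pos(-mtm)) * mgm   # eqn (8)
    d_mgm = -c["MGMdep"] * mgm                                         # eqn (9)
    return (stm + d_stm, mtm + d_mtm, ltm + d_ltm, mgm + d_mgm)


FIG9 = {
    "STMchg": 0.0001, "MTMacc": 0.0002, "MTMdep": 0.0001,
    "LTMacc": 0.0002, "LTMdep": 0.0001, "MGMdep": 0.00005,
    "MTMchg": 0.000001,  # ASSUMED: not given in the Figure 9 caption
}

# Regression run: STM stepped to 1 and MGM energised to 1 by the CS input.
state = (1.0, 0.0, 0.0, 1.0)
for _ in range(96_000):          # 16 min of simulated time at 10 ms per step
    state = mgm_nmmm_step(state, FIG9)
stm, mtm, ltm, mgm = state

# With MGM at zero, MTM and LTM are isolated from a change in STM.
ungated = mgm_nmmm_step((1.0, 0.0, 0.0, 0.0), FIG9)
print(f"STM {stm:.3f}  MTM {mtm:.3f}  LTM {ltm:.3f}  MGM {mgm:.4f}")
```

The final call illustrates the isolation property noted in the text: when MGM is zero, only STM changes, relaxing towards the unchanged LTM value.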
As shown in Figure 9a, this new memory configuration is still able to produce memory retention behaviour that compares favourably with that of honeybees (Figure 7), when the constants MTMacc and MTMdep are doubled to offset the effect of the decaying MGM value. However, MTM behaviour has now been modified so that it consistently maintains a trace of prior reinforcement (Figure 9a) or nonreinforcement (Figure 9b), which, although relatively small, is nonetheless available for hours after the last trial. This MTM behaviour is capable of supporting the establishment of a small but significant PRE at relatively long ITIs, as has been observed in numerous experiments cited by Mackintosh (1974, pp. 457-461). Also shown in Figure 9 are the MGM traces, which after simulated initial energisation by the CS input, quickly decay towards zero. Although MGM was introduced to make MTM a more suitable source with which to drive changes in adaptive associability, its introduction also had the important side effect of isolating MTM and LTM from STM when the CS input has not been active recently, which is normally most of the time. This makes STM also available for use as a temporally sensitive and relatively nonassociative register of ongoing US availability, the advantages of which are discussed at greater length below. To more fully indicate the behaviour supported by the NMMM, it is now necessary to integrate the NMMM within ACE.
FIGURE 9. Spontaneous memory retention behavior exhibited by the MTM-enhanced NMMM, with change in MTM and LTM now gated by MGM (amplitude plotted against time). (a) Spontaneous regression behavior. (b) Spontaneous recovery behavior. STMchg = 0.0001, MTMacc = 0.0002, MTMdep = 0.0001, LTMacc = 0.0002, LTMdep = 0.0001, and MGMdep = 0.00005, from eqns (6)-(9).
6. THE ASSOCIATIVE CONDITIONING ELEMENT (ACE)
ACE (Rogers, 1991, 1993b) is a relatively extreme example of a growing breed of new ANN elements that are departing from the established ANN view that neuronal models should be simple, and that complex behaviour should somehow emerge from the interaction of many simple elements (Sutton & Barto, 1981). ACE produces a wider breadth of classical conditioning, memory, and timing behaviour than existing computational models (Rescorla & Wagner, 1972; Sutton & Barto, 1981; Barto & Sutton, 1985; Gluck & Thompson, 1987; Klopf, 1987; Desmond & Moore, 1988; Kehoe, 1988; Grossberg & Schmajuk, 1989), and ACE's behaviour more favourably compares with empirical results for each type of behaviour that it produces than does that produced by these other models (Rogers, 1991). Despite this, ACE still provides a simplified, limited, and specialised account of the type of empirical behaviour with which it has been compared and assessed, and other more restricted models and theories address individual aspects of conditioning behaviour not accounted for by ACE.
The internal architecture of ACE is illustrated in Figure 10. Figure 10 is dominated by an individual representative CS input channel, one of which exists for every CS input to ACE. In overview, each CS input channel comprises the CSTC (Rogers, 1993a), which maintains a stimulus trace of previously activated CS inputs, the NMMM (Figure 8), which temporally regulates the impact of current experience upon future performance, and the remaining mechanisms that are associated with a new set of learning rules (Rogers, 1991, Chap. 7). Also shown in Figure 10 is the common output stage
FIGURE 10. The associative conditioning element (ACE), comprising multiple CS input channels (only one of which is shown) that incorporate the NMMM and converge upon the common output stage.
upon which the CS input channels and the single US input converge. ACE's output signal indicates expectation of activation of the US input by one or more previously activated CS inputs, and contains the amplitude and temporal information necessary for generation of a CR. However, the output does not in itself qualitatively define the nature of the CR, because ACE forms CS-US associations [in accord with Mackintosh (1983, pp. 18-19) and Le Doux (1990), and as discussed in Rogers (1991, Chap. 3)]. ACE's "CS expects US" output signal is also suitable as a source of conditioned reinforcement for higher-order and instrumental conditioning (Mackintosh, 1983; Rogers, 1991, Chap. 3). The US input does not activate ACE's output, but instead acts to associatively reinforce (and nonassociatively reinstate) the CS input channels. The output stage also incorporates US onset (short-term) memory (USOM) to make the reinforcing effect of US presentation vary with US duration in accord with empirical results (Ashton, Bitgood, & Moore, 1969), and place the most emphasis upon US onset.
The new learning rules developed within ACE may be described as a substantially modified intratrial version of the Rescorla-Wagner model (Rescorla & Wagner, 1972), in which expectation of reinforcement, and actual reinforcement subsequently experienced are compared within each learning trial to determine long-term change in associative strength (and in the case of ACE also change in associability, based upon patterns of experience over many successive trials). ACE selectively exhibits the following advantageous behaviour associated with the Rescorla-Wagner model: stimulus amplitude effects, acquisition of conditioned excitation and inhibition, extinction of conditioned excitors (but not inhibitors) when they are presented without reinforcement, overshadowing, compound conditioning effects (e.g., blocking and unblocking), and discriminative stimulus effects. However, the temporal isolation between STM and LTM that MGM provides also enables ACE's new learning rules to employ synaptic STM to support the nonassociative phenomena of habituation, secondary extinction (Pavlov, 1927, pp. 54-56), and reinstatement (Pavlov, 1927, p. 59; Konorski, 1948, p. 185; Rescorla, 1979). In other words, ACE uses STM to capture a sense of the prevailing temporal context at all times, and not just briefly after CS presentation, whereas MGM prevents this experience from being automatically transmitted to LTM. However, ACE may then employ such recent experience to produce appropriate LTM changes if the associated CS is presented within (and in particular shortly after) this temporal context. For example, this mechanism is able to support the acquisition of a conditioned inhibitor by a successive discrimination differential conditioning procedure (Mackintosh, 1974, p. 34), without relying upon conditioned excitation to background stimuli (Rogers, 1991, Chap. 8) which, as Mackintosh (1983, p. 188) points out, is unlikely to account for all of the conditioned inhibition established in animal experiments. ACE's implementation of associative conditioning as an extension of nonassociative phenomena also bears some general similarity to the suggested relationship between the mechanisms underlying presynaptic facilitation and classical associative conditioning in Aplysia (Hawkins et al., 1983; Hawkins & Kandel, 1984).
If positive MTM values were simply gated by reinforcement to produce increases in ALTM, then ALTM value would increase during each acquisition session of every acquisition-extinction pair of sessions (as discussed further below) without ever reaching a maximum limit. Moreover, increases in ALTM enable larger increases in STM, which produce larger MTM values, which in turn enable progressively larger increases in ALTM. Consequently, a special effort needs to be made to contend with the positive feedback relationship that exists between increases in ALTM and increases in STM. As illustrated in Figure 8 (between MTM and ALTM), a special negative feedback loop reduces the extent of the increase in ALTM as the reinforcement signal (which is modulated by ALTM) becomes larger, when MTM value is positive. Although it might be simpler to directly use the difference between some nominal asymptotic ALTM limit and ALTM value to limit ALTM's growth, the observation that a stronger US increases resistance to extinction (i.e., decreases ALTM) of conditioned suppression after all subjects have reached asymptotic levels of suppression in acquisition (Annau & Kamin, 1961) suggests that the reinforcement signal should be used for this purpose. When the reinforcement signal is of unity amplitude, no additional increase in ALTM is possible. If the reinforcement signal exceeds unity amplitude, a negative increase (i.e., a decrease) in ALTM results, because the signal that normally acts to increase ALTM now attains a negative value. That this condition is permitted is indicated in Figure 8 by the "+/-" label on the signal as it contacts the ALTM block. This additional capacity to decrease ALTM level back down to a new lower maximum asymptotic level permits the maintenance of appropriate maximum ALTM levels, even if arbitrary changes in the effectiveness of reinforcement occur. Furthermore, that this can occur when only positive MTM values are present, which for example is the case during acquisition, ensures that appropriate adjustment of the maximum ALTM level will occur even during conditions that normally only increase ALTM level.
The CSTC shown in Figure 10 itself comprises four specialised memories, and has sufficient complexity to accurately model the mean topography of the rabbit's nictitating membrane response, and the way in which it changes when subject to variations in CS duration and amplitude (Rogers, 1993a).
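The negative feedback that limits ALTM growth, described earlier in this section, can be illustrated with a toy update rule. The form below is purely hypothetical and is not the equation given in the Appendix: it assumes that the reinforcement signal is the raw reinforcement multiplied by ALTM, that the increase in ALTM is proportional to the positive part of MTM and to that signal, and that a factor (1 - signal) supplies the feedback. The names `altm_step`, `settle`, and `gain` are illustrative.

```python
def altm_step(altm, mtm, raw_reinforcement, gain=0.01):
    """Hypothetical ALTM update showing the unity-amplitude limit.

    signal: reinforcement as modulated by ALTM (assumed to be a simple product).
    The (1 - signal) factor is zero at unity amplitude and negative above it,
    so ALTM stops growing at unity and is pulled back down beyond it.
    """
    signal = altm * raw_reinforcement
    return altm + gain * max(mtm, 0.0) * signal * (1.0 - signal)


def settle(altm, raw_reinforcement, mtm=0.5, steps=20_000):
    """Iterate the update with a fixed positive MTM value."""
    for _ in range(steps):
        altm = altm_step(altm, mtm, raw_reinforcement)
    return altm


# Unit-strength reinforcement: ALTM rises towards a ceiling.
# Doubled reinforcement: ALTM falls to a new, lower ceiling.
print(settle(0.2, 1.0), settle(1.0, 2.0))
```

Under these assumptions a stronger reinforcer yields a lower asymptotic ALTM value, which is the direction of effect the text attributes to the Annau and Kamin (1961) observation.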
Space limitations preclude any internal explanation of the CSTC here, and such details are presented in Rogers (1993a). The trace of prior CS input activation generated by the CSTC is employed to support the acquisition of a CR that is timed to peak approximately at the time reinforcement is expected, the production of anticipatory CRs, and the implementation of trace conditioning.
7. CONDITIONS FOR COMPUTER SIMULATION OF ACE
The mechanistic detail of ACE schematically depicted in Figure 10 is fully specified by the set of differential equations and the values of the associated constants provided in the Appendix. All subsequently presented computer simulation results in this article were obtained using only this single set of equations, with all associated constants remaining unchanged. Furthermore, eqns (6)-(9) presented above, and the values of the associated constants employed in Figure 9, were directly incorporated into the set of equations provided in the Appendix. Consequently, the computer simulation results presented in Figure 9 remain valid for the NMMM when it is employed within ACE.
All the results presented here were generated using CS and US inputs of unity amplitude and 50-ms duration, primarily to facilitate comparison between computer simulation results. Although the absolute amplitude of these neuronal inputs cannot be directly related to particular empirical stimulus intensities, the 50-ms duration of the inputs directly equates with that frequently employed in many empirical procedures using the NMR preparation (e.g., Smith, 1968; Smith, Coleman, & Gormezano, 1969; Gormezano, Kehoe, & Marshall, 1983; Gormezano, 1984). Because the ISI for all results generated here is greater than the 50-ms CS duration employed, all conditioning is of the trace type, rather than the other main alternative, delayed conditioning. Trace conditioning is commonly employed in experiments using the NMR preparation (Gormezano et al., 1983), and highlights the role played by the CSTC in generating a sustained stimulus trace within ACE, because it is this stimulus trace (the CST signal) that is associated directly with the US and not the CS input signal.
The ITI used throughout this paper was fixed at 10 s, which is substantially less than the mean 60 s ITI frequently employed in NMR preparations (e.g., Smith, 1968; Smith et al., 1969; Gibbs, Latham, & Gormezano, 1978). The 10-s ITI is adequately long for the generation of computer simulation results using ACE as it is well beyond the maximum ISI capable of producing significant simple first-order conditioning in either ACE (Rogers, 1991, Chap. 8) or the NMR preparation (Schneiderman & Gormezano, 1964; Schneiderman, 1966; Meredith & Schneiderman, 1967; Vandercar & Schneiderman, 1967; Smith, 1968; Smith et al., 1969; Gormezano, 1972; Schneiderman, 1972; Kehoe, 1979; Gormezano, 1984). Gormezano (1984) notes that "over a broad range of parameters, our NMR preparation has consistently revealed a tightly bound range of CS-UCS intervals for conditioning with maxima between 200 and 400 msec and levels of responding declining to negligible levels as the ISI approaches 3-4 sec" (p. 14). Although CS1-CS2-US serial compound conditioning of the NMR has been effective over CS1-US intervals in the order of 18 s (e.g., Kehoe et al., 1979; Kehoe, Marshall-Goodell, & Gormezano, 1987), the CS2-US interval is usually maintained within the optimal 200-400-ms range of the single CS contiguity gradient, and the CS1-US conditioning over longer intervals is apparently mediated by higher-order or sensory conditioning (Kehoe et al., 1979), which are outside the scope of the temporal aspects of the first-order conditioning behaviour addressed here.
The 10-s ITI is also long enough to enable the short-term changes within ACE to approach their new steady-state values by the beginning of each current trial, after having been disrupted by the previous trial. It is important to note that the spontaneous medium-term changes in associative STM, MTM, LTM, and MGM require several minutes to approach completion (Figure 9), and that using the relatively short ITI of 10 s accentuates the medium-term effects produced by these cumulative quantities. This is significant because these medium-term changes are employed to produce appropriate changes in the new adaptive associability within the NMMM, which is a major feature of ACE. Difference equation versions of all system differential equations are updated at what corresponds to every 10 ms (100 times per second) to provide sufficient temporal resolution for the most rapid rates of change that occur within ACE, and each training session consists of 100 or more trials. Also, concerning the decision to employ the relatively brief ITI of 10 s rather than the more usual 60 s, the apparent lack of relevant NMR retention data led to the decision to use data from honeybees (Figure 7) to calibrate the spontaneous medium-term rates of change of STM, MTM, LTM, and MGM (Figure 9). Hence, adherence to a 60-s ITI would not necessarily produce a close correspondence between simulated ACE and empirical NMR results. Within the procedures for animal experiments it is common to introduce a randomly sequenced variation in ITI about its mean value, and to pseudorandomly sequence reinforced (R) and nonreinforced (N) trials in partial reinforcement (PR) schedules (e.g., Gibbs, Latham, & Gormezano, 1978).
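The timing conditions of this section can be laid out as input sequences for a fixed-step simulation. The sketch below is illustrative only and is not the article's development environment. It takes the 10-ms update interval, 50-ms unity-amplitude CS and US inputs, 10-s ITI, and 100-trial session from the text. Two points are assumptions: the ITI is measured from CS onset to CS onset, and the ISI of 250 ms is a hypothetical value chosen only because it exceeds the CS duration, giving trace conditioning. It also records, for each trial, the index one time step before CS onset, the point at which multi-trial results are sampled.

```python
DT_MS = 10  # update interval stated in the text (100 updates per second)


def build_session(n_trials=100, iti_ms=10_000, stim_ms=50, isi_ms=250):
    """Return CS and US input sequences and the per-trial sampling indices.

    isi_ms (CS onset to US onset) is a HYPOTHETICAL value; the ITI is
    ASSUMED to run from one CS onset to the next.
    """
    steps_per_trial = iti_ms // DT_MS   # 1000 steps per 10-s trial
    stim_steps = stim_ms // DT_MS       # 5 steps of unity amplitude
    isi_steps = isi_ms // DT_MS         # 25 steps from CS onset to US onset
    lead_in = 1                         # one step so the first trial can be sampled
    total = lead_in + n_trials * steps_per_trial
    cs = [0.0] * total
    us = [0.0] * total
    sample_at = []
    for trial in range(n_trials):
        onset = lead_in + trial * steps_per_trial
        sample_at.append(onset - 1)     # one time step (10 ms) before CS onset
        for k in range(stim_steps):
            cs[onset + k] = 1.0
            us[onset + isi_steps + k] = 1.0
    return cs, us, sample_at


cs, us, sample_at = build_session()
print(len(cs), sample_at[:3])
```

A partial reinforcement schedule could be obtained from the same layout by leaving the US sequence at zero on nonreinforced trials.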
However, because ACE is being tested in isolation, and it alone has no capability to acquire or produce an output response from the current trial in expectation of the outcome in the next trial when the ITI is more than a few seconds, it was not considered necessary to introduce either type of random variation in procedure for the production of the computer simulation results. Those results presented herein that were conducted over many successive training trials are sampled one time state (10 ms) before CS onset at the beginning of each trial, to more clearly display overall trends and to indicate the cumulative quantity values as they will affect performance in the next trial. However, it should be noted that in some cases a significant variation in the amplitude of a cumulative quantity occurs within the early stages of each trial, even when the overall trend is flat and smooth. A special development environment was written in Microsoft C language to facilitate sequencing of four CS inputs, the single US input, the selective display and logging of the 4 × 10 + 1 = 41 cumulative quantities and the 4 × 7 + 4 = 32 signals, and management of the 25 system constants, associated with the simulation of an individual ACE with four CS input channels (only one of which is required for generation of the results presented in this article). The system difference equations were also implemented in C using floating point arithmetic, with floating point variables used for all system signals, variables, and constants.

8. ACQUISITION CURVE SHAPES

Psychologists developing models of conditioning sometimes examine the course of acquisition when seeking clues as to the nature of the mechanisms underlying associative learning. Although different response systems and response measurement techniques produce marked variations in rate, and to a lesser extent course of acquisition (Mackintosh, 1974, p. 10), researchers typically model either a negatively accelerated acquisition curve shape (e.g., Rescorla & Wagner, 1972; Barto & Sutton, 1985), or a sigmoidal curve, with an initial acceleration phase followed by deceleration to an asymptotic maximum level of performance (e.g., Klopf, 1987). [Pearce and Hall (1980) model a negatively accelerated acquisition curve, but suggest that this may be translated into a sigmoidal strength of response performance curve by the action of some threshold that associative strength must exceed.] However, it is not uncommon for some individual subjects within the same experimental procedure to exhibit sigmoidal acquisition while other individuals do not. Figure 11 shows the averaged acquisition curves in eye-blink experiments for three groups of human subjects with similar general performance levels, referred to by Spence (1956, p. 60) as homogeneous groups. Spence (1956, p. 64) states that:

Examination of a fairly large number of such "homogeneous" frequency curves of eyelid conditioning obtained in our laboratory has shown that when the performance level of the subjects at the beginning of conditioning is relatively low and/or the rate of conditioning is slow a curve with an initial period of positive acceleration is invariably obtained. On the other hand, if the performance level is relatively high at the start of conditioning and/or the rate of conditioning is high the curve is typically negatively accelerated throughout, especially if the grouping of trials is coarse.

Although Figure 11 indicates that different individuals may or may not exhibit an initial phase of accelerated acquisition, it does not reveal whether this difference in learning performance is in some way inherent in each individual, or whether it is affected by some particular aspect of an individual's previous experience. Kremer (1971) found that although a nonpreexposed control group exhibited a negatively accelerated acquisition curve, a CS-alone preexposed (i.e., latent inhibition) group exhibited an initial phase of positive acceleration. Kremer's (1971) results suggest that previous experience is capable of determining the course of subsequent acquisition, and not just its extent or rate of change. Although these aspects are beginning to be
addressed by some computational models (e.g., Kehoe, 1988), the NMMM addresses them in a new and more comprehensive way.

FIGURE 11. Frequency curves of classical eyelid conditioning for three groups of homogeneous human subjects. The uppermost curve is that for subjects exhibiting high conditioning performance both early and late in training, the middle curve for subjects exhibiting average performance, and the lowest curve for subjects exhibiting poor performance. From Spence (1956, p. 66), which was based on data from Spence & Taylor (1951). Reprinted by permission. [Horizontal axis: blocks of ten trials.]

Figure 12a illustrates the progressive change in STM within ACE's NMMM during massed excitatory acquisition over 100 training trials, for three different initial ALTM values. The acquisition curve shape for ACE is strictly negatively accelerated, mildly sigmoidal, and strongly sigmoidal when the initial ALTM value is high, medium, and low, respectively. These three curves compare favourably with those reproduced in Figure 11 from Spence (1956, p. 66), whereas the relative levels of associability relate directly to Spence's (1956, p. 64) descriptive account of rates of conditioning quoted above. Furthermore, because the initial ALTM level is dependent upon previous experience in a manner consistent with latent inhibition and learned irrelevance procedures (as will be discussed below), the results depicted in Figure 12a are also consistent with the results obtained by Kremer (1971) when allowance is made for the different dependent variables used by Kremer. The means by which ACE produces the different STM acquisition curve shapes illustrated in Figure 12a may be explained as follows. Each CS presentation activates the CS input to the single CS input channel used here, producing a CS trace (CST) signal that, with a CSTC rate of change (ROC) setting of 0.1, progressively increases before peaking approximately 300 ms after CS onset and then gradually declining.
Rogers (1993a) provides a complete explanation of CSTC operation. After some amplitude compression and rate-of-increase limiting by reinforcement-gating (short-term) memory (RGM), this trace signal enables the subsequent US presentation 250 ms after CS onset to produce a significant increase in STM value. The trace signal also increases MGM, which enables the induced increase in STM to be gradually transferred to LTM through MTM. Because of its slow rate of increase, MTM attains progressively larger positive values with each of the early training trials. These progressively larger MTM values are also used to increase ALTM towards its adaptively regulated maximum value, upon each reinforcing US presentation. If ALTM is already at its maximum attainable level at the beginning of acquisition (Figure 12, heavy shading), then acquisition proceeds at a fast rate from the very first trial, but ALTM cannot be increased further as acquisition proceeds. Because ACE implements a modified intratrial version of the Rescorla-Wagner model (Rescorla & Wagner, 1972), and because ALTM level remains essentially fixed, a negatively accelerated acquisition curve is produced as the difference between expected and delivered reinforcement diminishes with ongoing acquisition training. The negatively accelerating acquisition curve (Figure 12a, heavy shading) is not quite the exponential shape produced by the Rescorla-Wagner model (1972), due to the stabilising influence of LTM upon STM during acquisition within the NMMM. If, in contrast, ALTM is initially low (Figure 12, light shading), then there exists considerable scope for subsequent increases in ALTM, but the rate of acquisition of an excitatory CS will initially be very slow. As MTM value gradually increases during the early stages of acquisition, it is able to produce progressively larger increases in ALTM. The very frequent and effective delivery of reinforcement in the massed excitatory acquisition training repeatedly gates the increasingly large positive MTM values through to increase ALTM.
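The contrast drawn above between a fixed and a growing associability can be sketched at the level of whole trials. The following is only an illustration under stated assumptions, not ACE's intratrial difference equations (which are not given in this section): it uses the standard trial-level Rescorla-Wagner increment, alpha * beta * (lambda - V), and replaces the MTM-gated growth of ALTM with a hypothetical rule that moves the learning rate a fixed fraction of the way towards a ceiling on every reinforced trial. The names acquisition_curve, growth, beta, and alpha_max are invented for the sketch; the starting values 0.72 and 0.03 are the highest and lowest initial ALTM levels quoted for Figure 12.

```python
def acquisition_curve(alpha0, growth=0.0, n_trials=100,
                      beta=0.1, lam=1.0, alpha_max=0.72):
    """Associative strength before each trial of continuous reinforcement.

    alpha0    -- initial associability (stand-in for the initial ALTM level)
    growth    -- ASSUMED fraction by which associability closes the gap to
                 alpha_max on each reinforced trial (0.0 keeps it fixed)
    beta, lam -- Rescorla-Wagner US rate parameter and asymptote
    """
    v, alpha, curve = 0.0, alpha0, []
    for _ in range(n_trials):
        curve.append(v)
        v += alpha * beta * (lam - v)          # Rescorla-Wagner increment
        alpha += growth * (alpha_max - alpha)  # hypothetical associability growth
    return curve


def increments(curve):
    """Trial-to-trial gains in associative strength."""
    return [b - a for a, b in zip(curve, curve[1:])]


# High, fixed associability: gains shrink from the first trial onwards.
fixed_high = increments(acquisition_curve(0.72))

# Low initial associability that grows with reinforcement: gains first
# rise, then fall, giving a sigmoidal acquisition curve.
growing_low = increments(acquisition_curve(0.03, growth=0.1))
peak_trial = growing_low.index(max(growing_low))
```

In this sketch the largest gain for the fixed case occurs on the first trial, whereas for the growing case it occurs some trials into training, which is the qualitative difference between the heavy and light shaded curves described in the text.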
This gating enables the strictly negatively accelerated acquisition curve otherwise produced (Figure 12a, heavy shading) to be supplemented by an initial phase of slow learning that then accelerates, with the result that a sigmoidal-shaped acquisition curve (Figure 12a, light shading) similar to that commonly empirically observed (e.g., Spence, 1956, Chap. 3) is obtained. Note, however, that for the NMMM to produce a sigmoidal acquisition curve, the ITI in acquisition needs to be short enough to produce an increase in ALTM of sufficient amplitude and speed to produce the initial phase of accelerating rate of acquisition, and the initial ALTM value needs to be sufficiently small. When the associative strength of the CS attains the level supported by the continuous reinforcement, further sustained STM and LTM increase is not possible. MTM value then declines and prevents further increases in ALTM. Hence, the extent to which the associability of a CS can be increased during its massed acquisition may be less than that required to make ALTM approach its adaptively regulated maximum
value, particularly when long ITIs are employed. The ALTM curves of Figure 12b have no such trouble attaining their full maximum value, largely because of the relatively short 10-s ITI employed here. If the initial associability of all CSs is set to a low level, then this will help to ensure that other CSs known to correlate well with reinforcement have a relative advantage in acquisition, and that any coincidental pairings of reinforcement with a naive CS lead only to a modest degree of conditioning. Only when a firm correlation between CS and US has been established by previous experience is the acquisition of associative strength allowed to proceed more rapidly.

FIGURE 12. Massed excitatory acquisition of one CS input channel over 100 training trials at an ISI of 250 ms, when the initial ALTM level = 0.72 (heavy shading), 0.10 (medium shading), and 0.03 (light shading). (a) STM levels, all initially at 0.0 value. (b) Corresponding ALTM levels, starting at the three different initial values.

9. THE PARTIAL REINFORCEMENT EFFECT

The most appropriate learning rate for a randomly sequenced schedule of partial reinforcement, in which the type of the current trial (R or N) in no way predicts the type of the next trial, would be a relatively slow
learning rate. This is because it would enable the subject to integrate the outcomes of many trials, and then produce a consistent CR with a strength and/or probability being some monotonically increasing function of the probability of reinforcement. With changes in ALTM enabled by the reinforcing effects of every R trial and the direction and extent of ALTM change determined by those trials preceding each R trial, particularly when shorter ITI's are employed (Grosslight & Radlow, 1956, 1957; Spivey & Hess, 1968; Capaldi & Kassover, 1970; Mackintosh & Little, 1970), the NMMM's adaptive associability mechanism (Figure 8) will (as demonstrated below) produce a much lower ALTM value for this type of random schedule than for the continuously reinforced schedule used to produce Figure 12. The extent of the difference in ALTM values resulting from such a random, and the previous continuous, reinforcement schedule is dependent upon MTM levels being essentially independent of each currently experienced R trial, so that an N-R sequence tends to result in a distinct
decrease in ALTM, while an R-R sequence tends to produce a distinct increase. The random reinforcement schedule is commonly referred to as a partial reinforcement (PR) schedule, because only some of the acquisition trials are reinforced. It is well established that a PR schedule may substantially reduce the rate of subsequent extinction, so that it is markedly slower than that following continuously reinforced acquisition (Mackintosh, 1974, pp. 434-467). This effect is known as the partial reinforcement effect (PRE), and will also result from operation of the NMMM's adaptive associability mechanism because the reduced associability level affects both extinction and acquisition learning rates (in fact, all experience-induced changes in STM value). Empirical results are reproduced below from two experiments conducted by Gibbs et al. (1978). These experiments assess the effects of PR schedules on the maintenance and resistance to extinction of the rabbits' NMR. Both experiments consisted of three stages. In stage I all subjects received continuously reinforced acquisition training and achieved a high level of responding. Stage II consisted of different types of PR schedules for each group of subjects, and revealed the extent to which responding was maintained under each schedule. Stage III consisted entirely of CS-alone extinction training for all subjects, and revealed resistance to extinction and the extent of any PRE. In Experiment 1, six groups of subjects were employed. In stage II of Experiment 1, groups were subjected to a PR schedule with a probability of reinforcement of 100%, 50%, 25%, 15%, 5%, or 0%. Gibbs et al. (1978) refer to each group using a "C", followed by the corresponding percentage probability of reinforcement for that group. The mean percentage of CRs for each group over all three stages of Experiment 1 is reproduced in Figure 13.
After reaching comparably high levels of responding in stage I, the performance maintained by each group in stage II was a monotonically increasing function of the probability of reinforcement. For groups C 100, C 50, C 25, and even C 15, the level of responding appears to drop rapidly by a small amount within the first block of trials, and then remain relatively constant for the rest of stage II. Although group C 5 shows a definite progressive decline in responding, it is markedly better than that of the extinction control group C 0. In stage III groups C 50 and C 25 produce the highest response performance, and as they are higher than the C 100 group, this indicates the presence of a substantial PRE.

FIGURE 13. Percentage CRs as a function of 100-trial blocks of training in stages I, II, and III. Stage I is a fully reinforced acquisition period. Stage II consists of partial reinforcement schedules of 100% (C 100), 50% (C 50), 25% (C 25), 15% (C 15), 5% (C 5), and 0% (C 0). Stage III is a period of extinction training. From Gibbs et al. (1978, Experiment 1). Reprinted by permission.

Computer simulation results of ACE's operation when subject to a similar three-stage procedure to that employed by Gibbs et al. (1978) in Experiment 1 are shown in Figure 14. Although it should be noted that there exist differences in experimental procedure and response measurement, the results illustrated in Figure 14a compare favourably with the empirical results reproduced in Figure 13, concerning the main observations described above. After starting from the same conditions following continuously reinforced acquisition that is analogous to stage I, the response strength of a single CS input channel when subject to similar PR schedules exhibits a similar rapid decline to a marginally lower level, and then a more gradual decline throughout the partially reinforced stage that is analogous to stage II. In the extinction stage (analogous to stage III), all PR schedules, and in particular the 50% and 25% ones, soon produce stronger responses than the fully reinforced group, illustrating a strong PRE.

The graph of ALTM values for a single CS input channel separately trained on each PR schedule shown in Figure 14b provides an additional insight into the means by which ACE produces a PRE. As indicated in Figure 10, changes in ALTM are gated by the enable associability change (EAC) signal, which only attains positive nonzero values when US presentation occurs while the CST signal is still above zero following CS presentation. In other words, EAC only enables changes in ALTM when a CS is reinforced by subsequent US presentation. The ALTM level of the completely nonreinforced CS input channel, labelled 0% in Figure 14b, therefore remains fixed at the elevated level produced by prior continuously reinforced massed excitatory acquisition, even though the declining STM values produce substantially negative MTM values. However, large negative MTM values are able to substantially reduce the ALTM levels of a CS input channel that is partially reinforced each time it experiences a reinforced trial. The extent of the decline in ALTM value increases with the number of preceding nonreinforced trials. Hence, even though a CS input
channel receiving a lower probability of reinforcement experiences fewer reinforced trials, the extent of the decrease in ALTM occurring with each reinforced trial tends to be greater. Consequently, the partially reinforced schedules tend to produce similarly low ALTM values during the PR stage, as illustrated in Figure 14. In the final extinction stage, all ALTM levels remain unaltered due to the complete absence of reinforcing US presentations, which ensures that the resulting PRE remains effective throughout extinction. Subsequent reacquisition of the partially reinforced CS would therefore also be expected to occur at a reduced rate, a prediction for which empirical evidence may readily be sought.

FIGURE 14. Massed training of one CS input channel, previously continuously reinforced for 400 trials, then subject to each of six different partial reinforcement schedules over 100 trials, then 100 extinction trials. Partial reinforcement schedules of 100%, 50%, 25%, 10%, 5%, and 0% were used. ROC = 0.1, ISI = 250 ms. (a) STM levels. (b) ALTM levels.

With all PR schedules producing similarly low ALTM values, the extinction rates they produce in Figure 14a are so much slower than that following continuous reinforcement that it is difficult to tell which probability of reinforcement produces the greatest resistance to extinction, particularly because they begin
at different rates after the PR stage. Figure 14b is of some assistance in this regard, because it clearly indicates that the 25% PR schedule produces the lowest ALTM value, and the 50%, 10%, and 5% PR schedules produce progressively higher ALTM values. This compares well with the 25%, 50%, 15%, and 5% order obtained by Gibbs et al. (1978, Fig. 1) when their groups are sorted in order of decreasing average percentage of CRs produced over the entire extinction stage of Experiment 1. The sharp transitions in each STM curve of Figure 14a are produced by the localised temporal effect of R and N trials. In contrast, Gibbs et al. (1978) found no evidence of such temporally localised reinforcement-dependent changes in response frequency for each individual subject. Their empirical results suggest that the direct relationship between STM and response strength used within ACE may need to be supplemented by adding new cumulative quantities to temporally buffer the effect of changes to STM upon response performance. However, this possibility has not yet been explored.

The apparent absence of decrements in CR maintenance that are able to be localised to the immediate effects of reinforcement in Experiment 1 led Gibbs et al. (1978) to consider more generalised effects of nonreinforcement. More specifically, they considered stimulus generalisation hypotheses, for which they cited Sheffield (1949) and Capaldi (1966) as examples. With the intention of assessing the possible implications of such stimulus generalisation hypotheses, they devised and conducted a second experiment in which a group of subjects received progressively lower probabilities of reinforcement in stage II, thus avoiding the abrupt transition between continuous and partial reinforcement schedules experienced in Experiment 1 between stages I and II. In Experiment 2, Gibbs et al. (1978) employed three groups of subjects. Group C (continuous reinforcement) was almost continuously reinforced with 95% probability in stage II, and group E (extinction) received CS-alone trials only. Group P (partial reinforcement) was reinforced with probabilities of 85%, 75%, 65%, 55%, 45%, 35%, 25%, 15%, and then 5%, for successive sessions of 60 trials in stage II. The mean percentages of CRs for each group are reproduced in Figure 15. The response probability of group P remained remarkably high during stage II, being indistinguishably close to that of group C until the last third of stage II, as the probability of reinforcement approached the low value of 5%. In stage III, in which all three groups received CS-alone extinction trials, the response performance of group P converged and overlapped with that of group C while still at moderate performance levels. Because the response probability of group C fell from a distinctly higher level at the end of stage II than did that of group P, this result also suggests a significant PRE, though Mackintosh (1974, p. 73) points out that only a clear crossover of performance levels would unequivocally demonstrate a PRE in such a case.

A three-stage procedure similar to that employed by Gibbs et al. (1978) in Experiment 2 was devised for ACE, for which computer simulation results are shown in Figure 16. The specific sequence of R and N trials used in the PR stage (stage II) of this simulation is specified in Table 1. The trend of the new STM curve shown shaded in Figure 16a compares favourably with that of group P in Figure 15 during stage II of each procedure. In stage III, in which CS-alone trials are experienced, the computer simulation results of Figure 16 also reveal the expected PRE. As with the previous experiment (Figures 13 and 14), the computer-simulated PRE in Figure 16 is of greater prominence than that exhibited by the empirical results reproduced in Figure 15. This may result primarily from the use of the relatively short 10-s ITI in generating the computer simulation results, which tends to accentuate changes in associability.
FIGURE 15. Percentage CRs as a function of 2-day blocks of 120 trials in stages I, II, and III. Stage I is a fully reinforced acquisition period. Stage II consists of a progressively reducing partial reinforcement schedule of 85%, 75%, 65%, 55%, 45%, 35%, 25%, 15%, and 5%, for each successive 2-day block of trials (group P). Stage III is a period of extinction training. Also shown are the percentage CRs for 95% (group C), and 0% (group E) reinforcement schedules during stage II. From Gibbs et al. (1978, Experiment 2). Reprinted by permission.
Comparison between Figure 16 and Figure 14 indicates that the progressively declining PR schedule achieves an ALTM level as low as that produced by the fixed 10% PR schedule, but with a response strength at the end of stage II as high as that produced by the fixed 25% PR schedule. In other words, ACE adapts more favourably to a gradually declining PR schedule than to an abrupt change to one with a much lower probability of reinforcement. The extent of the similarity between the trends of the computer simulation results and the empirical results in both Experiments 1 and 2 of Gibbs et al. (1978) indicates that ACE is capable of accounting for most aspects of the PRE empirically observed in stage III, and of the characteristic maintenance of response in stage II. Although it may be possible to interpret the internal operation of ACE in terms of some form of stimulus generalisation hypothesis (e.g., Sheffield, 1949; Capaldi, 1966), ACE provides a more specific mechanistic account of the results produced by Gibbs et al. (1978) than such hypotheses.

10. ALTERNATE ACQUISITION-EXTINCTION SESSIONS

If subjects are trained with alternating sessions of fully reinforced acquisition trials and then extinction trials, then it has been found that the rate of both reacquisition and reextinction progressively increases (e.g., Bullock & Smith, 1953; Gonzalez, Holmes, & Bitterman, 1967; Davenport, 1969). It is suggested here that such training produces an increase in associability, and that this accounts for the resultant acceleration in learning. Under these training conditions, the current type of trial (N or R) indicates with high reliability that the next trial will be of the same type, and intuitively, the most appropriate learning rate would be a relatively
fast one. A subject would then rapidly adjust to each change in trial type, quickly coming to produce a strong CR when R trials are experienced, and quickly extinguishing to produce no CR when N trials are experienced.

FIGURE 16. Massed training of one CS input channel (shaded), previously continuously reinforced for 400 trials, over 100 partial reinforcement trials, then 100 extinction trials, when partial reinforcement is progressively reduced from 80% to 5% as detailed in Table 1. Also shown (unshaded) for comparison are the results from Fig. 14 for fixed 100%, 5%, and 0% PR schedules. ROC = 0.1, ISI = 250 ms. (a) STM levels. (b) ALTM levels.

TABLE 1
Partial Reinforcement Schedule Used in Stage II of the Computer Simulation Results Shown in Figure 16

Reinforcement    Trial Sequence
80%              NRRRRNRRRR
70%              NRRRNRRNRR
60%              NRRNRRNRRN
50%              RNRNRNRNRN
40%              RNNRNNRNNR
30%              NNRNNRNNNR
20%              NNNNRNNNNR
10%              NNNNNNNNNR
5%               NNNNNNNNNN NNNNNNNNNR

R = reinforced trial, N = nonreinforced trial.
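As a consistency check on Table 1, the trial sequences can be encoded directly and their reinforcement fractions recomputed. The short sketch below assumes that the two ten-trial sequences at the end of the table both belong to the 5% row, which is the reading under which stage II totals the 100 partial reinforcement trials stated in the Figure 16 caption; the names TABLE_1 and reinforced_percent are introduced here purely for illustration.

```python
# Table 1 as (nominal reinforcement %, trial sequence); R = reinforced,
# N = nonreinforced. The 5% row is ASSUMED to span twenty trials.
TABLE_1 = [
    (80, "NRRRRNRRRR"),
    (70, "NRRRNRRNRR"),
    (60, "NRRNRRNRRN"),
    (50, "RNRNRNRNRN"),
    (40, "RNNRNNRNNR"),
    (30, "NNRNNRNNNR"),
    (20, "NNNNRNNNNR"),
    (10, "NNNNNNNNNR"),
    (5, "NNNNNNNNNN" "NNNNNNNNNR"),
]


def reinforced_percent(sequence):
    """Percentage of trials in the sequence that are reinforced (R)."""
    return 100.0 * sequence.count("R") / len(sequence)


# Rows whose actual fraction of R trials differs from the nominal value.
mismatches = [(nominal, seq) for nominal, seq in TABLE_1
              if reinforced_percent(seq) != nominal]

# The whole stage II schedule as one string of trial types.
stage_two = "".join(seq for _, seq in TABLE_1)
total_trials = len(stage_two)
total_reinforced = stage_two.count("R")
```

Under that reading every row matches its nominal percentage, and the schedule comprises 100 trials of which 37 are reinforced.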
During an acquisition session of successive R trials, MTM contents tend to attain large positive values, by the same mechanism that increases MTM following a single R trial, the effects of which are simulated in Figure 9a. Furthermore, as indicated in Figure 10, the reinforcing effect of each R trial enables changes in ALTM to occur, the direction and extent of which are partially determined by MTM level. Hence, during excitatory acquisition, the large positive MTM values lead to increases in ALTM level with each successive R trial, producing an appropriately high level of associability, as illustrated in Figure 12b. During an extinction session of successive N trials, MTM contents are made to decrease and attain large negative values, by the same mechanism that results in negative MTM values from a single N trial, the effects of which are simulated in Figure 9b. However, in contrast to an acquisition session, the lack of reinforcement in an extinction session ensures that the negative MTM values are not able to reduce the value of ALTM, as illustrated in Figures 14b and 16b.
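The gating just described can be illustrated with a deliberately simplified toy rule. This is a sketch under stated assumptions and not the NMMM's actual equations, which are not reproduced in this section: ALTM is assumed to change by a fixed gain times the current MTM value, only on reinforced trials, and to be clipped between zero and a fixed ceiling (the model's maximum is adaptively regulated, so a constant ceiling is a simplification). The MTM values are supplied by hand rather than derived from STM, and the names update_altm, run_session, gain, and altm_max are hypothetical.

```python
def update_altm(altm, mtm, reinforced, gain=0.05, altm_max=0.72):
    """Return associability (ALTM) after one trial under the toy rule.

    The change follows the sign of MTM, but only when the trial is
    reinforced; nonreinforced trials leave ALTM untouched, mirroring
    the role of the EAC gate described in the text.
    """
    if not reinforced:
        return altm  # gate closed: no change on N trials
    return min(max(altm + gain * mtm, 0.0), altm_max)


def run_session(altm, mtm_values, reinforced):
    """Apply the toy rule over a session of same-type trials."""
    for mtm in mtm_values:
        altm = update_altm(altm, mtm, reinforced)
    return altm


start = 0.10
# Acquisition session: positive MTM on every R trial raises ALTM.
after_acquisition = run_session(start, [0.5] * 10, reinforced=True)
# Extinction session: MTM is negative, but without reinforcement ALTM holds.
after_extinction = run_session(after_acquisition, [-0.5] * 10, reinforced=False)
# An R trial arriving while MTM is still negative lowers ALTM.
after_single_r = update_altm(after_extinction, -0.5, reinforced=True)
```

With these illustrative numbers ALTM rises over the acquisition session, is unchanged by the extinction session, and falls only when a reinforced trial coincides with negative MTM.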
Although the ALTM levels approach their adaptively regulated maximum levels in the massed excitatory acquisition sessions presented in Figure 12b, less complete increases in ALTM would be produced if each acquisition session included fewer trials, or if the ITI was greater than the 10-s ITI employed here. Under these circumstances, ALTM level would be progressively increased by each successive acquisition session in alternate acquisition-extinction session training until the maximum ALTM level was approached, and this behaviour would be consistent with empirical results (Bullock & Smith, 1953; Gonzalez et al., 1967; Davenport, 1969). A special case exists at the commencement of each acquisition session that follows an extinction session, where negative MTM values residual from prior extinction training are permitted to reduce ALTM level at the beginning of subsequent acquisition training. However, because MTM has a steady-state value of zero, and its amplitude follows an essentially inverted U-shaped profile after STM is disturbed by experience (Figure 9), this effect can be minimised by inserting an additional delay between the end of extinction sessions and the beginning of acquisition sessions. In support of this behaviour, it has been empirically demonstrated that the learning rate, as measured by the rate of extinction during extinction sessions, increases with increasing difference between the ITI within each session, and the interval between extinction sessions and acquisition sessions (Capaldi, Leonard, & Ksir, 1968). It should be noted that even when no such additional delay is employed, the extent of this decline in ALTM level will normally be less than the extent of the increase in ALTM produced in the acquisition sessions, unless the number of consecutive R trials in the acquisition session is relatively small. Consequently, a large net increase in ACE's associability will normally result from this alternating acquisition-extinction session series. 
The above computer simulation results were produced with the intention of revealing a PRE, but they also reveal an additional role for alternate acquisition-extinction sessions that was not originally considered. Figure 14b reveals that within ACE the ALTM level of a continuously reinforced CS may actually decline, albeit at a relatively slow rate. The absence of any such decline in ALTM in Figure 12b indicates that it does not occur until massed excitatory acquisition is well advanced, or, in other words, until the total integrated expectation of reinforcement is nearly equal to that subsequently delivered. Note that this behaviour is not necessarily at odds with the overtraining reversal effect (Mackintosh, 1974, pp. 602-607), or the overtraining extinction effect (Mackintosh, 1974, pp. 423-427), in which the subsequent learning rate may be increased by increasing the amount of acquisition training. This is because ACE exhibits an inverted U-shaped ALTM curve as massed
excitatory acquisition is continued, with ALTM initially increasing and then slowly decreasing. Precisely when the peak ALTM value is reached, if at all, depends upon many factors such as the ISI, the ITI, the duration and intensity of the CS and the US, previous experience, and the effects of any other stimuli also presented. The mechanism responsible for the slow decline in ALTM is most easily described in terms of the behaviour of STM in a single CS input channel activated by a reinforced excitatory CS as STM acquisition nears completion. Within ACE a periodic minimum in STM level occurs just before US onset because the reinforcing effect of US presentation is more temporally constrained than the extinction effect of CS presentation (Rogers, 1991, Chap. 8). This STM behaviour also produces small negative MTM values at this time, that in turn produce small reductions in ALTM level each time reinforcement is experienced in sustained excitatory acquisition, producing the gradual decline in associability revealed in Figure 14b. A behavioural implication of this is that if a CS is continuously reinforced with extensive massed excitatory training, then the very high STM level attained by the CS with the assistance of an initially rapid increase in ALTM value will eventually tend to become locked into associative memory by a subsequently decreasing ALTM value. The massed training need not consist of a single massive unbroken session, as many brief sessions will produce a similar result, albeit less efficiently. Intuitively, when a CS-US association in STM and LTM has been consistently and extensively reinforced, it is appropriate that it attain some degree of relative permanence. If an unexpected outcome is subsequently experienced following presentation of such a previously reliable excitatory CS, then its low associability will tend to protect its level of excitatory associative strength while other new CSs acquire the expectation of the new deviation in outcome. 
Within ACE, alternate acquisition-extinction training sessions therefore not only help to establish an appropriately high level of associability, they are also required to maintain a high level of associability. The latter role is another prediction for which empirical results may be readily obtained, though its behavioural utility is apparent with or without such empirical support.

11. LATENT INHIBITION AND LEARNED IRRELEVANCE

If a CS is repeatedly presented to a subject without reinforcement before excitatory acquisition training, then the subsequent rate of conditioning of the CS in that subject will be retarded compared to that of other subjects that have not received such CS-alone preexposure. The effect was called "latent inhibition" by Lubow and Moore (1959), who first studied it using excitatory classical conditioning of the leg flexion CR in
sheep and goats. They referred to the effect as latent inhibition because they reasoned that the preexposed CS acquired inhibitory properties, which would retard subsequent excitatory conditioning since inhibition opposes excitation. However, it was later demonstrated that a preexposed CS does not become inhibitory, and that subsequent inhibitory conditioning may also be significantly retarded by such CS-alone preexposure (Rescorla, 1971; Halgren, 1974; Baker & Mackintosh, 1977). Despite this it is now standard to use the somewhat misleading term latent inhibition when referring to this effect. The demonstration of latent inhibition in several different response systems and animal species has established it as a general phenomenon (Carlton & Vogel, 1967; Chacto & Lubow, 1967; Lubow et al., 1968; Anderson et al., 1969; Lubow & Siebert, 1969; Siegel, 1969a,b; Kremer, 1971). Latent inhibition is increasingly effective with increases in the number of nonreinforced CS presentations (Lubow, 1965; May, Tolman, & Schoenfeldt, 1967; Siegel, 1969a), and appears to have the greatest impact during the initial stages of acquisition training (Chacto & Lubow, 1967; Lubow et al., 1968; Siegel, 1969b; James, 1971). If a naive CS were to initially produce some excitatory synaptic output, as a result of generalisation from previously trained excitors or a mildly excitatory initial STM value, then repeated presentation of such an untrained CS would tend to reduce its STM level and make its MTM attain a negative value. Figure 9b indicates that such induced negative values in MTM may be sustained for considerable periods within ACE. Although this in itself would not produce any change in ALTM level, the reinforcing US presentations in subsequent excitatory acquisition training would enable the residual negative levels of MTM to decrease ALTM, and so reduce the level of associability.
This process would be similar to that occurring at the transition between acquisition and extinction sessions in alternate acquisition-extinction session training, except that the extent of the decrease in ALTM would be smaller because of the relatively weak excitatory strength of the untrained CS. The extent of the latent inhibition effect supported within ACE by this process will increase with the number of CS-alone presentations, at least until extinction is virtually complete. The spontaneous recovery also supported by ACE will mean that many CS-alone presentations are required to approach complete extinction. Because the ALTM level controls the rate of both increases and decreases in STM, the latent inhibition effect supported by ACE will retard both subsequent excitatory and inhibitory acquisition. Although making a naive CS less excitatory will tend to facilitate subsequent inhibitory acquisition, this will be overshadowed by the enduring effect of a reduced ALTM level, which remains fixed throughout acquisition within ACE for both Pavlovian and differential inhibitory conditioning
procedures (Rogers, 1991, Chap. 8). Subsequent excitatory acquisition is retarded both by the reduced ALTM level, and to a lesser extent by the reduced STM level of the preexposed CS. However, as both of these quantities are increased dramatically as massed excitatory acquisition is continued (e.g., Figure 12), their ability to retard acquisition is restricted primarily to the beginning of excitatory acquisition training. All of the above characteristics are consistent with the abovementioned empirical observations regarding latent inhibition (Lubow, 1965; Chacto & Lubow, 1967; May et al., 1967; Lubow et al., 1968; Siegel, 1969a,b; James, 1971; Rescorla, 1971; Halgren, 1974; Baker & Mackintosh, 1977). The inverted U behaviour of ACE's MTM over the medium term may also help account for the confusing evidence regarding short-term latent inhibition effects (Mackintosh, 1983, p. 229). As Mackintosh (1974, p. 40) points out, preexposure of a CS without reinforcement (i.e., latent inhibition) is neither the only, nor the most effective technique for retarding subsequent associative conditioning. A learned irrelevance procedure, in which US presentations are randomly correlated with CS presentations, is able to more effectively retard subsequent acquisition than a latent inhibition procedure (Gamzu & Williams, 1971; Kremer, 1971). Mackintosh (1973) confirmed this result, and showed that learned irrelevance is specific to the reinforcer used. Mackintosh (1974, p. 40) states that "animals may specifically learn that a particular CS and UCS are uncorrelated (that the CS predicts no change in the probability of the UCS), and that this learning interferes with the establishment of an association between the two during subsequent conditioning." ACE would support learned irrelevance in a very similar manner to latent inhibition.
The interspersed US presentations in a learned irrelevance procedure would provide an occasional source of reinforcement, when they happen to occur shortly after a CS presentation. Such fortuitous reinforcing US presentations would enable multiple decreases in ALTM before subsequent acquisition. The effect upon ALTM would be similar to that illustrated in Figures 14b and 16b during partial reinforcement, except that the reinstating effect of nonreinforcing US presentations and the reduced excitatory STM level associated with a naive CS would produce smaller decreases in ALTM with each reinforcing US presentation. This additional means of reducing ALTM level would account for the increased effectiveness of learned irrelevance in reducing associability compared to latent inhibition. Also, the associative nature of the NMMM's ALTM is consistent with Mackintosh's (1974, p. 40) quotation cited above regarding learned irrelevance, as each ALTM is specific to a particular CS and a particular US. Finally, in accord with Kremer's (1971) empirical results, latent inhibition and learned irrelevance do not just alter the rate of subsequent acquisition, they also
alter the shape of its course. Figure 12a indicated that as the initial ALTM value at the beginning of acquisition was reduced, the shape of ACE's resulting acquisition curve changes from negatively accelerating to sigmoidal, in accord with Kremer's (1971) results. Thus, ACE would support both the decline in rate and the change in shape of excitatory acquisition that is subsequent to latent inhibition and learned irrelevance procedures.
12. CONCLUSION
It has become accepted practice in the ANN community to use associative (or synaptic) memory as merely a repository for the efficacy of each neural input, that makes no other active contribution to learning or recalled performance. The aim of the research documented herein was to explore the extent to which a more sophisticated multistage associative memory system could appropriately modulate the effect of experience on memory, and the strength of recalled performance as time elapses since the previous trial. The result was the development of the neural multiprocess memory model (NMMM), which is a general-purpose memory system for use in complete ANN elements where it may replace the standard associative LTM normally employed in each element input. This article focuses primarily upon the NMMM as a distinct module and provides computer simulation results of its spontaneous memory retention behaviour when it is functioning in isolation. However, a demonstration of the NMMM's more elaborate learning-rate modulation behaviour is also provided to illustrate the NMMM's response to an actual set of learning rules employed within a host ANN element, the associative conditioning element (ACE) developed by Rogers (1991). The NMMM incorporates synaptic STM and LTM, which interact in a complementary manner to provide rapid adjustment to new environmental contingencies in the short term, and appropriate integration of recent events with all previous experience in the long term.
This is claimed here to be the utility of spontaneous regression and recovery behaviour, and is generated primarily by the way in which STM and LTM interact within the NMMM. The separation between short- and long-term experience offered by STM and LTM is further enhanced by the addition of MTM, which mediates change in LTM. These three memories interact to make STM exhibit U-shaped retention curves, and to support a transfer process from STM to LTM that is consistent with a consolidation theory of memory. When the inverted U-shaped behaviour of MTM is slightly modified by MGM, which is specially incorporated into the NMMM for this purpose, MTM becomes a highly suitable controlling signal for the modification of adaptive associability level, which directly
modulates the associative learning rate, and is retained in ALTM. Presented computer simulation results of the NMMM's operation when it is functioning as an integral component of ACE demonstrated that it supports the following adaptive behaviour. When a naive CS with a low initial level of associability is consistently correlated with reinforcement during massed excitatory acquisition, then its associability (i.e., ALTM level in the NMMM) initially increases, but if training is maintained for long enough then its associability gradually declines again. In other words, with the NMMM an extensively overtrained excitatory CS will eventually attain relative permanence. When the CS is reliably associated with reinforcement sometimes, and reliably associated with nonreinforcement at others, as in alternate massed fully reinforced and nonreinforced acquisition sessions, then its associability may be increased toward and maintained at a maximum level. This means that the NMMM adjusts its ALTM level to enable its recalled associative strength to rapidly adapt to the prevailing contingency, and quickly produce the most appropriate response in the prevailing temporal context. When a CS is inconsistently reinforced then its NMMM associability becomes (or remains) relatively low (producing the PRE), so that long-term averages may be integrated, and responding can continue through intervals of nonreinforcement. This is appropriate because previous experience indicates that some reinforcement may still be intermittently obtained, despite its recent absence. An analysis of the presented computer simulation results indirectly indicates that the new adaptive associability mechanism developed within the NMMM also supports preconditioning procedures in which prior nonreinforced CS presentations (in the case of latent inhibition) or noncontingent CS and US presentations (in the case of learned irrelevance) retard subsequent associative conditioning training between the CS and the US. 
Furthermore, the specific dependencies and behavioural characteristics of the reduced associability that would result from these procedures also compare favourably with empirical results. In conclusion, the NMMM is a psychological multiprocess model of associative memory that unifies a range of empirically observed memory phenomena, and is also a functioning computational system capable of contributing this additional utilitarian behaviour to the ANN element in which it is deployed. The NMMM incorporates a new specific functioning implementation of adaptive associability. The role of the NMMM's adaptive associability behaviour has been extended beyond that of supporting latent inhibition and learned irrelevance, such that it also plays a pivotal role in the selective production of sigmoidal or negatively accelerating acquisition, the PRE, accelerated learning following alternate acquisition-extinction sessions, and an
initial relatively rapid increase in learning rate during massed excitatory acquisition that is then followed by a gradual decline in learning rate if acquisition training is maintained well beyond the point at which maximum associative strength has been attained. The computer simulation results presented indicate that the NMMM, when functioning within ACE, is able to differentiate and appropriately respond to a diverse set of types of interrelationships among its inputs that exist both within each trial, and over many trials. Of particular note is the NMMM's capacity to extract from experience associative associability levels that are employed to support appropriate learning rates. The NMMM has demonstrated that this type of adaptive associability may complement and enhance the adaptive capability of the learning rules that modify the associative strength weightings in an ANN element, and makes this biologically observed behaviour available to ANNs where it may subsequently be applied to biologically relevant problems.
REFERENCES
Anderson, D. C., O'Farrell, T., Formica, R., & Caponigri, V. (1969). Preconditioning CS exposure: Variation in place of conditioning and of presentation. Psychonomic Science, 15, 54-55.
Anderson, J. A., Silverstein, J. W., Ritz, S. A., & Jones, R. S. (1977). Distinctive features, categorical perception, and probability learning: Some applications of a neural model. Psychological Review, 84, 413-451.
Annau, Z. & Kamin, L. J. (1961). The conditioned emotional response as a function of intensity of the US. Journal of Comparative and Physiological Psychology, 54, 428-432.
Ashton, A. B., Bitgood, S. C., & Moore, J. W. (1969). Differential conditioning of the rabbit nictitating membrane response as a function of US shock intensity and duration. Psychonomic Science, 15, 127-218.
Atkinson, R. C. & Shiffrin, R. M. (1968). Human memory: A proposed system and its control processes. In K. W. Spence & J. T.
Spence (Eds.), The psychology of learning and motivation (Vol. 2, pp. 89-195). New York: Academic Press.
Atkinson, R. C. & Shiffrin, R. M. (1971). The control of short-term memory. Scientific American, 225, 82-90.
Baker, A. G. & Mackintosh, N. J. (1977). Excitatory and inhibitory conditioning following uncorrelated presentations of CS and US. Animal Learning & Behavior, 5, 315-319.
Barto, A. G. & Sutton, R. S. (1985). Neural problem solving. In W. B. Levy, J. A. Anderson, & S. Lehmkuhle (Eds.), Synaptic
modification, neuron selectivity, and nervous system organization (pp. 123-152). London: Lawrence Erlbaum Associates.
Broadbent, D. E. (1958). Perception and communication. London: Pergamon.
Broadbent, D. E. (1984). The Maltese cross: A new simplistic model for memory. Behavioral and Brain Sciences, 7, 55-94.
Buonomano, D. V., Baxter, D. A., & Byrne, J. H. (1990). Small networks of empirically derived adaptive elements simulate some higher-order features of classical conditioning. Neural Networks, 3(5), 507-523.
Bullock, D. H. & Smith, W. C. (1953). An effect of repeated conditioning-extinction upon operant strength. Journal of Experimental Psychology, 46, 349-352.
Capaldi, E. J. (1966). Partial reinforcement: A hypothesis of sequential effects. Psychological Review, 73, 459-477.
Capaldi, E. J. & Kassover, K. (1970). Sequence, number of nonrewards, anticipation, and intertrial interval in extinction. Journal of Experimental Psychology, 84, 470-476.
Capaldi, E. J., Leonard, D. W., & Ksir, C. (1968). A reexamination of extinction rate in successive acquisitions and extinctions. Journal of Comparative and Physiological Psychology, 66, 128-132.
Card, H. C. & Moore, W. R. (1990). Silicon models of associative learning in Aplysia. Neural Networks, 3(3), 333-346.
Carew, T. J., Hawkins, R. D., Abrams, T. W., & Kandel, E. R. (1984). A test of Hebb's postulate at identified synapses which mediate classical conditioning in Aplysia. Journal of Neuroscience, 4, 1217-1224.
Carew, T. J., Hawkins, R. D., & Kandel, E. R. (1983). Differential classical conditioning of a defensive withdrawal reflex in Aplysia californica. Science, 219, 397-400.
Carew, T. J., Pinsker, H. M., & Kandel, E. R. (1972). Long term habituation of a defensive withdrawal reflex in Aplysia. Science, 175, 451-454.
Carew, T. J., Walters, E. T., & Kandel, E. R. (1981). Classical conditioning in a simple withdrawal reflex in Aplysia californica. Journal of Neuroscience, 1, 1426-1437.
Carlton, P. L. & Vogel, J. R. (1967). Habituation and conditioning. Journal of Comparative and Physiological Psychology, 63, 348-351.
Chacto, C. & Lubow, R. E. (1967). Classical conditioning and latent inhibition in the white rat. Psychonomic Science, 9, 135-136.
Chen, W. Y., Aranda, L. C., & Luco, J. V. (1970). Learning and long- and short-term memory in cockroaches. Animal Behavior, 18, 725-732.
Davenport, J. W. (1969). Successive acquisitions and extinctions of discrete bar-pressing in monkeys and rats. Psychonomic Science, 16, 242-244.
Desmond, J. E. & Moore, J. W. (1988). Adaptive timing in neural networks: The conditioned response. Biological Cybernetics, 58, 405-415.
Eccles, J. C. (1964). The physiology of synapses. Berlin: Springer.
Eccles, J. C. & McIntyre, A. K. (1953).
The effects of disuse and of activity on mammalian spinal reflexes. Journal of Physiology, 121, 492-516.
Ellson, D. G. (1938). Quantitative studies of the interaction of simple habits. I. Recovery from specific and generalized effects of extinction. Journal of Experimental Psychology, 23, 339-358.
Estes, W. K. (1955). Statistical theory of spontaneous recovery and regression. Psychological Review, 62, 145-154.
Gamzu, E. & Williams, D. R. (1971). Classical conditioning of a complex skeletal response. Science, 171, 923-925.
Gibbs, C. M., Latham, S. B., & Gormezano, I. (1978). Classical conditioning of the rabbit nictitating membrane response: Effects of reinforcement schedule on response maintenance and resistance to extinction. Animal Learning & Behavior, 6(2), 209-215.
Gluck, M. A. & Thompson, R. F. (1987). Modelling the neural substrates of associative learning and memory: A computational approach. Psychological Review, 94, 176-191.
Gonzalez, R. C., Holmes, N. K., & Bitterman, M. E. (1967). Asymptotic resistance to extinction in fish and rat as a function of interpolated retraining. Journal of Comparative and Physiological Psychology, 63, 342-344.
Gormezano, I. (1972). Investigations of defence and reward conditioning in the rabbit. In A. H. Black & W. F. Prokasy (Eds.), Classical conditioning II: Current research and theory (pp. 151-181). New York: Appleton-Century-Crofts.
Gormezano, I. (1984). The study of associative learning with CS-CR paradigms. In D. L. Alkon & J. Farley (Eds.), Primary neural substrates of learning and behavioral change (pp. 5-24). Cambridge: Cambridge University Press.
Gormezano, I., Kehoe, E. J., & Marshall, B. S. (1983). Twenty years of classical conditioning research with the rabbit. Progress in Psychobiology and Physiological Psychology, 10, 197-275.
Grossberg, S. (1988). Nonlinear neural networks: principles, mechanisms, and architectures. Neural Networks, 1(1), 17-61.
Grossberg, S. & Schmajuk, N. A. (1989). Neural dynamics of adaptive timing and temporal discrimination during associative learning. Neural Networks, 2(2), 79-102.
Grosslight, J. H. & Radlow, R. (1956). Patterning effect of the nonreinforcement-reinforcement sequence in a discrimination situation. Journal of Comparative and Physiological Psychology, 49, 542-546.
Grosslight, J. H. & Radlow, R. (1957). Patterning of nonreinforcement-reinforcement sequence involving a single nonreinforced trial. Journal of Comparative and Physiological Psychology, 50, 23-25.
Halgren, C. R. (1974). Latent inhibition in rats: associative or nonassociative? Journal of Comparative and Physiological Psychology, 86, 74-78.
Halstead, W. C. & Rucker, W. B. (1968). Memory: A molecular maze. Psychology Today, 2(1), 38-41, 66-67.
Hawkins, R. D. (1981). Interneurons involved in mediation and modulation of gill-withdrawal reflex in Aplysia. III. Identified facilitating neurons increase Ca2+ current in sensory neurons. Journal of Neurophysiology, 45, 327-339.
Hawkins, R. D., Abrams, T. W., Carew, T. J., & Kandel, E. R. (1983). A cellular mechanism of classical conditioning in Aplysia: Activity-dependent amplification of presynaptic facilitation. Science, 219, 400-404.
Hawkins, R. D., Castellucci, V. F., & Kandel, E. R. (1981). Interneurons involved in mediation and modulation of gill-withdrawal reflex in Aplysia. I. Identification and characterisation. Journal of Neurophysiology, 45, 304-314.
Hawkins, R. D. & Kandel, E. R. (1984). Is there a cell-biological alphabet for simple forms of learning? Psychological Review, 91, 376-391.
Hull, C. L. (1943). Principles of behavior. New York: Appleton-Century-Crofts.
James, J. P. (1971). Latent inhibition and the preconditioning-conditioning interval. Psychonomic Science, 24, 97-98.
James, W. A. (1981).
The principles of psychology (Vols. 1 & 2). Cambridge, MA: Harvard University Press. (Original work published 1890.)
John, E. R. (1972). Switchboard versus statistical theories of learning and memory. Science, 177, 850-864.
Kamin, L. J. (1957). The retention of an incompletely learned avoidance response. Journal of Comparative and Physiological Psychology, 50, 457-460.
Kamin, L. J. (1963). Retention of an incompletely learned avoidance response: Some further analyses. Journal of Comparative and Physiological Psychology, 56(4), 713-718.
Kandel, E. R. & Schwartz, J. H. (1982). Molecular biology of learning: Modulation of transmitter release. Science, 218, 433-443.
Kehoe, E. J. (1979). The role of CS-US contiguity in classical conditioning of the rabbit's nictitating membrane response to serial stimuli. Learning and Motivation, 10, 23-38.
Kehoe, E. J. (1988). A layered network model of associative learning: Learning to learn and configuration. Psychological Review, 95, 411-433.
Kehoe, E. J., Gibbs, C. M., Garcia, E., & Gormezano, I. (1979). Associative transfer and stimulus selection in classical conditioning of the rabbit's nictitating membrane response to serial compound CSs. Journal of Experimental Psychology: Animal Behavior Processes, 5(1), 1-18.
Kehoe, E. J., Marshall-Goodell, B., & Gormezano, I. (1987). Differential conditioning of the rabbit's nictitating membrane response to serial compound stimuli. Journal of Experimental Psychology: Animal Behavior Processes, 13(1), 17-30.
Klopf, A. H. (1987). A neuronal model of classical conditioning (Tech. Rep. No. AFWAL-TR-87-1139). Wright-Patterson Air Force Base: Air Force Systems Command.
Konorski, J. (1948). Conditioned reflexes and neuron organization. Cambridge: Cambridge University Press.
Kremer, E. F. (1971). Truly random and traditional control procedures in CER conditioning in the rat. Journal of Comparative and Physiological Psychology, 76, 441-448.
Le Doux, J. E. (1990). Information flow from sensation to emotion: Plasticity in the neural computation of stimulus value. In M. Gabriel & J. Moore (Eds.), Learning and computational neuroscience: Foundations of adaptive networks (pp. 3-52). Cambridge: The MIT Press.
Lubow, R. E. (1965). Latent inhibition: Effects of frequency of nonreinforced preexposure of the CS. Journal of Comparative and Physiological Psychology, 60, 454-459.
Lubow, R. E., Markman, R. E., & Allen, J. (1968). Latent inhibition and classical conditioning of the rabbit pinna response. Journal of Comparative and Physiological Psychology, 66, 688-694.
Lubow, R. E. & Moore, A. U. (1959). Latent inhibition: The effect of nonreinforced preexposure to the conditioned stimulus. Journal of Comparative and Physiological Psychology, 52, 415-419.
Lubow, R. E. & Siebert, L. (1969). Latent inhibition within the CER paradigm. Journal of Comparative and Physiological Psychology, 68, 136-138.
Mackintosh, N. J. (1973). Stimulus selection: Learning to ignore stimuli that predict no change in reinforcement. In R. A. Hinde & J. Stevenson-Hinde (Eds.), Constraints on learning (pp. 75-96). London: Academic Press.
Mackintosh, N. J. (1974). The psychology of animal learning. New York: Academic Press.
Mackintosh, N. J. (1975). A theory of attention: Variation in the associability of stimuli with reinforcement. Psychological Review, 82, 276-298.
Mackintosh, N. J. (1983). Conditioning and associative learning. New York: Oxford University Press.
Mackintosh, N. J., & Little, L. (1970). Effects of different patterns of reinforcement on performance under massed or spaced extinction. Psychonomic Science, 20, 1-2.
May, R. B., Tolman, C.
W., & Schoenfeldt, M. G. (1967). Effects of pre-training exposure to the CS on conditioned suppression. Psychonomic Science, 9, 61-62.
McGaugh, J. L. (1966). Time-dependent processes in memory storage. Science, 153, 1351-1358.
McGaugh, J. L. (1969). Facilitation of memory storage processes. In S. Bogoch (Ed.), The future of the brain sciences (pp. 355-370). New York: Plenum Press.
Menzel, R. (1984). Short-term memory in bees. In D. L. Alkon & J. Farley (Eds.), Primary neural substrates of learning and behavioral change (pp. 259-274). Cambridge: Cambridge University Press.
Menzel, R., Erber, J., & Masuhr, T. (1974). Learning and memory in the honeybee. In L. Barton-Browne (Ed.), Experimental analysis of insect behavior (pp. 195-217). New York: Springer-Verlag.
Mercer, A. & Menzel, R. (1982). The effects of biogenic amines on conditioned and unconditioned responses to olfactory stimuli in the honey bee Apis mellifera. Journal of Comparative Physiology, 145, 363-368.
Meredith, L. A. & Schneiderman, N. (1967). Heart rate and nictitating membrane classical discrimination conditioning in rabbits under delay versus trace procedures. Psychonomic Science, 9, 139-140.
Messenger, J. B. (1971). Two stage recovery of a response in Sepia. Nature, 232, 202-203.
Miller, G. A. (1956). The magical number seven plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63, 81-97.
Mote, F. A., Jr. & Finger, F. W. (1943). The retention of a simple running response after varying amounts of reinforcement. Journal of Experimental Psychology, 33, 317-322.
Muller, G. E. & Pilzecker, A. (1900). Experimentelle beitrage zur
lehre von gedachtnis. Zeitschrift fur Psychologie, (Supplement No. 1).
Pavlov, I. P. (1927). Conditioned reflexes. (G. V. Anrep, trans.) London: Oxford University Press.
Pearce, J. M. & Hall, G. (1980). A model for Pavlovian learning: Variations in the effectiveness of conditioned but not of unconditioned stimuli. Psychological Review, 87(6), 532-552.
Pinsker, H. M., Kupfermann, I., Castellucci, V., & Kandel, E. R. (1970). Habituation and dishabituation of the gill-withdrawal reflex in Aplysia. Science, 167, 1740-1742.
Rescorla, R. A. (1971). Summation and retardation tests of latent inhibition. Journal of Comparative and Physiological Psychology, 75, 77-81.
Rescorla, R. A. (1979). Conditioned inhibition and extinction. In A. Dickinson & R. A. Boakes (Eds.), Mechanisms of learning and motivation: A memorial volume to Jerzy Konorski (pp. 83-110). Hillsdale, NJ: Lawrence Erlbaum Associates.
Rescorla, R. A. & Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black & W. F. Prokasy (Eds.), Classical conditioning II: Current research and theory (pp. 64-99). New York: Appleton-Century-Crofts.
Riege, W. H. & Cherkin, A. (1971). One trial learning and biphasic time course of performance in the goldfish. Science, 172, 966-968.
Rogers, B. L. (1991). A new associative conditioning element for consummatory classical conditioning. Unpublished PhD thesis, Swinburne Institute of Technology.
Rogers, B. L. (1993a). New conditioned stimulus trace circuit for the rabbit's nictitating membrane response. Neural Networks, 6(6), 753-769.
Rogers, B. L. (1993b). The associative conditioning element. In P. Leong & M. Jabri (Eds.), Proceedings of the Fourth Australian Conference on Neural Networks (pp. 45-48). Sydney: Sydney University Electrical Engineering.
Sanders, G. D. & Barlow, J. J. (1971).
Variation in retention performance during long term memory formation. Nature, 232, 203-204.
Schneiderman, N. (1966). Interstimulus interval function of the nictitating membrane response in the rabbit under delay versus trace conditioning. Journal of Comparative and Physiological Psychology, 62, 397-402.
Schneiderman, N. (1972). Response system divergencies in aversive classical conditioning. In A. H. Black & W. F. Prokasy (Eds.), Classical conditioning II: Current research and theory (pp. 341-376). New York: Appleton-Century-Crofts.
Schneiderman, N., Fuentes, I., & Gormezano, I. (1962). Acquisition and extinction of the classically conditioned eyelid response in the Albino Rabbit. Science, 136, 650-652.
Schneiderman, N. & Gormezano, I. (1964). Conditioning of the nictitating membrane of the rabbit as a function of CS-US interval. Journal of Comparative and Physiological Psychology, 57, 188-195.
Selverston, A. I. (1988). A consideration of invertebrate central pattern generators as computational data bases. Neural Networks, 1(2), 109-117.
Sheffield, V. F. (1949). Extinction as a function of partial reinforcement and distribution of practice. Journal of Experimental Psychology, 39, 511-526.
Siegel, S. (1969a). Effect of CS habituation on eyelid conditioning. Journal of Comparative and Physiological Psychology, 68, 245-248.
Siegel, S. (1969b). Generalization of latent inhibition. Journal of Comparative and Physiological Psychology, 69, 157-159.
Skinner, B. F. (1938). The behavior of organisms. New York: Appleton-Century-Crofts.
Skinner, B. F. (1950). Are theories of learning necessary? Psychological Review, 57, 193-216.
Smith, M. C. (1968). CS-US interval and US intensity in classical conditioning of the rabbit's nictitating membrane response. Journal of Comparative and Physiological Psychology, 66, 679-687.
Smith, M. C., Coleman, S. R., & Gormezano, I. (1969). Classical conditioning of the rabbit's nictitating membrane response at backward, simultaneous, and forward CS-US intervals. Journal of Comparative and Physiological Psychology, 69, 226-231.
Spear, N. E., Hill, W. F., & O'Sullivan, D. J. (1965). Acquisition and extinction after initial trials without reward. Journal of Experimental Psychology, 69, 25-29.
Spence, K. W. (1956). Behavior theory and conditioning. New Haven: Yale University Press.
Spence, K. W. & Taylor, J. (1951). Anxiety and strength of the UCS as determiners of the amount of eyelid conditioning. Journal of Experimental Psychology, 42, 183-188.
Spivey, J. E. & Hess, D. T. (1968). Effect of partial reinforcement trial sequences on extinction performance. Psychonomic Science, 10, 375-376.
Sutton, R. S. & Barto, A. G. (1981). Toward a modern theory of adaptive networks: Expectation and prediction. Psychological Review, 88, 135-170.
Ungar, G. (1970). Role of proteins and peptides in learning and memory. In G. Ungar (Ed.), Molecular mechanisms in memory and learning (pp. 149-175). New York: Plenum Press.
Vandercar, D. H. & Schneiderman, N. (1967). Interstimulus interval functions in different response systems during classical discrimination conditioning of rabbits. Psychonomic Science, 9, 9-10.
Wagner, A. R. (1978). Expectancies and the priming of STM. In S. H. Hulse, H. Fowler, & W. K. Honig (Eds.), Cognitive processes in animal behavior (pp. 177-209). Hillsdale, NJ: Erlbaum.
Walters, E. T. & Byrne, J. H. (1983). Associative conditioning of single sensory neurons suggests a cellular mechanism for learning. Science, 219, 404-407.
Waugh, N. C. & Norman, D. A. (1965). Primary memory. Psychological Review, 72, 89-104.
Weiskrantz, L. (1970).
A long-term view of short-term memory in psychology. In G. Horn & R. A. Hinde (Eds.), Short-term changes in neural activity and behavior (pp. 63-74). London: Cambridge University Press.
Young, J. Z. (1970). Short and long memories in Octopus and the influence of the vertical lobe system. Journal of Experimental Biology, 52, 385-393.
APPENDIX: ASSOCIATIVE CONDITIONING ELEMENT EQUATIONS
The following equations are reproduced with permission from Rogers (1991), in which they are progressively derived and discussed in detail. [Note that the term USOM here replaces the term RSTM in Rogers (1991).]
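Signals:

CS(n) = conditioned stimulus input to CS input channel n at time t.
US = unconditioned stimulus input to the output stage at time t.
ROC = rate of change input (preset).

CSTC equations:

CS trace signal:
$$\mathrm{CST}(n) = C(n)\,D(n) \tag{A.1}$$

NMMM equations:

Excitatory CS input channel output:
$$\mathrm{EXC}(n) = [\mathrm{STM}(n)]^{+}\,\mathrm{CST}(n) \tag{A.2}$$

Inhibitory CS input channel output:
$$\mathrm{INH}(n) = [-\mathrm{STM}(n)]^{+}\,\mathrm{CST}(n) \tag{A.3}$$

Compressed CST(n) signal:
$$\mathrm{CCST}(n) = \frac{\mathrm{CST}(n)}{\mathrm{CST}(n) + \mathrm{CCSTreg}} \tag{A.4}$$

Enable associability change signal:
$$\mathrm{EAC}(n) = \mathrm{INC}\;\mathrm{CST}(n)\;\mathrm{ALTM}(n) \tag{A.5}$$

Output stage equations:

CS expects US output signal:
$$\mathrm{OUT} = \sum_{n=1}^{N} \mathrm{EXC}(n) - \sum_{n=1}^{N} \mathrm{INH}(n) \tag{A.6}$$

Incremental feedback signal:
$$\mathrm{INC} = [\mathrm{US} - \mathrm{USOM}]^{+} \tag{A.7}$$

Decremental feedback signal:
$$\mathrm{DEC} = \mathrm{OUT} \tag{A.8}$$

Cumulative Quantities:

CSTC equations:

$$\frac{d}{dt}A(n) = \mathrm{Aacc}\,[\mathrm{CS}(n) - A(n)]^{+}\,\mathrm{ROC}\;\mathrm{CS}(n) - \mathrm{Adep}\,[A(n) - \mathrm{CS}(n)]^{+}\,\mathrm{ROC} \tag{A.9}$$

$$\frac{d}{dt}B(n) = \mathrm{Bacc}\,\bigl[[\mathrm{CS}(n) - A(n)]^{+} - B(n)\bigr]^{+} - \mathrm{Bdep}\,\bigl[B(n) - [\mathrm{CS}(n) - A(n)]^{+}\bigr]^{+} \tag{A.10}$$

$$\frac{d}{dt}C(n) = \mathrm{Cchg}\,\bigl(B(n) - C(n) + [D(n) - C(n)]^{+}\bigr)\,\mathrm{ROC} \tag{A.11}$$

$$\frac{d}{dt}D(n) = \mathrm{Dchg}\,\bigl(C(n) - D(n)\bigr)\,\mathrm{ROC} \tag{A.12}$$

NMMM equations:

$$\frac{d}{dt}\mathrm{RGM}(n) = \mathrm{RGMacc}\,[\mathrm{CCST}(n) - \mathrm{RGM}(n)]^{+} - \mathrm{RGMdep}\,[\mathrm{RGM}(n) - \mathrm{CCST}(n)]^{+} \tag{A.13}$$

$$\frac{d}{dt}\mathrm{MGM}(n) = \mathrm{MGMacc}\,[\mathrm{RGM}(n) - \mathrm{MGM}(n)]^{+} - \mathrm{MGMdep}\;\mathrm{MGM}(n) \tag{A.14}$$

$$\frac{d}{dt}\mathrm{STM}(n) = \mathrm{STMacc}\;\mathrm{INC}\;\mathrm{CCST}(n)\,\mathrm{ALTM}(n)\,\mathrm{RGM}(n) - \mathrm{STMdep}\;\mathrm{OUT}\;\mathrm{ALTM}(n) + \mathrm{STMchg}\,\bigl(\mathrm{LTM}(n) - \mathrm{STM}(n)\bigr) + \mathrm{STMsen}\,[\mathrm{LTM}(n) - \mathrm{STM}(n)]^{+}\,\mathrm{INC} \tag{A.15}$$

$$\frac{d}{dt}\mathrm{MTM}(n) = \mathrm{MTMacc}\,[\mathrm{STM}(n) - \mathrm{LTM}(n) - \mathrm{MTM}(n)]^{+}\,\mathrm{MGM}(n) - \mathrm{MTMdep}\,[\mathrm{MTM}(n) + \mathrm{LTM}(n) - \mathrm{STM}(n)]^{+}\,\mathrm{MGM}(n) - \mathrm{MTMchg}\;\mathrm{MTM}(n) \tag{A.16}$$

$$\frac{d}{dt}\mathrm{LTM}(n) = \mathrm{LTMacc}\,[\mathrm{MTM}(n)]^{+}\,\mathrm{MGM}(n) - \mathrm{LTMdep}\,[-\mathrm{MTM}(n)]^{+}\,\mathrm{MGM}(n) \tag{A.17}$$

$$\frac{d}{dt}\mathrm{ALTM}(n) = \mathrm{ALTMacc}\,[\mathrm{MTM}(n)]^{+}\,\bigl(\mathrm{ALTMreg} - \mathrm{EAC}(n)\bigr)\,\mathrm{EAC}(n) - \mathrm{ALTMdep}\,[-\mathrm{MTM}(n)]^{+}\,\mathrm{EAC}(n) \tag{A.18}$$

Output stage equations:

$$\frac{d}{dt}\mathrm{USOM} = \mathrm{USOMacc}\,(\mathrm{US} - \mathrm{USOM}) - \mathrm{USOMdep}\;\mathrm{USOM} \tag{A.19}$$

where:

$$[x]^{+} = \begin{cases} x, & \text{if } x \ge 0, \\ 0, & \text{if } x < 0, \end{cases}$$

(the number of CS input channels N = 4.)

and:

STMacc = 0.032, STMdep = 0.002, STMchg = 0.0001, STMsen = 0.5
MTMacc = 0.0002, MTMdep = 0.0001, MTMchg = 0.000001
LTMacc = 0.0002, LTMdep = 0.0001
ALTMacc = 2.0, ALTMdep = 2.0, ALTMreg = 0.5
RGMacc = 0.2, RGMdep = 1.0
MGMacc = 0.5, MGMdep = 0.00005
USOMacc = 0.30, USOMdep = 0.01
CCSTreg = 0.2
Aacc = 0.5, Adep = 0.5
Bacc = 0.020, Bdep = 0.005
Cchg = 2.0, Dchg = 0.4
0.01 ≤ ROC ≤ 0.50 typically.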
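As an illustration of how the CSTC cumulative quantities in this appendix might be integrated numerically, the following is a minimal sketch that applies a forward Euler step to the A(n), B(n), C(n), and D(n) equations for a single CS input channel and forms the trace signal CST(n) = C(n)D(n). The parameter values are those listed above. The step size `dt = 1.0`, the choice `ROC = 0.1`, the zero initial state, and the names `rect`, `CSTCState`, `cstc_step`, and `run` are assumptions made for this example only; the equations are coded as read from this appendix and are not taken from the simulation code of Rogers (1991).

```python
from dataclasses import dataclass

# Parameter values as listed in the appendix.
AACC, ADEP = 0.5, 0.5
BACC, BDEP = 0.020, 0.005
CCHG, DCHG = 2.0, 0.4


def rect(x: float) -> float:
    """Half-wave rectifier [x]+ : x if x >= 0, otherwise 0."""
    return x if x >= 0.0 else 0.0


@dataclass
class CSTCState:
    """Cumulative CSTC quantities for one CS input channel (assumed to start at 0)."""
    a: float = 0.0
    b: float = 0.0
    c: float = 0.0
    d: float = 0.0

    def cst(self) -> float:
        """CS trace signal, CST(n) = C(n) D(n)."""
        return self.c * self.d


def cstc_step(s: CSTCState, cs: float, roc: float, dt: float = 1.0) -> CSTCState:
    """One forward Euler step; dt is an assumed step size, not from the source."""
    onset = rect(cs - s.a)  # [CS(n) - A(n)]+, shared by the A and B equations
    da = AACC * onset * roc * cs - ADEP * rect(s.a - cs) * roc
    db = BACC * rect(onset - s.b) - BDEP * rect(s.b - onset)
    dc = CCHG * (s.b - s.c + rect(s.d - s.c)) * roc
    dd = DCHG * (s.c - s.d) * roc
    # All derivatives use the old state, then every quantity is updated together.
    return CSTCState(s.a + dt * da, s.b + dt * db, s.c + dt * dc, s.d + dt * dd)


def run(cs_sequence, roc: float = 0.1):
    """Drive one channel with a CS sequence; return final state and CST history."""
    s = CSTCState()
    trace = []
    for cs in cs_sequence:
        s = cstc_step(s, cs, roc)
        trace.append(s.cst())
    return s, trace
```

For example, `run([1.0] * 100 + [0.0] * 200)` presents the CS for 100 steps and then removes it. Under these assumptions A(n) follows the CS closely, while B(n), C(n), and D(n) change more slowly, so the CST signal outlasts the CS itself.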