External activity and the freedom to recode

Neurocomputing 69 (2006) 1233–1237 www.elsevier.com/locate/neucom

William B Levy, Xiangbao Wu

University of Virginia Health System, P.O. Box 800420, Neurosurgery, Charlottesville, VA 22908-0420, USA

Available online 7 February 2006

Abstract

Our hippocampal model depends on randomization. In principle, randomizations, e.g., chaotic activity fluctuations, quantal synaptic failures, or initial state randomization, can be overcome by strong external excitation. However, if external activity is too low, randomization will destroy the information transmitted by the inputs. Here, computer simulations of the transitive inference paradigm reveal an optimal range of external excitation. At lower activity levels, optimal performance occurs when the relative external excitation accounts for 35–40% of the total activity, while at higher activity, external activity can be as low as 30% of the total.
© 2006 Elsevier B.V. All rights reserved.

Keywords: Hippocampus; Learning; Transitive inference

1. Introduction

Our computational theory of hippocampal function [4,5,6] centers on random recoding and sequence prediction. The random recoding process arises from random connectivity and certain random or pseudorandom processes. With the help of associative synaptic modification, this model creates context codes [5] that are also suitable for sequence prediction. At the heart of the model, and of the recoding that it produces, is the sparse, excitatory recurrent connectivity of region CA3. Without any input, even the deterministic version of the model is random, in the sense that activity shifts chaotically [11,17]. This chaotic tendency can be reduced; equivalently, the correlation between successive states of CA3 firing can be controlled by the amount of external excitation of CA3. Thus, if the external input is strong enough, it can overcome the chaotic randomization that the sparse recurrence tends to produce. On the other hand, if external activation is too low, there will be nothing to learn. The external excitation in the model corresponds to the layer II cells of the entorhinal cortex that excite CA3 neurons directly or through the dentate gyrus. Presumably, convergence between these two inputs produces a small number of strongly activated CA3 pyramidal cells.


Because total activity is regulated, a tension exists between the number of neurons activated by recurrent connections and the number activated by external connections. This tension between recurrent and external excitation has been quantified in abstract settings [13], but its importance in a quantified behavioral paradigm has yet to be demonstrated. Here we present computational simulations that agree with the earlier work. External activity in the range of 30–40% of total activity is optimal and produces the best all-around cognitive performance for the hippocampus-dependent task of transitive inference (TI).

TI is a simple, linear, logical inference. In this task, training is based on five stimuli that are presented as four distinct pairings with correct answers {A>B, B>C, C>D, D>E}, where the inequality symbol indicates the correct choice for each premise pair used in training. After a certain period of training, the test pairing for transitivity is the question B ? D, with B>D as the correct answer. That is, the simplest relationship is the linear one, A>B>C>D>E, and this is the relationship that most rats and most people infer.
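
To make the premise structure concrete, it can be written out explicitly. The following Python sketch is purely illustrative; the names PREMISE_PAIRS, TEST_PAIR, and correct_choice are assumptions for exposition and are not part of the simulation code described in the next section.

# Illustrative sketch of the transitive inference (TI) task structure.
# Each premise pair (X, Y) means "X is the correct choice over Y".
PREMISE_PAIRS = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
TEST_PAIR = ("B", "D")  # transitivity probe; B is the correct choice

# The simplest relation consistent with all four premises is the linear
# order A > B > C > D > E.
LINEAR_ORDER = ["A", "B", "C", "D", "E"]

def correct_choice(pair, order=LINEAR_ORDER):
    """Return the stimulus that should be chosen for a given pair."""
    x, y = pair
    return x if order.index(x) < order.index(y) else y

for pair in PREMISE_PAIRS + [TEST_PAIR]:
    print(pair, "->", correct_choice(pair))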

2. The hippocampal model

The model used here is an extension of a hippocampal model of region CA3 (see e.g., [4,5]).


The input layer corresponds to a combination of the entorhinal cortex and dentate gyrus. The CA3 model is a sparsely (10%) interconnected feedback network of 1024 or 8192 neurons in which all direct, recurrent connections are excitatory and the neurons are McCulloch–Pitts units with binary {0, 1} output. One interneuron mediates feedforward inhibition and another mediates feedback inhibition. Inhibition is of the divisive form, but activity is only imperfectly controlled because of a feedback delay that expresses the inhibition across the primary neurons [11,17]. Also implemented is a simple extrahippocampal decision function (see [21], decision function I). Presumably, after appropriate decoding by CA1, the subiculum, and the entorhinal cortex, a transformed version of the CA3 output reaches a prefrontal or pre-motor region that is actually capable of making and implementing a decision based on a linear decoder.

Fig. 1 is a schematic diagram of the model of the CA3 region of the hippocampus, along with an extrahippocampal decision function. The entorhinal cortex and dentate gyrus (EC/DG) are combined, providing the sole input into the CA3 region. The CA3 region is modeled as a sparse, randomly connected recurrent neural network, the EC/DG is represented by a binary input vector, and the decision function is carried out explicitly. Primary neurons are all excitatory, projecting sparsely to each other and in an all-to-one manner to excite an inhibitory neuron that feeds back to these excitatory pyramidal cells. Each excitatory pyramidal neuron randomly connects to approximately 10% of the other neurons.

Fig. 1. (a) CA3 model with decision function (see [21]). The external input to CA3 is sparse and excitatory, representing a combination of the inputs from entorhinal cortex (EC) and dentate gyrus (DG). The strongest input to the network is its own recurrent excitation. Both the external and recurrent inputs are accompanied by proportional activation of inhibitory neurons (I). The output of the network is the firing vector of the CA3 neurons themselves. (b) Schematic depiction of the sparse (10%) recurrent excitatory connectivity of the CA3 model.

Given that the output of neuron $i$ is $z_i(t)$, the net excitation of neuron $j$, called $y_j(t)$, is given by

$$y_j(t) = \frac{\sum_{i=1}^{n} w_{ij}\, c_{ij}\, f(z_i(t-1))}{\sum_{i=1}^{n} w_{ij}\, c_{ij}\, f(z_i(t-1)) + K_{FB}\left(\sum_{i=1}^{n} D_i(t-1)\, z_i(t-1)\right) + K_0 + K_{FF}\sum_{i=1}^{n} x_i(t)}, \qquad (1)$$

where $w_{ij}$ represents the weight value of the synapse from neuron $i$ to neuron $j$, and $c_{ij}$ is the binary $\{0,1\}$ variable indicating a connection from neuron $i$ to $j$. The term $\sum_{i=1}^{n} w_{ij} c_{ij} f(z_i(t-1))$ represents the excitatory synaptic conductance of the $j$th neuron. Parameters $K_{FB}$ and $K_{FF}$ are constants used to control feedback and feedforward inhibition, respectively ($\sum_{i=1}^{n} x_i(t)$ is the total input excitation to the network). $K_0$ is a constant that controls the magnitude and stability of activity oscillations and can be considered the rest conductance in a shunting model [17]. A binary external input to neuron $j$ at time $t$ is indicated by $x_j(t)$. A postsynaptic neuron fires according to

$$z_j(t) = 1 \quad \text{if either } x_j(t) = 1 \text{ or } y_j(t) \geq \theta,$$

where the threshold $\theta$ is fixed at 0.5.

Synaptic failures are included in the model, which makes the simulations more robust and better able to approximate the low firing rate of the hippocampus (see [19]). The synaptic failure channel of the connection from neuron $i$ to neuron $j$ is represented by the function $f$, where $f(z_i = 0) = 0$; a synaptic failure, $f(z_i = 1) = 0$, occurs with probability $f$, and $f(z_i = 1) = 1$ with probability $(1 - f)$. The failure process is a Bernoulli random variable that acts independently on each synapse at each timestep.
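
A minimal sketch of one timestep of this update is given below. It assumes dense NumPy arrays for the weights and connectivity; the function name ca3_timestep, the argument fail_p, and the array layout are illustrative assumptions and should not be read as the simulation code used here.

import numpy as np

def ca3_timestep(z_prev, x_t, w, c, D, K_FB, K_FF, K_0, theta=0.5, fail_p=0.0, rng=None):
    """One McCulloch-Pitts update of the CA3 sketch (Eq. (1) plus the firing rule).

    z_prev : (n,) binary firing vector z(t-1)
    x_t    : (n,) binary external input vector x(t)
    w, c   : (n, n) weight and 0/1 connectivity matrices; entry [i, j] is synapse i -> j
    D      : (n,) pyramidal-to-interneuron weights
    """
    if rng is None:
        rng = np.random.default_rng()
    n = len(z_prev)

    # Bernoulli synaptic failures: each active synapse transmits with probability 1 - fail_p.
    transmit = rng.random((n, n)) >= fail_p
    f_z = z_prev[:, None] * transmit                # f(z_i(t-1)) for each synapse i -> j

    excitation = (w * c * f_z).sum(axis=0)          # sum_i w_ij c_ij f(z_i(t-1)), one value per j

    # Divisive (shunting) inhibition: feedback, rest, and feedforward terms of Eq. (1).
    feedback = K_FB * np.dot(D, z_prev)
    feedforward = K_FF * x_t.sum()
    y = excitation / (excitation + feedback + K_0 + feedforward)

    # A neuron fires if it receives external input or if its net excitation reaches threshold.
    z_t = ((x_t == 1) | (y >= theta)).astype(int)
    return z_t

Drawing the failure variable per synapse, rather than per neuron, follows the statement above that failures act independently on each synapse at each timestep.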

The model uses a biologically inspired postsynaptic associative modification rule with time staggering between pre- and postsynaptic activity [4,8,9]. For more biological simulations, synaptic modification spans multiple timesteps, approximating NMDA-dependent LTP and LTD [2,7,14], i.e.,

$$w_{ij}(t+1) = w_{ij}(t) + \mu\, z_j(t)\left(\bar{z}_i(t-1) - w_{ij}(t)\right), \qquad (2)$$

where

$$\bar{z}_i(t) = \begin{cases} \alpha\,\bar{z}_i(t-1) & \text{if } f(z_i(t)) = 0, \\ 1 & \text{if } f(z_i(t)) = 1, \end{cases} \qquad (3)$$

$i$ is the input (presynaptic) neuron and $j$ is the output (postsynaptic) neuron, $\mu$ is the synaptic modification rate, and $\alpha$ represents the exponential decay of the glutamate-primed NMDA receptor in one time constant (e.g., 100 ms; see [12] for details).
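
Read as code, Eqs. (2) and (3) describe a presynaptic trace that decays by $\alpha$ and resets to one whenever the synapse transmits, and a weight that moves toward that trace whenever the postsynaptic neuron fires. The following is a minimal sketch, assuming the trace is kept per presynaptic neuron and assuming NumPy arrays; the function names and shapes are illustrative, not the implementation used for the reported simulations.

import numpy as np

def update_trace(z_bar, f_z_t, alpha):
    """Eq. (3): reset the trace to 1 where a (non-failing) presynaptic spike arrives,
    otherwise decay it by alpha."""
    return np.where(f_z_t == 1, 1.0, alpha * z_bar)

def update_weights(w, z_t, z_bar_prev, mu):
    """Eq. (2): w_ij <- w_ij + mu * z_j(t) * (z_bar_i(t-1) - w_ij).

    w          : (n, n) weights, entry [i, j] for synapse i -> j
    z_t        : (n,) postsynaptic firing vector z(t)
    z_bar_prev : (n,) presynaptic trace z_bar(t-1)
    """
    return w + mu * z_t[None, :] * (z_bar_prev[:, None] - w)

Because each weight moves toward a trace bounded by one, the rule keeps the weights in [0, 1] provided $\mu \leq 1$ and the weights start in that range.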

For better control of activity, a rule for the modification of interneuron afferent weights (pyramidal cell-to-interneuron excitatory inputs) is used in larger networks (e.g., 8192 neurons, [18]):

$$D_i(t+1) = D_i(t) + \lambda\, z_i(t)\left[\frac{m(t)}{n} - a\right], \qquad (4)$$

where $D_i$ is the weight of the excitatory connection from neuron $i$ to the feedback interneuron, $\lambda$ is the pyramidal-to-interneuron synaptic modification rate constant, $m(t)$ is the number of active neurons at time $t$, and $a$ is the desired activity fraction.

3. Training with initial randomization

A generic training trial consists of the sequence stimulus (e.g., AB), decision (e.g., a), and outcome (e.g., +). To train a network on TI, eight training sequences are created: two for each of the four subtasks, corresponding to the correct and incorrect choices. The eight training sequences for TI are:

For subtask 1, (AB)(AB)(AB)(AB)(AB)(AB)(AB)aaaaaaa+++++++ or (AB)(AB)(AB)(AB)(AB)(AB)(AB)bbbbbbb-------;
For subtask 2, (BC)(BC)(BC)(BC)(BC)(BC)(BC)bbbbbbb+++++++ or (BC)(BC)(BC)(BC)(BC)(BC)(BC)ccccccc-------;
For subtask 3, (CD)(CD)(CD)(CD)(CD)(CD)(CD)ccccccc+++++++ or (CD)(CD)(CD)(CD)(CD)(CD)(CD)ddddddd-------;
For subtask 4, (DE)(DE)(DE)(DE)(DE)(DE)(DE)ddddddd+++++++ or (DE)(DE)(DE)(DE)(DE)(DE)(DE)eeeeeee-------.

That is, each input pattern is repeated for seven timesteps, so that each trial is 21 timesteps long, plus the starting state $Z(0)$, which is fully randomized (see e.g., [16,20]). The actual blocking of the training paradigm uses Alvarado and Rudy's [1] progressive paradigm rather than the Dusek and Eichenbaum [3] staged paradigm that characterizes our earlier work (see e.g., [17,15] for a comparison of staged and progressive learning).

4. External activity

The variable investigated here is the relative strength of the activated inputs, $r_e$. Defining $m_e = \sum_i x_i(t)$ and the average total activity as $E[\sum_i z_i(t)]$, the variable of interest is $r_e = m_e / E[\sum_i z_i(t)]$, which can vary from zero to one. For example, if $E[\sum_i z_i(t)] = 100$ neurons per timestep and $m_e = 30$, then $r_e = 30\%$. That is, when an input pattern is +, -, a, b, c, d, or e, 30 distinct neurons are externally activated. In the case of the premise pairs, e.g., AB, BC, and CD, 30 external neurons represent each pair; that is, 15 external neurons are devoted to each individual stimulus. Note that $r_e$ is an average value and that the actual fractional external activity fluctuates across time.
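
The trial structure and the measurement of $r_e$ can be illustrated together. In the sketch below, the helpers build_trial and fractional_external_activity, and the use of simple time averages, are illustrative assumptions rather than the exact bookkeeping of the simulations.

import numpy as np

STEPS_PER_PATTERN = 7   # each input pattern is held for seven timesteps

def build_trial(stimulus, decision, outcome):
    """One training trial, e.g. ('AB', 'a', '+'): stimulus, then decision, then outcome,
    each repeated for seven timesteps, giving 21 timesteps in total (the random
    starting state Z(0) is handled separately)."""
    return ([stimulus] * STEPS_PER_PATTERN
            + [decision] * STEPS_PER_PATTERN
            + [outcome] * STEPS_PER_PATTERN)

def fractional_external_activity(x_history, z_history):
    """r_e = (average external activity per timestep) / (average total activity per timestep),
    given lists of binary external-input vectors x(t) and firing vectors z(t)."""
    m_e = np.mean([x.sum() for x in x_history])
    total_activity = np.mean([z.sum() for z in z_history])
    return m_e / total_activity

# Example: the rewarded sequence for subtask 1, (AB)(AB)...aaaaaaa+++++++
trial = build_trial("AB", "a", "+")
assert len(trial) == 21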

5. Results

The results of varying external activation are pictured in Fig. 2. As intuition suggested, external activation cannot be too weak or too strong. It must account for less than 50% of the firings to produce good performance on all parts of the TI task, while less than 30% external activation does not produce the desired performance. However, the range of external activation that optimizes performance can be widened with larger network sizes, which allow lower activity levels (see [19] for a similar observation). As Fig. 2 shows, for simulations running at 7% activity with 8192 neurons, the optimal ratio of external activity to average total activity per timestep is extended; specifically, a range of $r_e$ from 30% to 50% produces good performance.

Fig. 2. Optimizing performance as a function of external activity. The optimal ratio of external activity to average total activity per timestep is approximately one-third for two distinct sets of simulations ($n = 1024$ or $8192$). The task is transitive inference (TI). The fraction of simulations that learn is plotted as a function of the fractional external input $r_e$. By adjusting $K_{FB}$ and $K_0$ as $r_e$ is changed, the activity level is kept approximately constant. There were two cases: $a = 10\%$ for $n = 1024$, and $a = 7\%$ for networks with $n = 8192$ neurons. Other network parameters were fixed: for $n = 1024$, $f = 0\%$, $a = 10\%$, $\theta = 0.5$, $\mu = 0.05$, $c = 10\%$, $K_{FF} = 0.018$, $w_0 = 0.45$, $\alpha = 0.7165$, and $\lambda = 0.5$; for $n = 8192$, $f = 50\%$, $a = 7\%$, $\theta = 0.5$, $\mu = 0.05$, $c = 10\%$, $K_{FF} = 0.006$, $w_0 = 0.45$, $\alpha = 0.8669$, and $\lambda = 0.5$. For each simulation, good performance is defined as 80% or more correct responses at the end of training for all comparisons AB, BC, CD, DE, BD, and AE. Twenty networks were trained and tested for each data point.



Fig. 3. Details of performance for selected $r_e$ and $n = 8192$. The data in this figure were generated as in Fig. 2. When $r_e$ is 15%, premise pair learning is unacceptably low for DE, BD, and AE. At large values of $r_e$ (60% and 70%), BD performance is unacceptable and grows worse as $r_e$ increases. When $r_e$ is 35%, premise pair learning reaches criterion and so does the BD transitivity test. Each histogram bar is an average over 20 different simulations.

Fig. 3 shows more details of performance for selected values of $r_e$. Note that at either low or high $r_e$, the simulations fail either to learn the full set of premise pairs or to solve the transitivity test (B ? D). However, with $r_e$ values around 35%, the TI problem is satisfactorily learned.

6. Conclusion

These results are consistent with the idea that an externally biased, randomly driven recoding is an appropriate concept for understanding hippocampal recoding. That is, there is a limited range of external excitation for good performance and, therefore, for good recoding. Because lower values of $r_e$ allow a higher sequence-length memory capacity, we are tempted to posit that nature uses $r_e$ near the 35% value. However, it is the simplicity of the model that allows such a precise prediction. In actuality, the relative strength of the EC and DG inputs, combined with anatomical observations [10], implies that $r_e$ decreases across CA3 as one proceeds from CA3c to CA3a and then to CA2. Thus, an $r_e$ of 35% here must be interpreted as some kind of average over the CA3 anatomy.

Acknowledgment

This work was supported by NIH MH63855 to WBL, and by NSF NGSEIA-9974968, NPACI ASC-96-10920, and NGS ACI-0203960 to Dr. Marty Humphrey.

References

[1] M.C. Alvarado, J.W. Rudy, Some properties of configural learning: an investigation of the transverse-patterning problem, J. Exp. Psychol. 18 (1992) 145–153.
[2] D.A. August, W. B Levy, Temporal sequence compression by an integrate-and-fire model of hippocampal area CA3, J. Comp. Neurosci. 6 (1999) 71–90.
[3] J.A. Dusek, H. Eichenbaum, The hippocampus and memory for orderly stimulus relations, Proc. Natl. Acad. Sci. USA 94 (1997) 7109–7114.
[4] W. B Levy, A computational approach to hippocampal function, in: R.D. Hawkins, G.H. Bower (Eds.), Computational Models of Learning in Simple Neural Systems, Academic Press, New York, 1989, pp. 243–305.
[5] W. B Levy, A sequence predicting CA3 is a flexible associator that learns and uses context to solve hippocampal-like tasks, Hippocampus 6 (1996) 579–590.
[6] W. B Levy, A.B. Hocking, X.B. Wu, Interpreting hippocampal function as recoding and forecasting, Neural Networks 18 (2005) 1242–1264.
[7] W. B Levy, P.B. Sederberg, A neural network model of hippocampally mediated trace conditioning, IEEE Int. Conf. Neural Networks (1997) I-372–I-376.
[8] W. B Levy, O. Steward, Synapses as associative memory elements in the hippocampal formation, Brain Res. 175 (1979) 233–245.
[9] W. B Levy, O. Steward, Temporal contiguity requirements for long-term associative potentiation/depression in the hippocampus, Neuroscience 8 (1983) 791–797.
[10] R. Lorente de Nó, Studies of the structure of the cerebral cortex. II. Continuation of the study of the ammonic system, J. Psychol. Neurol. 46 (1934) 113–177.
[11] A.A. Minai, W. B Levy, Setting the activity level in sparse random networks, Neural Comput. 6 (1994) 85–99.
[12] K.E. Mitman, P.A. Laurent, W. B Levy, Defining in a minimal hippocampal CA3 model by matching time-span of associative synaptic modification and input pattern duration, Proceedings of the International Joint Conference on Neural Networks (IJCNN), 2003, pp. 1631–1636.
[13] S. Polyn, X.B. Wu, W. B Levy, Entorhinal/dentate excitation of CA3: a critical variable in hippocampal models, Neurocomputing 32–33 (2000) 493–499.
[14] P. Rodriguez, W. B Levy, A model of hippocampal activity in trace conditioning: where's the trace?, Behav. Neurosci. 115 (2001) 1224–1238.
[15] A.P. Shon, X.B. Wu, W. B Levy, Using computational simulations to discover optimal training paradigms, Neurocomputing 32–33 (2000) 995–1002.
[16] A.P. Shon, X.B. Wu, D. Sullivan, W. B Levy, Initial state randomness improves sequence learning in a model hippocampal network, Phys. Rev. E 65 (2002) 031914-1–031914-15.
[17] A.C. Smith, X.B. Wu, W. B Levy, Controlling activity fluctuations in large, sparsely connected random networks, Network 11 (2000) 63–81.
[18] D.W. Sullivan, W. B Levy, Synaptic modification of interneuron afferents in a hippocampal CA3 model prevents activity oscillations, Proceedings of the International Joint Conference on Neural Networks (IJCNN), 2003, pp. 1625–1630.
[19] D.W. Sullivan, W. B Levy, Quantal synaptic failures enhance performance in a minimal hippocampal model, Network 15 (2004) 45–67.
[20] X.B. Wu, W. B Levy, Enhancing the performance of a hippocampal model by increasing variability early in learning, Neurocomputing 26–27 (1999) 601–607.
[21] X.B. Wu, W. B Levy, Decision functions that can support a hippocampal model, Neurocomputing, this issue, doi:10.1016/j.neucom.2005.12.084.

William B Levy earned a B.A. in Psychology from Princeton and a Ph.D. in Psychobiology from the University of California, Irvine. He was a Psychology professor at the University of California, Riverside, from 1974 until 1979, at which point he joined the faculty at the University of Virginia, where he is currently a professor in the Neurological Surgery department and in the Psychology department.


Xiangbao Wu, Ph.D., is a scientist in the field of Computational Neuroscience. He is currently living in Charlottesville and working in the Neurological Surgery department at the University of Virginia.