An analysis of nonlinear dynamics underlying neural activity related to auditory induction in the rat auditory cortex


Neuroscience 318 (2016) 58–83

M. NOTO a, J. NISHIKAWA a AND T. TATENO a,b,*

a Bioengineering and Bioinformatics, Graduate School of Information Science and Technology, Hokkaido University, Kita 14, Nishi 9, Kita-ku, Sapporo 060-0814, Japan
b Special Research Promotion Group, Graduate School of Frontier Biosciences, Osaka University, 1-3 Yamadaoka, Suita, Osaka, Japan

Abstract—A sound interrupted by silence is perceived as discontinuous. However, when high-intensity noise is inserted during the silence, the missing sound may be perceptually restored and heard as uninterrupted. This illusory phenomenon is called auditory induction. Recent electrophysiological studies have revealed that auditory induction is associated with the primary auditory cortex (A1). Although experimental evidence has been accumulating, the neural mechanisms underlying auditory induction in A1 neurons are poorly understood. To elucidate them, we used both experimental and computational approaches. First, using an optical imaging method, we characterized population responses across auditory cortical fields to sound and identified five subfields in rats. Next, we examined neural population activity related to auditory induction with high temporal and spatial resolution in the rat auditory cortex (AC), including the A1 and several other AC subfields. Our imaging results showed that tone-burst stimuli interrupted by a silent gap elicited early phasic responses to the first tone and similar or smaller responses to the second tone following the gap. In contrast, tone stimuli interrupted by broadband noise (BN), considered to cause auditory induction, considerably suppressed or eliminated responses to the tone following the noise. Additionally, tone-burst stimuli that were interrupted by notched noise centered at the tone frequency, which is considered to decrease the strength of auditory induction, partially restored the second responses from the suppression caused by BN. To phenomenologically mimic the neural population activity in the A1 and thus investigate the mechanisms underlying auditory induction, we constructed a computational model from the periphery through the AC, including a nonlinear dynamical system. The computational model successfully reproduced some of the above-mentioned experimental results. Therefore, our results suggest that a nonlinear, self-exciting system is a key element for qualitatively reproducing A1 population activity and for understanding the underlying mechanisms. © 2016 IBRO. Published by Elsevier Ltd. All rights reserved.

Key words: auditory subfields, computational model, nonlinear dynamics, optical imaging, synaptic depression.

*Correspondence to: T. Tateno, Bioengineering and Bioinformatics, Graduate School of Information Science and Technology, Hokkaido University, Kita 14, Nishi 9, Kita-ku, Sapporo 060-0814, Japan. Tel: +81-11-706-6763. E-mail address: [email protected] (T. Tateno).
Abbreviations: AAF, anterior auditory field; AC, auditory cortex; A1, primary auditory cortex; AVAF, anterior ventral auditory field; BF, best frequency; BN, broadband noise; CF, characteristic frequency; ERB, equivalent rectangular bandwidth; FHN, FitzHugh–Nagumo; GF, gammatone filter; HEPES, 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid; LA, locally evoked activity; MGB, medial geniculate body; NA, no activation; NN, notched noise; PA, propagating activity; PAF, posterior auditory field; PBP, pure-tone burst, broadband noise, and pure-tone burst; PGP, pure-tone burst, gap, and pure-tone burst; PNP, pure-tone burst, notched noise, and pure-tone burst; PT, pure tone; VAF, ventral auditory field; vMGB, ventral nucleus of the medial geniculate body.
http://dx.doi.org/10.1016/j.neuroscience.2015.12.060
0306-4522/© 2016 IBRO. Published by Elsevier Ltd. All rights reserved.

INTRODUCTION

In sound perception, sensitivity to expected signals may be enhanced under certain noisy circumstances, and robustness against background noise can be ensured. For example, a sound is perceived as interrupted or discontinuous when part of the sound is replaced by a gap of silence. However, when the gap is replaced by a noise burst with a louder sound level, the interrupted sound is heard as continuous throughout the noise. This illusory phenomenon is known as auditory (or temporal) induction (Warren et al., 1972; Bashford and Warren, 1987), and it illustrates one example of the constructive nature of sound perception. Furthermore, recent electrophysiological studies have revealed that the primary auditory cortex (A1) is related to auditory induction, and the neural correlates of auditory induction have been studied for several classes of sounds in many species, such as birds (Braaten and Leary, 1999), gerbils (Kobayasi et al., 2012), guinea pigs (Kubota et al., 2012), cats (Sugita, 1997), monkeys (Miller et al., 2001; Petkov et al., 2003, 2007), and humans (Repp and Lin, 1991; Warren and Bashford, 1999; Micheyl et al., 2003; Riecke et al., 2007). To the best of our knowledge, no such studies have been conducted in rats, one of the most commonly used organisms.

Although experimental evidence has been accumulating, the neural mechanisms underlying auditory induction in A1 neurons are poorly understood. In the central auditory pathway, the responses of neurons to complex natural sounds are not necessarily predictable from responses to simple sound stimuli such as sound impulses (or clicks), pure tones (PTs) over the audible frequency range, or broadband noise (BN), all of which are often used to identify linear dynamical systems (Calabrese et al., 2011; for a review, Theunissen and Elie, 2014). Because the auditory system is intrinsically a complex and highly nonlinear dynamical system, it is natural to suppose that nonlinear dynamical properties may determine perception and functions in the auditory system. Furthermore, nonlinear phenomena can also be exhibited by neurons in the auditory cortex (AC) in response to both simple and complex sound stimulation. For instance, a linear dynamical system based on the spectrotemporal receptive fields of AC neurons predicts their responses to complex and/or natural sound stimuli poorly (Calabrese et al., 2011). In addition, simple nonlinear mechanisms such as adaptation, compression, and rectification, which are often also found in the peripheral auditory pathways, are insufficient to explain the nonlinearities found in AC neurons (Christianson et al., 2008; de la Rocha et al., 2008; Sharpee et al., 2008). Therefore, to elucidate the responses to such sound stimuli, a nonlinear dynamical system is a key element for understanding signal processing and modeling responses in AC neurons (Wrigley and Brown, 2004; Hoshino, 2007; Curto et al., 2009; Sharpee, 2013).

In this study, therefore, to understand the neural mechanisms underlying auditory induction in AC neurons, we used both experimental and computational approaches: (i) optical imaging of rat A1 and (ii) computational modeling of the neural pathway from the periphery to the A1. First, to characterize population responses across rat auditory cortical subfields to sound, we identified five subfields using optical imaging methods. For understanding the dynamics of A1, this subfield identification was critical for distinguishing A1 from the other subfields.
Next, we examined neural population activity related to auditory induction with high temporal and spatial resolution in the rat AC, including the A1 and several other AC subfields. In addition, we developed a computational A1 model that included a nonlinear dynamical system; this model, which phenomenologically mimicked the neural population activity in the A1 from the periphery through the subcortical regions, was used to investigate the mechanism underlying auditory induction. Regarding this mechanism, we focused particularly on the nonlinear dynamics of A1 population activity that were influenced by sound history-dependent thalamocortical and intracortical synaptic input and the resultant recurrent A1 activity. Our results suggested that a nonlinear, self-exciting system is important for qualitatively reproducing the A1 population activity related to auditory induction and understanding the underlying mechanism.

EXPERIMENTAL PROCEDURES

All experiments were carried out in accordance with the NIH Guidelines for the Care and Use of Laboratory Animals, and with approval of the Institutional Animal Care and Use Committee of Hokkaido University.


Surgical procedures

Eight male rats (Wistar/ST, 6–12 weeks old, 163–331 g, Japan SLC, Japan) with normal Preyer’s reflex were used for the experiments. The rats were anesthetized with a mixture of midazolam (10 mg/kg, i.p.; Astellas Pharma, Japan) and xylazine (12 mg/kg, i.p.; Bayer AG, Germany) in saline as the initial dose (Inaoka et al., 2011; Tateno et al., 2013). The adequacy of anesthesia was confirmed by the absence of toe-pinch reflexes. To maintain anesthesia, supplemental doses of half the initial dose (midazolam, 5 mg/kg, i.m. and xylazine, 6 mg/kg, i.m.) were administered every hour. Dexamethasone (0.5 mg/kg, i.m.; Kobayashi Kako, Japan) was also administered to suppress cerebral edema. During the experiment, rectal temperature was maintained at 34 ± 1 °C using a heat pad. This temperature is slightly lower than normal, but it was the best condition for stable recording in our setup, as described in a previous study (Song et al., 2006). When the animals’ body temperature was increased during the experiment, we observed that the areas from which the evoked activity was initiated were not profoundly shifted, but tended to be slightly enlarged. A custom-made metal adapter was attached to the skull with dental cement and was used to hold the animal’s head during recording. A local anesthetic (xylocaine gel; AstraZeneca K.K., Japan) was applied to all incision sites. To prevent visual interference from the excitation light used for voltage-sensitive dyes, both eyes were kept closed. After resection of the temporal muscle, a hole (approximately 7 mm in the rostrocaudal direction and approximately 5 mm in the dorsoventral direction) was drilled in the temporal bone, and the left AC was exposed (Fig. 1Aa). The dura mater was removed and the cortex was stained twice, for 30 min each (1 h in total), with the voltage-sensitive dye RH-1691 (1 mg/ml; Optical Imaging, Israel) in an artificial cerebrospinal fluid (ACSF) solution (in mM: 135 NaCl, 5 KCl, 5 HEPES, 1.8 CaCl2, and 1 MgCl2).
After staining, the cortex was covered with 2% agarose in saline and sealed with a glass coverslip to reduce pulsation.

Optical recording

The principles of optical recording with voltage-sensitive dyes have been described in the literature (Song et al., 2006; Kubota et al., 2008). We used an imaging system with a high-speed and high-resolution CMOS camera system (MiCAM02, Brainvision, Japan) to detect optical signals. The CMOS camera was mounted on a tandem-lens fluorescence microscope (THT, Brainvision). Light from a 150-W halogen lamp (HL-151, Brainvision) was projected through an excitation filter (wavelength λ = 632 ± 11 nm) and reflected by a dichroic mirror (λ = 550–640 nm) to activate the voltage-sensitive dye at the cortical surface. Fluorescence signals were then collected through the dichroic mirror, projected through an absorption filter (λ > 665 nm), and detected with the CMOS camera. The microscope was focused at a depth of 300 µm below the cortical surface to minimize the interference from blood vessels and to concentrate on the activities in cortical layer II/III (Fig. 1Ab). Although this system can achieve a maximum sampling rate of 833 Hz (one frame every 1.2 ms), we used frame intervals of 2.0 ms or 4.0 ms and reduced the lamp power to decrease bleaching of the voltage-sensitive dye and to suppress photo-damage of cortical neurons. The signals from a 5.8 mm × 4.8 mm area of the cortical surface were acquired by the camera with 188 × 160 pixels in each frame (Fig. 1Ab). Optical signals recorded in response to particular sound stimuli were averaged over 10 trials to reduce random noise per trial.

Fig. 1. (A) In (a), a surface view of a blood vessel pattern for the exposed portion of the AC. Scale bar = 1 mm. For graphs in all figures, anterior (rostral) is toward the left and dorsal is upward as illustrated by arrows. In (b), the surface pattern captured by the CMOS camera during optical imaging, covering a region of 10.5 mm². In (c1–5), five time courses of the response to an 8-kHz tone-burst sound at five sites of the field shown in (b). The five short horizontal bars also show the stimulus timing. (B) Composite response maps of a rat AC to 1-, 2-, 4-, 8-, and 16-kHz stimulation in (a), (b), (c), (d), and (e), respectively; the superimposition of all frequency results is shown in (f). For the initial response (at 50–75 ms) after the onset of the stimulation, a threshold of 60% of the maximum response was used and superimposed on the cortical surface image (the image of the basal fluorescence level). In (f), left and right dashed ellipses indicate the AAF (red) and A1 (blue), respectively. Note that the reversal of tonotopy in the A1 and AAF and the corresponding frequency gradient are indicated by two arrows from lower frequencies to higher frequencies in the tonotopy. (C) Color-coded maps of responses to the 1–16-kHz tone-burst sound stimuli. The responses were also superimposed on the cortical image. Times after stimulus onset are shown as numbers in ms at the top. The response amplitude ratios to the maximum response are color coded as indicated by the bar on the right of the figures. Two dashed ellipses on the leftmost panel represent the areas of the AAF (red) and A1 (blue). Five dashed ellipses on the rightmost panel represent all areas, including, in addition to the AAF and A1, the PAF (green), VAF (yellow), and AVAF (orange). The recording area is approximately 4.0 × 4.5 mm². (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Sound stimulation

Sound signals were generated digitally at a rate of 50 or 100 kHz, and processed by digital-to-analog conversion, attenuation, and amplification using the TDT System 3 hardware (RP2.1, PA5, and SA1; Tucker-Davis Technologies, USA) and software (RPvdsEX, Tucker-Davis Technologies). The auditory parameters of each sound signal in RPvdsEX were modified through ActiveX controls driven by a custom-made Matlab program (MathWorks, MA, USA). Stimulation was delivered in the open field through a speaker (MF1, Tucker-Davis Technologies) located 20 cm in front of the animal in a double-walled sound-proof room (Comany, Japan). The sound delivery system was calibrated using a sound-level meter (Type 2636, Brüel & Kjær, Denmark) with a 1/4-inch microphone (Type 4939-L-002, Brüel & Kjær) placed close to the animal’s left ear. For sound stimulation, we used a PT (50-ms total duration with 5-ms rise and fall times; 1–32 kHz), BN (0.10–14 kHz), and notched (band-stopped) noise (NN; 0.10–7.0 and 9.0–14 kHz). To compare our results with those from several previous studies (Petkov et al., 2003; Kobayasi et al., 2012; Kubota et al., 2012), the sound stimuli we used were the same as those employed by Kubota et al. (2012). Briefly, the sound levels of the PT, BN, and NN were 70-dB SPL, 71-dB SPL, and 67-dB SPL, respectively. A silent gap, BN, or NN was inserted at 50 ms after the onset of the 8-kHz PT. In the experiments described here, relatively high sound intensities were used, because midazolam was included in the anesthetic agent (Verbny et al., 2005). The durations of the silent gap, BN, and NN, which are respectively denoted by dS, dB, and dN, ranged from 100 to 300 ms, and the total length of the stimulus was between 200 and 400 ms according to the gap duration (Fig. 1B).
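The stimulus layout described above (tone, then gap, BN or NN, then tone) can be illustrated with a minimal Python sketch. This is not the authors' Matlab/RPvdsEX code. The function names, the linear ramp shape, the FFT-masking approach to band-limited noise, and the peak normalisation are all assumptions made for illustration, and no SPL calibration is attempted.

```python
import numpy as np

FS = 100_000  # sampling rate in Hz (the text reports 50 or 100 kHz)

def tone_burst(freq_hz=8000.0, dur_s=0.050, ramp_s=0.005, fs=FS):
    """Pure-tone burst with linear rise/fall ramps (ramp shape is an assumption)."""
    n = int(round(dur_s * fs))
    t = np.arange(n) / fs
    env = np.ones(n)
    n_ramp = int(round(ramp_s * fs))
    ramp = np.linspace(0.0, 1.0, n_ramp, endpoint=False)
    env[:n_ramp] = ramp
    env[-n_ramp:] = ramp[::-1]
    return env * np.sin(2 * np.pi * freq_hz * t)

def band_noise(dur_s, bands_hz, fs=FS, rng=None):
    """Gaussian noise restricted to the given pass bands by masking its spectrum."""
    rng = np.random.default_rng() if rng is None else rng
    n = int(round(dur_s * fs))
    spec = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    mask = np.zeros(freqs.shape, dtype=bool)
    for lo, hi in bands_hz:
        mask |= (freqs >= lo) & (freqs <= hi)
    noise = np.fft.irfft(spec * mask, n=n)
    return noise / np.max(np.abs(noise))  # peak-normalised, not SPL-calibrated

def make_stimulus(kind, mid_dur_s=0.100, fs=FS, rng=None):
    """Build a PGP, PBP or PNP stimulus: tone, then gap / BN / NN, then tone."""
    if kind == "PGP":
        mid = np.zeros(int(round(mid_dur_s * fs)))
    elif kind == "PBP":
        mid = band_noise(mid_dur_s, [(100.0, 14000.0)], fs, rng)
    elif kind == "PNP":
        mid = band_noise(mid_dur_s, [(100.0, 7000.0), (9000.0, 14000.0)], fs, rng)
    else:
        raise ValueError(f"unknown stimulus kind: {kind}")
    tone = tone_burst(fs=fs)
    return np.concatenate([tone, mid, tone])
```

With a 100-ms middle segment, `make_stimulus("PGP")` spans 200 ms in total, matching the shortest stimulus length quoted in the text.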

Imaging data analysis

Data acquired with the CMOS camera system were processed using data analysis software (BV_Ana, Brainvision, Japan) and custom-made programs written in Matlab (MathWorks, USA). For each pixel, the fractional fluorescence ΔF/F0 was calculated for each frame, where F0 was the basal fluorescence level in the prestimulus period, ΔF was F − F0, and F was the current fluorescence level. This fractional fluorescence enabled us to compensate for irregularities in staining. Membrane depolarization, indicated by a decrease in fluorescence in the case of RH-1691, was represented as a positive value of the fractional fluorescence ΔF/F0. Typical average waveforms in response to a PT burst are shown in Fig. 1Ac1–5. The response latencies and durations in response to the sound stimulation were analyzed among the multiple auditory fields. The response latency was defined as the time between stimulus onset and the time at which the optical signal first reached 60% of its peak value. The response duration was the amount of time the signal remained above this 60% criterion. Response properties for the silent gap, BN, or NN between two PT bursts were characterized by the response amplitude ratio R2/R1, where R1 and R2 are the maximum response amplitudes in response to the first and second PT bursts, respectively.
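The measures defined above can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' BV_Ana/Matlab pipeline: the function names, the array layout (time along the first axis), and the choice to count every sample above the 60% criterion as response duration are assumptions.

```python
import numpy as np

def delta_f_over_f0(frames, n_baseline):
    """Fractional fluorescence per pixel; the sign is flipped so that the
    fluorescence decrease of RH-1691 (depolarisation) becomes positive."""
    f0 = frames[:n_baseline].mean(axis=0)  # basal level from prestimulus frames
    return -(frames - f0) / f0

def latency_and_duration(trace, dt_ms, onset_idx, frac=0.6):
    """Latency to the first crossing of frac * peak after stimulus onset,
    and total time spent at or above that criterion (both in ms)."""
    post = np.asarray(trace)[onset_idx:]
    above = post >= frac * post.max()
    latency = int(np.argmax(above)) * dt_ms
    duration = int(above.sum()) * dt_ms
    return latency, duration

def r2_over_r1(trace, win1, win2):
    """Peak response in the second-tone window divided by that in the first."""
    trace = np.asarray(trace)
    r1 = trace[win1[0]:win1[1]].max()
    r2 = trace[win2[0]:win2[1]].max()
    return r2 / r1
```

A ratio R2/R1 near 1 then corresponds to a second response as large as the first, and a ratio near 0 to a suppressed second response.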


A computational model to mimic population activity in A1

To phenomenologically mimic auditory (or temporal) induction observed in the rat A1, we used the serial combination of three simple models: (i) a linear dynamical model for the periphery, (ii) a compressive static model to simulate subcortical functions, and (iii) a nonlinear dynamical model for the A1 (Fig. 9A). The first and second subsystems are based on the spatial pitch network model (Cohen et al., 1995; Grossberg et al., 2004). Some parts of this model were selected and used for frequency decomposition in preprocessing units of 32 channels with different central frequencies corresponding to the neural best frequencies (BFs), as functions of the periphery, thalamus, and other subcortical regions. The last subsystem in our model is a nonlinear dynamical system whose variables represent the neural activity of excitatory and inhibitory populations in each tonotopic column of the A1. To simulate differences in evoked responses to pure-tone burst, gap, and pure-tone burst (PGP), pure-tone burst, notched noise, and pure-tone burst (PNP), and pure-tone burst, broadband noise, and pure-tone burst (PBP) stimuli for each AC column, we needed to model the precise contribution of each filter bank channel with a realistic frequency preference in the auditory periphery; connecting the peripheral model to the AC population model was therefore essential.

The peripheral subsystem is a simple feedforward structure with no descending or recurrent connections, characterized by 32 channels of gammatone filters (GFs) with different central frequencies (Patterson et al., 1992; see Appendix A). The individual channels are labeled from 1 to 32 along the tonotopic axis in increasing order of frequency. The action of the GFs is to decompose multi-frequency signals into a spatial output organized according to frequency; i.e., tonotopic mapping.
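As a rough illustration of such a gammatone decomposition, the Python sketch below builds a small filter bank and computes a per-channel energy measure. The exact filter definition used in the paper is given in its Appendix A, which is not reproduced here, so several choices are assumptions: a fourth-order gammatone, a bandwidth of 1.019 times the ERB, the ERB polynomial of Moore and Glasberg (1983), peak-normalised impulse responses, and the function names themselves.

```python
import numpy as np

def erb_hz(fc_hz):
    """ERB in Hz from the Moore and Glasberg (1983) polynomial (f in kHz)."""
    f = fc_hz / 1000.0
    return 6.23 * f**2 + 93.39 * f + 28.52

def gammatone_ir(fc_hz, fs=100_000, dur_s=0.05, order=4, b_scale=1.019):
    """Gammatone impulse response t^(n-1) exp(-2 pi b t) cos(2 pi fc t)."""
    t = np.arange(int(dur_s * fs)) / fs
    b = b_scale * erb_hz(fc_hz)  # bandwidth parameter (assumed scaling)
    g = t**(order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc_hz * t)
    return g / np.max(np.abs(g))

def filterbank(signal, centre_freqs_hz, fs=100_000):
    """Filter the signal with each gammatone channel (FFT-based convolution)."""
    out = []
    for fc in centre_freqs_hz:
        g = gammatone_ir(fc, fs)
        n = len(signal) + len(g) - 1
        y = np.fft.irfft(np.fft.rfft(signal, n) * np.fft.rfft(g, n), n)
        out.append(y[:len(signal)])
    return np.stack(out)

def channel_energy(outputs):
    """Mean-square output per channel, a simple stand-in for an energy measure."""
    return np.mean(outputs**2, axis=1)
```

For an 8-kHz tone, the channel centred at 8 kHz carries by far the largest energy, which is the tonotopic mapping described in the text.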
The bandwidth of each GF was assigned on the basis of an equivalent rectangular bandwidth (ERB) scale (Moore and Glasberg, 1983; see also Appendix A). Next, the parallel output of all the channels feeds into a subcortical subsystem model that includes the ventral nucleus of the medial geniculate body (vMGB) (Fig. 9A). The subcortical subsystem is assumed to calculate energy measure values $e_{g(l)}$ at a BF $g(l)$, where $l$ is a unit number ($l = 1, \ldots, 32$), and to nonlinearly compress the values. Thus, for each $e_{g(l)}$ in the model, the compressed energy measure components $s(e_{g(l)})$ of units with BF $g(l)$ ($l = 1, \ldots, 32$) are calculated in all units.

Next, by using a simple network model of nonlinear dynamical systems, we simulated the population activity of cortical neurons in the thalamo-recipient layers (layers 2/3 and 4) of the A1. We made the following four model assumptions regarding the feedforward thalamo-cortical projections and intracortical connections in each layer:

(i) Neurons in the layers receive synaptic inputs from three major sources: thalamocortical, intracortical excitatory, and intracortical inhibitory inputs (Schreiner et al., 2000; Kaur et al., 2004; Winer et al., 2005; Liu et al., 2007).

(ii) In response to PT bursts, excitatory synaptic inputs into the excitatory and inhibitory populations through the thalamocortical connections are similarly shaped in spectral integration (Wehr and Zador, 2003; Zhang et al., 2003; Froemke et al., 2007; Tan and Wehr, 2009); however, note that they are not necessarily perfectly identical (Wu et al., 2008; Zhang et al., 2011).

(iii) Short-range recurrent inputs are major contributors in intracortical connections (Adesnik and Scanziani, 2010; Oviedo et al., 2010), although long-range intercolumnar inputs are rather limited (for long-range horizontal inputs in layer 2/3, however, see Happel et al., 2010).

(iv) Finally, synaptic depression of thalamocortical and intracolumnar connections has strong effects on excitatory and inhibitory population dynamics in the corresponding column, whereas the effects on intercolumnar connections are relatively weaker and restricted (Levy and Reyes, 2011).

In the present model, for the array of 32 columns in the A1, the individual columns are labeled from 1 to 32 along the A1 tonotopic map in order of increasing corresponding BFs. In each column, for simplicity, excitatory and inhibitory neurons are represented as two neural populations whose time evolution is described by the nonlinear dynamical system stated below. A weighted average was derived for the spectral information of each unit, and the averaged signals were then transmitted to the tonotopic array via the thalamocortical projections. The weighted average was carried out through excitatory synaptic weights $H_{fg}$ and $K_{fg}$, which respectively represent weights projecting to excitatory and inhibitory populations in the A1 (Loebel et al., 2007; Levy and Reyes, 2011). The weights $H_{fg}$ and $K_{fg}$ are symmetrically distributed around a BF $f$, and the distributions, which are characterized by several parameters, are described as

$$
H_{fg} =
\begin{cases}
\dfrac{H_0}{\sqrt{2\pi\sigma_E^2}} \exp\left[-\dfrac{(f-g)^2}{2\sigma_E^2}\right] - H_1, & \text{for } |f-g| < \left[2\sigma_E^2 \log\left(\dfrac{H_0}{\sqrt{2\pi\sigma_E^2}\,H_1}\right)\right]^{1/2},\\[2ex]
0, & \text{otherwise,}
\end{cases}
\tag{1a}
$$

$$
K_{fg} =
\begin{cases}
\dfrac{K_0}{\sqrt{2\pi\sigma_I^2}} \exp\left[-\dfrac{(f-g)^2}{2\sigma_I^2}\right] - K_1, & \text{for } |f-g| < \left[2\sigma_I^2 \log\left(\dfrac{K_0}{\sqrt{2\pi\sigma_I^2}\,K_1}\right)\right]^{1/2},\\[2ex]
0, & \text{otherwise,}
\end{cases}
\tag{1b}
$$

where $g$ is a BF in a unit in the thalamus, and $H_0$, $H_1$, $\sigma_E$, $K_0$, $K_1$, and $\sigma_I$ are constant parameters. In Eqs. (1a) and (1b), the weights $H_{fg}$ and $K_{fg}$ applying to excitatory and inhibitory populations in the A1 are assumed to be normally distributed with standard deviations $\sigma_E$ and $\sigma_I$, respectively, although the functions are slightly shifted downward and negative values are set to zero to reduce the calculations needed. The parameters $\sigma_E$ and $\sigma_I$ represent tuning widths, and they are co-tuned in the present model (i.e., $\sigma_E \approx \sigma_I$) because of the second assumption (ii). Also, they are respectively assumed to be proportional to the tuning curve bandwidths of excitatory and inhibitory neurons in one A1 column. Furthermore, for tuning curves in neurons of the rat A1, it has been reported that bandwidths at 10 dB above the threshold are around 1.0 octave, and Q-values (CFs over bandwidths) are around 2.0 in a frequency range of 0.1–50 kHz (Sally and Kelly, 1988; Kilgard et al., 2001; Rutkowski et al., 2003). Therefore, on the basis of the reported Q-values, we determined that the bandwidth parameter values (e.g., $2\sigma_E$) of the A1 model were twice those in the peripheral system model, indicating that the A1 bandwidths were CF-dependent and twice as broad as those in the auditory periphery. All parameters in Eqs. (1a) and (1b) are listed in Table 1.

Table 1. Parameters of the thalamo-cortical projection model

Model component                          Parameter            Value                    Unit(s)
Thalamo-cortical projection
  Excitatory populations                 $H_0$                $1.0 \times 10^{3}$      a.u.*
                                         $H_1$                $5.0 \times 10^{-5}$     a.u.
                                         $\sigma_E$           $b(f_0)$**               Hz
  Inhibitory populations                 $K_0$                $1.0 \times 10^{3}$      a.u.
                                         $K_1$                $5.0 \times 10^{-5}$     a.u.
                                         $\sigma_I$           $b(f_0)$**               Hz
Thalamo-cortical synaptic dynamics
  Excitatory populations                 $a_r$                10                       ms
                                         $a_d$                20                       ms
                                         $a_s$                1.0                      a.u.
  Inhibitory populations                 $b_r$                10                       ms
                                         $b_d$                20                       ms
                                         $b_s$                1.0                      a.u.
Depression                               $\tau_{rec}$         50                       ms
                                         $U_{dep}$            0.01                     1/ms
                                         $U_E$                0.05                     a.u.
                                         $U_I$                0.05                     a.u.
                                         $c$                  10                       a.u.
Synaptic noise                           $g_E$                $1.0 \times 10^{-4}$     a.u.
                                         $g_I$                $1.0 \times 10^{-4}$     a.u.
                                         $\tau_{OU}$          5.0                      ms

* The abbreviation “a.u.” represents arbitrary unit.
** $b(f_0)$ is the bandwidth of the gammatone filter. See Appendix A2.

In this study, the population activity of excitatory or inhibitory neurons in an A1 column was modeled as a pair of nonlinear dynamical systems described by the following pair of coupled FitzHugh–Nagumo (FHN) equations (FitzHugh, 1961; Nagumo et al., 1962):

$$
\tau_E \frac{dm_{E,k}}{dt} = -m_{E,k}\,(m_{E,k} - a_E)(m_{E,k} - 1) - w_{E,k} + I_{E,k}(t), \tag{2a}
$$

$$
\frac{dw_{E,k}}{dt} = \varepsilon_E\,(m_{E,k} - b_E w_{E,k} + c_E), \tag{2b}
$$

and

$$
\tau_I \frac{dm_{I,k}}{dt} = -m_{I,k}\,(m_{I,k} - a_I)(m_{I,k} - 1) - w_{I,k} + I_{I,k}(t), \tag{3a}
$$

$$
\frac{dw_{I,k}}{dt} = \varepsilon_I\,(m_{I,k} - b_I w_{I,k} + c_I), \tag{3b}
$$

where, for the $k$-th column, $m_{E,k}$, $w_{E,k}$, $m_{I,k}$, and $w_{I,k}$ are dynamical variables and $\tau_E$, $\tau_I$, $a_E$, $a_I$, $b_E$, $b_I$, $c_E$, $c_I$, $\varepsilon_E$, and $\varepsilon_I$ are all model parameters. The subscripts E and I


hereafter represent excitatory and inhibitory neural populations, respectively, and together these populations comprise the individual column (Fig. 9Ba). The subscript $k$ is a unit number ranging from 1 to 32 (i.e., $k = 1, \ldots, 32$). In Eqs. (2a)–(3b), we assumed that $m_{E,k}$ corresponded to the fractional fluorescence (ΔF/F0) data obtained from the optical recording, because the cortex contains a larger population of excitatory neurons than inhibitory neurons (Xu et al., 2010; Meyer et al., 2011). Because $m_{E,k}$ may sometimes be negative in the data simulated by the above population model, the final output is thresholded as $[m_{E,k}]_+ = \max(m_{E,k}, 0)$ when compared with observed waveforms of the optical recording; note that this does not affect the temporal evolution of $(m_{E,k}, w_{E,k})$ dictated by the equations. Furthermore, the variables $w_{E,k}$ and $w_{I,k}$ are interpreted as inactivation variables and correspond to the integrated past activation variables $m_{E,k}$ and $m_{I,k}$, respectively. The variables $w_{E,k}$ and $w_{I,k}$ are not observed directly in the recordings, so they are considered to be hidden dynamical variables, and the parameters $b_E$, $b_I$, $c_E$, $c_I$, $\varepsilon_E$, and $\varepsilon_I$ can in principle assume any values.

In the field of neuroscience, single uncoupled FHN equations (i.e., Eqs. (2a, 2b) or (3a, 3b)) are usually considered to be models for action potential generation in a single neuron, but these equations offer flexibility for modeling a wide range of excitable systems. For example, Curto et al. reported that in the AC of anesthetized rats, population responses to simple sound stimuli could be successfully modeled by an individual set of FHN equations (Curto et al., 2009). Note that FHN equations with small values of $\varepsilon_E$ and $\varepsilon_I$ are in general classified into typical neural models with Class 2 excitability (Izhikevich, 2007). For instance, in FHN equations driven by a short-pulse current of $I_{E,k}$ (or $I_{I,k}$), the peak value of $m_{E,k}$ (or $m_{I,k}$) increases continuously with increasing pulse amplitude, and thus the threshold of the activity is not well defined (cf. Fig. 9C; Izhikevich, 2007). In contrast, in neural models with Class 1 excitability, the spike threshold is well defined (Izhikevich, 2007).

Except for $a_E$ and $a_I$, parameter values in the excitatory and inhibitory populations are identical, so the two populations differ only in the threshold level at which they are activated. That is, the inhibitory population in the present model generally has a slightly higher threshold value, because major subtypes of inhibitory interneurons (e.g., fast-spiking neurons) in the cortex have higher spike thresholds than excitatory pyramidal neurons (Tateno et al., 2004; Kloc and Maffei, 2014). All paired parameter values of the coupled FHN equations are listed in Table 2.

Table 2. Parameters of the coupled FHN model

Model parameter      Value                    Unit(s)
$\tau_E$             0.7                      ms
$\tau_I$             0.7                      ms
$a_E$                0.1                      a.u.*
$a_I$                0.2                      a.u.
$b_E$                1.0                      a.u.
$b_I$                1.0                      a.u.
$c_E$                0                        a.u.
$c_I$                0                        a.u.
$\varepsilon_E$      $9.0 \times 10^{-3}$     a.u.
$\varepsilon_I$      $9.0 \times 10^{-3}$     a.u.

* The abbreviation “a.u.” represents arbitrary unit.

In this parameter set, the two variables $m_{E,k}$ and $w_{E,k}$ (or $m_{I,k}$ and $w_{I,k}$) evolve over fast and slow time scales, respectively, during suprathreshold input. Moreover, the slow inactivation variables $w_{E,k}$ and $w_{I,k}$ serve as negative feedback on the corresponding activation variables $m_{E,k}$ and $m_{I,k}$ (Eqs. (2a) and (3a)). Hence, with increased input levels, the activation variables exhibit refractoriness for the next activation, so that the effects mimic depressive (suppressive) responses in the populations within each column. Additionally, for the $k$-th A1 column, $I_{E,k}(t)$ in Eq. (2a) and $I_{I,k}(t)$ in Eq. (3a) represent thalamocortical and/or intracortical inputs, and they are the only external terms driving $m_{E,k}$ and $m_{I,k}$ for the single-column model. Each of $I_{E,k}(t)$ and $I_{I,k}(t)$ is composed of five components, such that

$$
I_{E,k}(t) = I_{E,k,TC}(t) + I^{0}_{E,k,IC}(t) + I^{1}_{E,k,IC}(t) + I^{2}_{E,k,IC}(t) + g_E\, n_E(t), \tag{4a}
$$

$$
I_{I,k}(t) = I_{I,k,TC}(t) + I^{0}_{I,k,IC}(t) + I^{1}_{I,k,IC}(t) + I^{2}_{I,k,IC}(t) + g_I\, n_I(t). \tag{4b}
$$
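To make the single-column dynamics of Eqs. (2a) and (2b) concrete, the following Python sketch integrates one excitatory population driven by a rectangular current pulse. It is a minimal sketch, not the authors' simulation code. The forward-Euler scheme, the step size, the pulse shape, and the function names are assumptions, and ε is read as 9.0 × 10⁻³ (a small value, as the text requires for Class 2 excitability).

```python
import numpy as np

def simulate_fhn(input_current, dt=0.01, tau=0.7, a=0.1, b=1.0, c=0.0, eps=9.0e-3):
    """Forward-Euler integration of Eqs. (2a)-(2b) for one population.
    Time is in ms; eps = 9.0e-3 is an assumed reading of the Table 2 value."""
    n = len(input_current)
    m = np.zeros(n)  # activation variable, starts at rest
    w = np.zeros(n)  # slow inactivation variable
    for i in range(n - 1):
        dm = (-m[i] * (m[i] - a) * (m[i] - 1.0) - w[i] + input_current[i]) / tau
        dw = eps * (m[i] - b * w[i] + c)
        m[i + 1] = m[i] + dt * dm
        w[i + 1] = w[i] + dt * dw
    return m, w

def pulse(amplitude, t_on=10.0, width=5.0, t_total=100.0, dt=0.01):
    """Rectangular current pulse (hypothetical test input, times in ms)."""
    t = np.arange(0.0, t_total, dt)
    return np.where((t >= t_on) & (t < t_on + width), amplitude, 0.0)

if __name__ == "__main__":
    for amp in (0.001, 0.01, 0.05):
        m, _ = simulate_fhn(pulse(amp))
        print(f"pulse amplitude {amp}: peak activation {m.max():.3f}")
```

Running the loop over several pulse amplitudes shows how the peak of the activation variable grows with input strength, which is the behaviour the text attributes to this model class.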

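The first terms on the right-hand sides of Eqs. (4a) and (4b) are synaptic inputs with depression via thalamocortical projections, and they are described as

$$
I_{E,k,TC} = \sum_{l} H_{f(k)g(l)}\, J_{E,l}\, x_{E,l}, \tag{5a}
$$

$$
\frac{d^2 J_{E,l}}{dt^2} = -(a_r + a_d)\frac{dJ_{E,l}}{dt} - a_r a_d\, J_{E,l} + a_s\, s(e_l), \tag{5b}
$$

$$
\frac{dx_{E,l}}{dt} = \frac{1 - x_{E,l}}{\tau_{rec}} - U_{dep}\, x_{E,l}\, h_{U_E}(J_{E,l}), \tag{5c}
$$

and

$$
I_{I,k,TC} = \sum_{l} K_{f(k)g(l)}\, J_{I,l}\, x_{I,l}, \tag{6a}
$$

$$
\frac{d^2 J_{I,l}}{dt^2} = -(b_r + b_d)\frac{dJ_{I,l}}{dt} - b_r b_d\, J_{I,l} + b_s\, s(e_l), \tag{6b}
$$

$$
\frac{dx_{I,l}}{dt} = \frac{1 - x_{I,l}}{\tau_{rec}} - U_{dep}\, x_{I,l}\, h_{U_I}(J_{I,l}), \tag{6c}
$$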

where, for an l-th unit in the subcortical system, xE;l and xI;l are depression factors into the excitatory and inhibitory populations, respectively; JE;l and JI;l are thalamocortical current components; s(el) is the compressed energy measure from the unit l with a CF g(l) in the subcortical system; and ar, ad, as, br, bd, and bs are all parameters of the synaptic dynamics in the thalamocortical projections. In Eqs. (5b) or (6b), dynamics of current components in the excitatory (or inhibitory) population are modeled as an a-function with fast (rising phase) and slow (decaying phase) time constants ar (br) and ad (bd), respectively. For the details, see computational neuroscience textbooks (e.g., Koch, 1999; Ermentrout and Terman, 2010). Additionally, in Eqs. (5c) and (6c), depression is implemented using a phenomenological model proposed by Tsodyks and Markram (1997), although facilitation is not included. In Eqs. (5c) and (6c), hU(x) is a function such that hU(x) = c(x  U) for x P U and c is a constant; otherwise, hU(x) = 0. Therefore, in response to very small input with a level below UE or UI,

64

M. Noto et al. / Neuroscience 318 (2016) 58–83

the depression does not affect the model. Also, once the input ($J_{E,l}$ or $J_{I,l}$) to the depression factor model of Eqs. (5c) and (6c) is reduced below the level $U_E$ or $U_I$ after the onset of depression, the depression factors $x_{E,l}$ and $x_{I,l}$ recover toward their resting states. All parameter values in Eqs. (5) and (6) are listed in Table 1. Because the excitatory and inhibitory neural populations receive synaptic inputs through intracolumnar connections, the synaptic inputs $I^{0}_{E,k,IC}(t)$ and $I^{0}_{I,k,IC}(t)$ are sums of two contributions: (i) one from self-connections with a synaptic weight $J^{0}_{EE}$ (i.e., for the connection of the excitatory population to itself) and with a synaptic weight $J^{0}_{II}$ (for that of the inhibitory population to itself), and (ii) the other, correspondingly, from counter-connections with synaptic weights $J^{0}_{IE}$ and $J^{0}_{EI}$ (see Fig. 9Ba). Thus, $I^{0}_{E,k,IC}(t)$ and $I^{0}_{I,k,IC}(t)$ for the $k$-th column are described as

$$I^{0}_{E,k,IC} = J^{0}_{EE}\, h_0(m_{E,k}) + J^{0}_{IE}\, h_0(m_{I,k}), \tag{7a}$$

$$I^{0}_{I,k,IC} = J^{0}_{EI}\, h_0(m_{E,k}) + J^{0}_{II}\, h_0(m_{I,k}), \tag{7b}$$

where $h_0(x)$ is a function such that $h_0(x) = x$ for $x \ge 0$; otherwise, $h_0(x) = 0$. Similarly, we assumed that intercolumnar connections to one column originate from the excitatory and inhibitory populations within its two nearest neighbor columns on each side. That is, an excitatory (or inhibitory) population in column $k$ has symmetric synaptic connections to the excitatory and inhibitory populations of columns $k \pm 1$ and $k \pm 2$ with connection weights $J^{1}_{EE}$ ($J^{1}_{IE}$), $J^{1}_{EI}$ ($J^{1}_{II}$), $J^{2}_{EE}$ ($J^{2}_{IE}$), and $J^{2}_{EI}$ ($J^{2}_{II}$), respectively (Fig. 9Bb). Therefore, intercolumnar synaptic inputs $I^{j}_{E,k,IC}(t)$ and $I^{j}_{I,k,IC}(t)$ ($j = 1, 2$) into the $k$-th column are described as

$$I^{j}_{E,k,IC} = J^{j}_{EE}\left[h_0(m_{E,k-j}) + h_0(m_{E,k+j})\right] + J^{j}_{IE}\left[h_0(m_{I,k-j}) + h_0(m_{I,k+j})\right], \tag{8a}$$

$$I^{j}_{I,k,IC} = J^{j}_{EI}\left[h_0(m_{E,k-j}) + h_0(m_{E,k+j})\right] + J^{j}_{II}\left[h_0(m_{I,k-j}) + h_0(m_{I,k+j})\right]. \tag{8b}$$
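As an illustration of how Eqs. (7) and (8) combine the rectified population activities, the following minimal sketch evaluates the intracolumnar and intercolumnar inputs for an array of 32 columns. The weight values and the activity vectors are hypothetical placeholders (the values used in the study are those listed in Table 3), and the treatment of the array edges, where columns outside the array simply contribute nothing, is an assumption that is not stated in the text.

```python
# Sketch of Eqs. (7)-(8). Weights and activity values below are hypothetical
# placeholders; columns outside the array contribute nothing (an assumption).

def h0(x):
    """Threshold-linear function: h0(x) = x for x >= 0, otherwise 0."""
    return x if x >= 0.0 else 0.0


def intracolumnar_input(m_e, m_i, j0):
    """Eqs. (7a)-(7b): input from the two populations of the same column."""
    i_e = [j0["EE"] * h0(e) + j0["IE"] * h0(i) for e, i in zip(m_e, m_i)]
    i_i = [j0["EI"] * h0(e) + j0["II"] * h0(i) for e, i in zip(m_e, m_i)]
    return i_e, i_i


def intercolumnar_input(m_e, m_i, jj, offset):
    """Eqs. (8a)-(8b): input from columns k - offset and k + offset."""
    n = len(m_e)

    def neighbor_sum(m, k):
        # Sum rectified activity of the two columns at distance `offset`.
        return sum(h0(m[idx]) for idx in (k - offset, k + offset)
                   if 0 <= idx < n)

    i_e = [jj["EE"] * neighbor_sum(m_e, k) + jj["IE"] * neighbor_sum(m_i, k)
           for k in range(n)]
    i_i = [jj["EI"] * neighbor_sum(m_e, k) + jj["II"] * neighbor_sum(m_i, k)
           for k in range(n)]
    return i_e, i_i


# Hypothetical example: only one column is active; its inhibitory variable
# is negative and is therefore rectified to zero by h0.
weights = {"EE": 0.5, "EI": 0.25, "IE": 0.4, "II": 0.1}
m_e = [0.0] * 32
m_i = [0.0] * 32
m_e[12] = 1.0
m_i[12] = -0.3

intra_e, intra_i = intracolumnar_input(m_e, m_i, weights)
inter_e, inter_i = intercolumnar_input(m_e, m_i, weights, offset=1)
```

In this toy case the active column drives itself through the intracolumnar terms and drives its two immediate neighbors through the intercolumnar terms with `offset=1`; calling the second function with `offset=2` gives the second-nearest-neighbor contribution.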

Because the structure and some connectivity between neural populations in the A1 model are similar to those proposed by Loebel et al. (2007), values of connection weights in our model were practically determined on the basis of their A1 model. However, the detailed structure of our model is different from that of Loebel et al.; for example, synaptic depression in intercolumnar connections is not explicitly included in our model. In addition, the connection weights in the present model are partially based on the connection probability between A1 neurons and synaptic conductances in the model proposed by Levy and Reyes (2011). To properly adjust other parameter values, we first numerically obtained a 2D-bifurcation diagram (Fig. 10Bb) to find activation and propagation properties in the whole network. We then fixed a set of network parameters for which the activity propagated perfectly, choosing a point in the diagram near the border between the areas of locally evoked activity and perfectly propagating activity (PA) (i.e., the point represented as '+' in Fig. 10Bb); for more details, see Tables 1–3.

Finally, synaptic fluctuation as background noise is modeled as Ornstein–Uhlenbeck processes with zero means (Destexhe et al., 2001; see Appendix A for more details). To analyze the onset of population activity, activation times were used to describe evoked responses to stimulation of the excitatory population in each A1 column. By definition, activation times are time points at which each variable $m_{E,j}$ of the excitatory populations for $j = 1, \ldots, 32$ rises above the threshold that is 60% of the following post-stimulation peak. Response latency is defined as the time interval from the stimulation onset to the first activation time. In the 32 A1 columns, time series data of activation times are divided into time windows (called bins) of duration $\Delta t$, and inside each bin, the spatial distribution of activity over all columns represents a frame (Pasquale et al., 2008). In addition, to characterize global activity in the A1 model for all trials (80 stimulus presentations) of PGP, PBP, and PNP stimuli, the total count of activation times in a frame over all columns, denoted by $M(t)$, was calculated after summing the numbers of activation times in each column. A global response ratio is defined as

$$R_{\mathrm{global}} = 100\,\frac{M(s_2)}{M(s_1)},$$

where $s_1 = \arg\max_{0 \le t < T_s} M(t)$ and $s_2 = \arg\max_{t \ge T_s} M(t)$, and $T_s$ is the sum of the first tone-burst duration (50 ms) and the interval (dS, dB, or dN) of the two successive tone bursts; in other words, $T_s$ is the onset time of the second tone burst. In column $j$, moreover, the numbers of activation times at times $t_1$ and $t_2$ over all trials are respectively represented as $N_j(t_1)$ and $N_j(t_2)$, where $0 \le t_1 < T_s$ and $t_2 \ge T_s$. A local response ratio ($R_{\mathrm{local}}$) is defined as

$$R_{\mathrm{local}} = 100\,\frac{\sum_{j=12,13,14} N_j(t^{13}_2)}{\sum_{j=12,13,14} N_j(t^{13}_1)}, \tag{10}$$

where $t^{13}_1 = \arg\max_{0 \le t_1 < T_s} N_{13}(t_1)$ and $t^{13}_2 = \arg\max_{t_2 \ge T_s} N_{13}(t_2)$.
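The response-ratio definitions can be made concrete with a short sketch. The code below bins activation times per column, forms the frame counts $M(t)$ and $N_j(t)$, and evaluates the global and local ratios. The bin width, the toy activation times, and all function names are hypothetical; the sketch also assumes that $T_s$ is a multiple of the bin width and that at least one activation occurs before $T_s$ (otherwise the ratios are undefined). The threshold-crossing helper simplifies the definition in the text to a single peak per trace.

```python
# Sketch of the activation-time analysis. All data and names are hypothetical.

def activation_time(trace, dt, fraction=0.6):
    """First time at which the trace rises above `fraction` of its peak."""
    threshold = fraction * max(trace)
    for n, value in enumerate(trace):
        if value > threshold:
            return n * dt
    return None


def bin_counts(activation_times, bin_width, n_bins):
    """counts[j][b]: number of activation times of column j+1 in bin b."""
    counts = [[0] * n_bins for _ in activation_times]
    for j, times in enumerate(activation_times):
        for t in times:
            b = int(t // bin_width)
            if 0 <= b < n_bins:
                counts[j][b] += 1
    return counts


def argmax_bin(values, lo, hi):
    """Index of the largest value among bins lo..hi-1 (first one on ties)."""
    return max(range(lo, hi), key=lambda b: values[b])


def response_ratios(counts, bin_width, t_s, center=13, columns=(12, 13, 14)):
    """Global and local response ratios (in %); columns are numbered from 1."""
    n_bins = len(counts[0])
    split = int(t_s // bin_width)          # first bin with t >= T_s
    m = [sum(col[b] for col in counts) for b in range(n_bins)]
    s1 = argmax_bin(m, 0, split)
    s2 = argmax_bin(m, split, n_bins)
    r_global = 100.0 * m[s2] / m[s1]

    n_center = counts[center - 1]
    t1 = argmax_bin(n_center, 0, split)
    t2 = argmax_bin(n_center, split, n_bins)
    numerator = sum(counts[j - 1][t2] for j in columns)
    denominator = sum(counts[j - 1][t1] for j in columns)
    r_local = 100.0 * numerator / denominator
    return r_global, r_local


# Toy data: 32 columns, activation times (ms) pooled over trials.
# T_s = 50 ms tone + 120 ms interval = 170 ms.
times = [[] for _ in range(32)]
times[12] = [12.0, 13.0, 14.0, 15.0, 182.0, 183.0]   # column 13
times[11] = [14.0, 15.0, 184.0]                      # column 12
times[13] = [14.0, 15.0, 184.0]                      # column 14

counts = bin_counts(times, bin_width=10.0, n_bins=30)
r_global, r_local = response_ratios(counts, bin_width=10.0, t_s=170.0)
first_crossing = activation_time([0.0, 0.2, 0.5, 0.7, 1.0, 0.6], dt=1.0)
```

With these toy numbers, eight activations fall in the peak frame before $T_s$ and four in the peak frame after it, so both ratios equal 50%.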
Note that because the BF of column 13 is 8 kHz, $R_{\mathrm{local}}$ represents the ratio between the first and second responses, summed over the column tuned to the 8-kHz tone bursts and its nearest neighbors.

Table 3. Parameters of the cortical network model

Model component                                              Parameter       Value
Intracolumnar connection weight                              $J^{0}_{EE}$    $6.5 \times 10^{-2}$
                                                             $J^{0}_{EI}$    $3.2 \times 10^{-2}$
                                                             $J^{0}_{IE}$    $0.11$
                                                             $J^{0}_{II}$    $2.0 \times 10^{-2}$
Intercolumnar connection weight (nearest neighbors)          $J^{1}_{EE}$    $2.3 \times 10^{-2}$
                                                             $J^{1}_{EI}$    $1.1 \times 10^{-2}$
                                                             $J^{1}_{IE}$    $1.9 \times 10^{-2}$
                                                             $J^{1}_{II}$    $3.5 \times 10^{-3}$
Intercolumnar connection weight (second nearest neighbors)   $J^{2}_{EE}$    $2.3 \times 10^{-3}$
                                                             $J^{2}_{EI}$    $1.9 \times 10^{-3}$
                                                             $J^{2}_{IE}$    $1.1 \times 10^{-3}$
                                                             $J^{2}_{II}$    $3.5 \times 10^{-4}$

Statistical analyses

All statistics and error bars are reported as mean ± standard error of the mean (SEM). Statistical significance was assessed with the paired two-tailed Student's t-test unless otherwise noted. In addition, Bonferroni's multiple comparison test was performed, if necessary, using Matlab (MathWorks, USA). Values of P less than 0.05 were considered statistically significant.
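For readers who want to retrace the summary statistics, the following sketch computes a mean ± SEM, the paired t statistic with its degrees of freedom, and a Bonferroni-adjusted significance threshold. It uses only the Python standard library rather than Matlab, the numbers are made-up values for illustration (not data from this study), and the P value itself, which would come from a t distribution with n − 1 degrees of freedom, is not computed here.

```python
import math
import statistics


def mean_sem(values):
    """Mean and standard error of the mean (sample SD / sqrt(n))."""
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return statistics.mean(values), sem


def paired_t(x, y):
    """Paired t statistic for y - x and its degrees of freedom (n - 1)."""
    diffs = [b - a for a, b in zip(x, y)]
    mean_d, sem_d = mean_sem(diffs)
    return mean_d / sem_d, len(diffs) - 1


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni correction."""
    return alpha / n_comparisons


# Made-up paired measurements from four hypothetical subjects.
x = [10.0, 12.0, 14.0, 16.0]
y = [11.0, 14.0, 15.0, 18.0]

mean_x, sem_x = mean_sem(x)
t_value, dof = paired_t(x, y)
alpha_per_test = bonferroni_alpha(0.05, 4)
```

The Bonferroni step shown here simply divides the significance level by the number of comparisons; whether the original analysis adjusted the threshold or the P values is not specified in the text.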

RESULTS

Multiple auditory fields

Here, we first describe how five auditory fields in the core and belt areas were determined by differences in tonotopic or non-tonotopic organization and response latency. At the beginning of each experiment, the primary field A1 and several other unknown auditory areas were activated in response to tone-burst stimulation (Fig. 1A). To reconstruct tonotopic maps, the response areas of different frequencies (1, 2, 4, 8, and 16 kHz) were clipped at 60% of their peak responses (Fig. 1Ba–e), and the latency of the peak response from the onset of stimulation was calculated (Fig. 2B and Table 4). On the basis of the response pattern, the response areas were superimposed onto each other from low to high frequencies (Fig. 1Bf). From these patterns, the core fields A1 and the anterior auditory field (AAF) were easily recognized by their mirror-imaged tonotopic organization (Fig. 1Bf). Briefly, the method we used to determine the tonotopic maps was as follows: (i) the center positions and surrounding central islands of the initial response patterns to tone bursts at each frequency were determined (e.g., Fig. 1Ba–e); (ii) a tonotopic axis perpendicular to isofrequency lines on the center positions was drawn to approximately represent the center positions in a line (i.e., a linear approximation); (iii) the central islands were inflated in the direction of the isofrequency lines and perpendicular to the tonotopic axis; and finally, (iv) to determine borders between the A1 and AAF, two ellipses respectively covering each set of the extended islands of the A1 and AAF were drawn until they overlapped. However, the inflation toward the high frequencies beyond 16 kHz was somewhat arbitrary because we did not record responses representing frequencies over 16 kHz in these experiments. Thus, as shown in Fig. 1Bf, the AAF was located at the anterior area of the A1.
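The clipping and latency steps used for the maps can be sketched as follows. This is a minimal illustration on a tiny hypothetical response map and time course, not the analysis code used in the study; in particular, treating values equal to the 60% level as inside the response area and measuring latency in frame units from a given onset index are assumptions.

```python
# Sketch of response-area clipping and peak latency. All data are hypothetical.

def clip_response_area(peak_map, fraction=0.6):
    """Boolean mask of pixels whose response is >= fraction of the map peak."""
    peak = max(max(row) for row in peak_map)
    return [[value >= fraction * peak for value in row] for row in peak_map]


def peak_latency(trace, dt_ms, onset_index=0):
    """Time (ms) from stimulus onset to the largest response in the trace."""
    idx = max(range(onset_index, len(trace)), key=lambda i: trace[i])
    return (idx - onset_index) * dt_ms


# Hypothetical 3 x 3 map of peak responses and one pixel's time course.
peak_map = [
    [0.10, 0.50, 0.20],
    [0.30, 1.00, 0.70],
    [0.00, 0.65, 0.40],
]
mask = clip_response_area(peak_map)
area_pixels = sum(cell for row in mask for cell in row)
latency_ms = peak_latency([0.0, 0.0, 0.1, 0.4, 0.9, 0.5], dt_ms=2.0,
                          onset_index=1)
```

Applying the mask separately to the map for each tone frequency and overlaying the results corresponds to the superimposition step described above.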
In the A1 and AAF, the response area of each frequency had a roughly dorsoventral band-like appearance. In the AAF, the lower frequencies were located anteriorly and the higher frequencies posteriorly, and vice versa for the A1. The response bands overlapped to some extent (around 500 ± 100 μm at 70 dB SPL). The overlap of the response bands to the different PTs in the A1 was larger than that in the AAF, and the borders between bands in the AAF were more obscure than those in the A1. The response patterns in the belt area surrounding the A1 and AAF had more spot-like properties than those in the A1 and AAF. On the basis of the type of organization (tonotopic or non-tonotopic) and the response latency and duration, up to five fields, including the A1 and AAF, could be tentatively identified in the core and belt areas of rats, as reported in electrophysiological studies (e.g., Polley et al., 2007). According to the dorsoventral and anterior–posterior positions of these fields in the belt area, they were previously named the ventral auditory field (VAF), posterior auditory field (PAF), and anterior ventral auditory field (AVAF) (Figs. 1C, 2A, B), although the PAF was classified as part of the core area in some reports (Doron et al., 2002). However, precisely defining the core and belt areas is outside the scope of this study, because many electrophysiological studies (e.g., Doron et al., 2002) have shown that the PAF shares features of both the core and belt areas; i.e., somewhat more broadly tuned cells, longer response latencies, and less pronounced tonotopy (for a review, see Kaas, 2011). In the present study, we classified the PAF as part of the belt area because of its longer response latency (see below), but this classification was not critical for the following analysis. In the tonotopic organization of the belt region, the direction of the frequency axis in the VAF was similar to that in the A1, and that in the PAF was similar to that in the AAF (Fig. 1C). However, not all belt fields were organized tonotopically in all animals, particularly the AVAF. Thus, in response to PTs, after excluding the response areas in the A1 and AAF, borders of the fields in the belt area were based on response patterns in other response areas. Patterns of activation corresponding to these multiple fields were observed in all eight animals, but showed some variation in their shapes and exact locations.
Furthermore, the response latency and duration differed among the multiple auditory fields (Fig. 2). Short-latency responses were seen first in the core fields A1 and AAF (35.8 ± 2.2 ms and 34.1 ± 2.0 ms, respectively, for n = 8; Fig. 2Ab1,b2 and Table 4). The VAF and AVAF were activated with a slightly longer latency (57.2 ± 2.3 ms and 54.3 ± 3.1 ms, respectively; Fig. 2Ab3,b5 and Table 4). After this, the activity appeared in the PAF (63.9 ± 3.0 ms), as shown in Fig. 2Ab4. The duration of the response was longest in the VAF (86.8 ± 6.1 ms), shorter in the A1 and AAF (85.6 ± 5.4 and 86.0 ± 5.2 ms), and shortest in the AVAF and PAF (79.9 ± 5.9 and 77.5 ± 5.5 ms), although there were some variations from animal to animal. All the results are summarized in Table 4. With respect to the frequency dependency of the response latency, each area and field had specific properties (Fig. 2B). In response to PT bursts with lower frequencies (e.g., 1 and 2 kHz), the PAF was activated with a latency longer than 80 ms after the stimulation onset (Fig. 2Ba,b). In contrast, in response to PT bursts with higher frequencies (8 and 16 kHz), the VAF and AVAF were activated at a longer latency (Fig. 2Bd,e). Thus, areas in the belt field were also characterized by


Fig. 2. (A) In (a), a surface blood vessel pattern captured by the CMOS camera during optical imaging. Five dashed ellipses indicate the A1 (blue), AAF (red), VAF (yellow), PAF (green), and AVAF (orange). In (b1–5), five time courses of the response to an 8-kHz tone-burst at five sites in the five fields are shown. The five recording sites are indicated in (a) by dotted lines and small circles. The five short horizontal bars show the stimulus timing. (B) In (a–e), schematic areas of the A1 and AAF, which were obtained from the tonotopic organization and response latency maps from the onset of the stimulation, are superimposed on each figure. The latency (20–80 ms) in response to 1–16 kHz tone stimuli is color coded as indicated by the bar on the bottom of the figure. In (f), three belt fields (VAF, PAF, and AVAF) are added on the basis of the results of latency maps. This activation map is identical to Fig. 1Bf. Abbreviations are the same as those in Fig. 1. Note the activation of multiple auditory areas with different response delays and duration. All recordings were from the same animal in Sample #06. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Table 4. Latency and duration of the responses in five fields

Field    Latency (ms)     Duration (ms)
A1       35.8 ± 2.2       85.6 ± 5.4
AAF      34.1 ± 2.0       86.0 ± 5.2
VAF      57.2 ± 2.3*      86.8 ± 6.1
PAF      63.9 ± 3.0*      77.5 ± 5.5
AVAF     54.3 ± 3.1*      79.9 ± 5.9

Data are mean ± SEM for eight animals (n = 8). * P < 0.05 vs. A1 by t-test.

differences in frequency dependency and response latency.

Responses to silent gap-inserted tones

After we identified the multiple auditory fields in the core and belt areas, we sought to understand the neural

correlates of auditory (temporal) induction (Warren et al., 1972; Warren and Bashford, 1999) by investigating response patterns to the three types of stimuli: PGP, PBP, and PNP (see Experimental procedures). First, neural responses to PT bursts with a silent gap (PGPs) were recorded and analyzed. Overall, the responses depended


on the length (dS) of the silent gap. In the experiments, the second responses corresponded to the onset of the second tone after the gap, but not the onset of the gap (Fig. 3A; see also Kubota et al., 2012). The onset times of the second response varied according to the onset times of the second tone, as described below. Thus, the second responses are likely to indicate the presence of the second tone stimulus after the gap. Evoked responses in the core and belt fields resulting from PGPs with two-gap durations (dS = 120 and 180 ms) are shown


in Fig. 3B. In these fields, when the length of the silent gap was less than 100 ms, PGPs elicited first phasic responses but no second responses (Fig. 7B). Even when the length of the silent gap was 120 ms, the second responses were less than half the intensity of the first responses (Figs. 3B1, 7B). In contrast, when the length of a silent gap was longer than 180 ms, PGPs elicited second phasic responses that were around 80% of the intensity of the first phasic responses in the A1 30–40 ms after the onset of the second tone burst (Figs. 3C2, 7). The

Fig. 3. (A) In (a), a surface view captured by the CMOS camera during optical imaging. Five dashed ellipses indicate the auditory areas: A1 (blue), AAF (red), VAF (yellow), PAF (green), and AVAF (orange). In (b1–5), in response to the stimulus of two 8-kHz tone bursts with a silent gap dS = 120 ms (PGP), five time courses of the response at five sites in the five fields are shown. The five recording sites are indicated in (a) by dotted lines and small circles. Five pairs of two horizontal bars also show the stimulus timing of the two tone bursts. Data were obtained from an animal in Sample #05, which was different from the animal shown in Fig. 2. (B) Response patterns to PGP. Durations (dS) of the silent gap were respectively 120 and 180 ms in B1 and B2. In the leftmost panel, five schematic areas of the A1, AAF, VAF, PAF, and AVAF, which were obtained from the tonotopic organization as in A(a), are superimposed just before the onset of the sound stimulation. The response amplitude ratios to the maximum response are color coded as indicated by the bar on the right. Times after the onset of the first or second tone burst (60 and 90 ms) are shown as numbers in ms at the top for each panel. (C) For the AC of the same animal, in response to PGP stimuli, spatial patterns of the local maximum responses (R1 and R2) after the first (a) and second (b) tone bursts and the ratio (R2/R1, (c)) between them are shown. R1 and R2 represent the relative ratios of the global maximum response to the first tone burst. In C1, the length (dS) of the silent gap was 120 ms as indicated on the left. The responses in the core areas are larger than those in the belt fields. C2 shows the same information as C1, but with dS = 180 ms. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)


minimum length of the silent gap that elicited second responses was 10–20 ms longer in the belt fields than in the core fields. In the VAF and PAF, furthermore, the second responses were less than half the intensity of the first responses for the silent gap of 120 ms (Figs. 3C1, 7B). However, the second responses increased in intensity as the length of the silent gap increased. When the length of the gap increased from 140 to 180 ms, the second responses were elicited in all fields and their intensities were over half those of the first responses in all the fields (Figs. 3C2, 7C–E). In particular, a response difference between the AVAF and VAF was found because neural responses were activated intensively in the AVAF (cf. Figs. 3C2, 7D).

Responses to BN-inserted tones

Next, to determine the neural representation of auditory induction in the rat AC, we investigated response patterns to PBP stimulation. BN covering the frequency components of a tone burst used before and after the silent gap is known to cause auditory (temporal) induction (Warren et al., 1972; Warren and Bashford, 1999). That is, if the second responses indicate the presence of the second tone after the gap, reduction of the second responses would be expected when the silent gap is replaced with BN. Fig. 4A shows such an example of BN affecting the second responses (cf. Fig. 3A). As expected, BN-inserted tones (PBPs) increased the minimum length of the gap at which the second responses were elicited (Fig. 4B). In both the core and belt fields, PBPs also considerably reduced the intensities of the second responses, especially at gap lengths (dB) of 160 ms or less (i.e., dB ≤ 160 ms; Fig. 4Ba,b). In the A1 and AAF of the core area, in particular, almost no second responses were elicited when the gap length dB was less than or equal to 160 ms. Also, in the VAF and PAF, less than 40% of second responses were elicited when the gap length was less than 180 ms (Figs. 4Ba–b, 7B–D).
Thus, for all the gap length conditions, small second responses were evoked in the VAF and PAF. When the noise length was increased over 180 ms, second responses were elicited in the A1, AAF, and AVAF, although a patchy activation pattern in the PAF emerged beginning 90 ms after the second tone in some animals for dB = 200 ms (Figs. 4Bc, 7E). In the AVAF, second responses were particularly sustained over 30 ms. In all experiments, these tendencies were similar in all eight animals with respect to the response amplitude ratio (R2/R1); typical examples from three animals are shown in Fig. 5. However, there were some variations in activation patterns from animal to animal owing to specific locations of the multiple auditory fields.

Responses to NN-inserted tones

Finally, to gain insights into the neural mechanism of auditory induction in the AC, we investigated response patterns to PNP stimulation. If second responses indicate the presence of the second tone after the gap, smaller second responses would be observed when NN, rather than BN, is added to the silent gap. Because NN lacks the frequency components of the tone used before and after the gap, the phenomenon of decreasing (increasing) the second response is considered to be a reduction (enhancement) of the continuity illusion or of the auditory continuity effects (Warren et al., 1972). This phenomenon has already been reported in other species, namely cats (Sugita, 1997), monkeys (Petkov et al., 2003), guinea pigs (Kubota et al., 2012), and gerbils (Kobayasi et al., 2012). In such experiments, because one of the critical parameters was the bandwidth of NN (3 dB from the baseline sound pressure level), we used an 8-kHz bandwidth (4 kHz below and above a center frequency of 8 kHz) on the basis of the results obtained from the guinea pig AC (Kubota et al., 2012). Overall, the second responses to PNP stimulation clearly emerged both in the core and belt fields when the noise length was over 140 ms (Figs. 6A, 8B). When the second responses to PNP were compared with those to PGP, the amplitude ratio curves in response to PNP were similar in shape to those in response to PGP (e.g., d = 160 ms in Figs. 6Bb, 8A). Thus, under this particular condition, PNP partially reversed the suppressive effects of PBP on the second responses, as expected. In the A1 and AAF of the core area, PBP elicited no second responses when the gap length (dB) was less than 160 ms (Fig. 7B–D). In contrast, in the AAF, PNP induced second responses when the gap length (dN) was over 140 ms (Fig. 7C–E). In addition, in the belt fields, PGP elicited no second responses when the gap length (dS) was less than 140 ms. However, PNP elicited activation locally in the VAF and AVAF under the same condition. The results regarding the amplitude ratios between the first and second response peaks to PGP, PNP, and PBP are summarized in Fig. 8B. The amplitudes of the second responses were generally higher with the silent gap (PGP) than with PNP and PBP.
When the NN length (dN) was more than 140 ms, the second responses following PNP were monotonically greater in all fields compared with PBP. In addition, when using the NN, the amplitude reduction of the second responses varied among the belt fields. That is, in the AVAF and PAF, the reduction was unclear for almost all noise lengths. In contrast, amplitude reduction in the VAF can be seen over 160 ms, although the variance around the average is relatively large.

Computational model mimicking the population activity underlying auditory induction

The activity patterns in the AC differed in response to the PGP, PBP, and PNP stimulation described above. To understand the neuronal mechanisms underlying these differences, we constructed a simple network model to mimic the activity patterns (see Experimental procedures and Appendix A). Although this is a phenomenological model that is not based fully on the physical and physiological details of the auditory pathways, the structure consists of three auditory subsystems: peripheral, subcortical, and cortical networks (Fig. 9A). From the peripheral subsystem to the subcortical subsystem, there exist only ascending


Fig. 4. (A) In (a), a surface view captured by the CMOS camera during optical imaging. Five dashed ellipses indicate the auditory areas: A1 (blue), AAF (red), VAF (yellow), PAF (green), and AVAF (orange). In (b1–5), in response to the stimulus of two 8-kHz tone-bursts with a broadband noise (PBP), five time courses of the response at five sites in the five fields are shown. The five recording sites are indicated in (a) by lines and small circles. The stimulus timings of the PT bursts and BN are indicated by black lines and gray boxes, respectively. Data were obtained from the same animal shown in Fig. 3. (B) Response patterns to PBP. Inter-burst BN durations (dB) were respectively 120, 160, and 200 ms in (a), (b), and (c). In the leftmost panel, schematic areas of the five fields, which were obtained from the tonotopic organization and latencies as in (A), are superimposed just before the sound stimulation. The response amplitude ratios to the maximum response are color coded as indicated by the bar on the bottom of the figure. Times after the onset of the first or second tone-bursts (60 and 90 ms) are shown as numbers in ms at the top for each panel. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

(feedforward) projections and no recurrent or descending connections. In the peripheral subsystem, sound signals are filtered by a bandpass filter emulating the outer and middle ears, and the output is subsequently filtered by a bank of 32 GFs, which can mimic cochlear functions such as band-limited filtering and tonotopic mapping (see Experimental procedures and Appendix A). Each GF (labeled as an individual channel in Fig. 9A) linearly processes the frequency components of input sound signals. Next, in the subcortical subsystem, a set of energy measures $e_{g_i}$ at a BF $g_i$ ($i = 1, \ldots, 32$) is calculated as the output of an array consisting of 32 units (Fig. 9A). The energy measures give a spatial representation of the powers (squares of frequency components) contained in the sound signal at the corresponding place (or BF unit). Subsequently, the energy measures are nonlinearly compressed at each unit (see Appendix A for more details). In the cortical model, an array of 32 columns is located in the A1, and the individual columns are similarly labeled from 1 to 32; each label represents the column placement along the A1 tonotopic map, corresponding to BFs at


Fig. 5. Spatial patterns of the response ratio (R2/R1) are shown for three different animals: Rats #1, #2, and #3 with three inter-burst BN durations (dB = 120, 160, and 200 ms) of the BN between the PT bursts. (A) Rat #1. In the leftmost panel, five schematic fields (A1, AAF, VAF, PAF, and AVAF), which were obtained from the tonotopic organization and the latency of the response, are superimposed on the surface image during the recording. In (b), (c), and (d), the durations (dB) of the broadband noise are shown as numbers in ms at the top for each panel, and the ratio (R2/R1) is color coded as indicated by the bar on the bottom of the figure. (B) Rat #2. Panels are represented in the same way as those in (A). (C) Rat #3. Panels are represented in the same way as those in (A) and (B). When the duration (dB) was over 160 ms, the R2/R1 ratio in the AAF and AVAF was larger than in other fields and the response was almost equal to the maximum, indicating that the second tone burst evoked almost the same level of neural activity as the first tone burst did. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

frequencies ranging from 1.0 to 30 kHz (Fig. 9A). In addition, the compressed energy measure in the thalamic layer is fed forward in a BF-specific one-to-many manner into the column array in the A1 recipient layers via thalamocortical projections. Conversely, this means that one column in the A1 array receives feedforward inputs from many units in the thalamic layers. In each column, for simplicity, our model specifies that the activities of excitatory and inhibitory neurons are represented as a pair of coupled neural populations (indicated by black and white circles in Fig. 9A, B), whose time-evolution rules are described as a nonlinear dynamical system (see also Experimental procedures). In thalamocortical projections, weighted averages of the individual inputs from the thalamic layers were computed using weight values $H_{fg}$ and $K_{fg}$ for the excitatory and inhibitory populations, respectively (Fig. 9A). The weight values were distributed in accordance with a Gaussian function, and the weights of the excitatory and inhibitory populations were co-tuned (see Experimental procedures). In the thalamocortical synaptic connections, synaptic depression was explicitly incorporated into the present network model, which was a modified version of the model proposed by Tsodyks and Markram (1997).
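To illustrate the one-to-many thalamocortical projection with co-tuned Gaussian weights, the sketch below computes a weighted average of the compressed energy measures for each cortical column. The Gaussian is defined over unit index rather than frequency, the two widths are hypothetical, and normalizing the weights so that they sum to one is an assumption made so that the result is a weighted average; the actual weights $H_{fg}$ and $K_{fg}$ are specified in the Experimental procedures. Synaptic depression is deliberately left out.

```python
import math

# Sketch of co-tuned Gaussian thalamocortical weights. Widths are hypothetical.

def gaussian_weights(center, n_units, sigma):
    """Normalized Gaussian weights over thalamic units, centered on `center`."""
    raw = [math.exp(-0.5 * ((u - center) / sigma) ** 2) for u in range(n_units)]
    total = sum(raw)
    return [r / total for r in raw]


def thalamocortical_drive(compressed_energy, sigma_e=2.0, sigma_i=3.0):
    """Weighted-average input to the E and I populations of every column."""
    n = len(compressed_energy)
    drive_e, drive_i = [], []
    for k in range(n):
        # Co-tuning: both weight profiles share the same center (column k).
        h = gaussian_weights(k, n, sigma_e)
        kk = gaussian_weights(k, n, sigma_i)
        drive_e.append(sum(w * s for w, s in zip(h, compressed_energy)))
        drive_i.append(sum(w * s for w, s in zip(kk, compressed_energy)))
    return drive_e, drive_i


# A single active thalamic unit (index 12) spreads to neighboring columns.
energy = [0.0] * 32
energy[12] = 1.0
drive_e, drive_i = thalamocortical_drive(energy)

# A spatially uniform input is passed through unchanged by the averaging.
uniform_e, uniform_i = thalamocortical_drive([1.0] * 32)
```

In this toy example the excitatory and inhibitory drives peak at the same column, which is what co-tuning means here, while the broader hypothetical inhibitory profile spreads the input over more columns.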

In addition, neural populations in one column receive synaptic inputs, via intracortical connections, from themselves and from populations in the two nearest adjacent columns (Fig. 9Ba–b). Concretely, in intracolumnar connections, the excitatory and inhibitory populations receive two contributions: (i) one from self-connections (both excitatory-to-excitatory and inhibitory-to-inhibitory connections) and (ii) the other, correspondingly, from counter-connections (see Experimental procedures and Fig. 9Ba). Furthermore, we assumed that the intercolumnar connections to each column originate from the excitatory and inhibitory populations in its two nearest neighbor columns. That is, one excitatory (or inhibitory) population in column k has symmetric synaptic connections to both excitatory and inhibitory populations of columns k ± 1 and k ± 2 (Fig. 9Bb). In both the intra- and intercolumnar connections, synaptic depression was not directly incorporated into the model. However, phenomena reminiscent of synaptic depression occur in the neural population dynamics within each column, as stated below. As discussed above, auditory induction is considered to be a nonlinear phenomenon. Therefore, we considered modeling the activity of neural populations in one A1 column as a nonlinear and excitable dynamical system.
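The behavior of an excitable system under paired stimulation can be sketched with the classic two-variable FitzHugh–Nagumo equations. The code below is not the population model of this study: it uses the textbook form $dv/dt = v - v^3/3 - w + I$, $dw/dt = \varepsilon\,(v + a - b\,w)$ with the commonly quoted values $a = 0.7$, $b = 0.8$, $\varepsilon = 0.08$, dimensionless time, forward-Euler integration, and hypothetical pulse settings. It computes the resting state for a given constant current and the peak responses to two identical pulses separated by a chosen interval.

```python
# Sketch of an excitable FitzHugh-Nagumo unit (textbook form and parameters,
# not the equations or values of the study). Time is dimensionless.

A, B, EPS = 0.7, 0.8, 0.08


def fhn_step(v, w, current, dt):
    """One forward-Euler step of the FitzHugh-Nagumo equations."""
    dv = v - v ** 3 / 3.0 - w + current
    dw = EPS * (v + A - B * w)
    return v + dt * dv, w + dt * dw


def resting_state(current=0.0):
    """Equilibrium (v, w) for a constant current, found by bisection.

    For B < 1 the equilibrium condition is strictly decreasing in v,
    so there is exactly one root.
    """
    def g(v):
        return v - v ** 3 / 3.0 - (v + A) / B + current

    lo, hi = -3.0, 3.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    v = 0.5 * (lo + hi)
    return v, (v + A) / B


def two_pulse_peaks(interval, amp=0.6, width=2.0, dt=0.01):
    """Peak v after each of two pulses whose onsets are `interval` apart."""
    v, w = resting_state()
    onset1, window = 5.0, 15.0
    onset2 = onset1 + interval
    peak1 = peak2 = float("-inf")
    for n in range(int((onset2 + window) / dt)):
        t = n * dt
        in_pulse = (onset1 <= t < onset1 + width) or (onset2 <= t < onset2 + width)
        v, w = fhn_step(v, w, amp if in_pulse else 0.0, dt)
        if onset1 <= t < onset1 + window:
            peak1 = max(peak1, v)
        if t >= onset2:
            peak2 = max(peak2, v)
    return peak1, peak2


v_rest, w_rest = resting_state(0.0)
v_shift, w_shift = resting_state(0.022)     # small constant positive current
first_long, second_long = two_pulse_peaks(interval=100.0)
first_short, second_short = two_pulse_peaks(interval=18.0)
```

In this sketch a small positive constant current moves the equilibrium to a slightly larger value of the slow variable, and a second pulse delivered shortly after the first response evokes a much smaller peak than one delivered after full recovery; the specific intervals are arbitrary and are not meant to correspond to the millisecond values used in the experiments.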


Fig. 6. (A) In (a), a surface view captured by the CMOS camera during optical imaging. Five dashed ellipses indicate the auditory areas: A1 (blue), AAF (red), VAF (yellow), PAF (green), and AVAF (orange). In (b1–5), in response to the stimulus of two 8-kHz tone-bursts with a notched noise (PNP), five time courses of the response at five sites of the five fields are shown. The five recording sites are indicated in (a) by lines and small circles. The time scale also shows the stimulus timing of the PT bursts and NN. Data were obtained from an animal in Sample #07. (B) Evoked response patterns to PNP. Durations (dN) of the notched noise were respectively 120, 160, and 200 ms in (a), (b), and (c). In the leftmost panel, schematic areas of the five fields, which were obtained from the tonotopic organization as in (A), are superimposed just before the PNP stimulation. The response amplitude ratios to the maximum response are color coded as indicated by the bar on the bottom of the figure. Times after the onset of the first or second tone-bursts (60 and 90 ms) are shown as numbers in ms at the top for each panel. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

In the literature, there are many families of excitable system models (e.g., see Izhikevich, 2007). Of these, we chose the FHN equations (FitzHugh, 1961) because they are one of the simplest forms, with just two variables, and are flexible enough to allow a wide range of linear and nonlinear dynamics (for a review, see Koch, 1999). Also, the FHN equations have been successfully used to model evoked activity and its dynamics in response to simple and short sound stimuli in anesthetized rats (Curto et al., 2009). In this study, however, the selection of a specific nonlinear dynamical model was not critical, because the essential feature is the inactivation that follows the initial evoked activity in excitable systems. As described in the Experimental procedures, the coupled FHN equations

were used as the population model in each iso-frequency column, and the population models were interconnected between their two nearest neighbor columns (Fig. 9Bb). To mimic the evoked population activity in response to the PGP stimuli, parameters of the FHN network model were first adjusted in response to a much simpler stimulus. Fig. 9C shows waveforms of the uncoupled FHN model (i.e., the single neural population of Eqs. (2) or (3)) in response to two successive step-like pulses with three different interpulse intervals (DI = 60, 90, and 120 ms). The amplitude of the evoked response to the second pulse with DI = 120 is similar to that of the first pulse (Fig. 9Ca). In contrast, because decreasing the interpulse interval gradually reduces the amplitude of


Fig. 7. (A) A surface view captured by the CMOS camera during optical imaging. Five dashed ellipses indicate the auditory areas: A1 (blue), AAF (red), VAF (yellow), PAF (green), and AVAF (orange). (B) In response to the three types of stimuli (PGP, PBP, and PNP) for the same interval duration (gap, d = dS, dB, or dN) of 120 ms, spatial patterns of the response ratio (R2/R1) are shown. The ratio is color coded as indicated by the bar on the top right of the panels in (B). In (C), (D), and (E), the gaps (d) were respectively 140, 160, and 180 ms. Panels in (C–E) are represented in the same way as those in (B). Data were obtained from an animal in Sample #07. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

the second response (e.g., ΔI = 90 and 60 ms in Fig. 9Cb,c), the second response is not completely a so-called “all-or-none” phenomenon. Although the FHN dynamics constitute a simplified and phenomenological model, we consider that one of the mechanisms underlying the refractoriness of the second response is synaptic depression within intracolumnar connections in each column (see Discussion), which is represented as activation of a slow w-variable in the dynamical model. To explain the characteristics of the dynamics, the three evoked response trajectories of the uncoupled FHN model and the v- and w-nullclines are illustrated on the v–w phase plane (Fig. 9Cd). By adding a small positive current (I = 0.022) from the resting state (I = 0), the v-nullcline is shifted slightly upward, so that the equilibrium point moves from F1 to F2 and the phase portrait is also changed (not shown here). Therefore, because applying a small positive current can increase the w-variable and prolong the inactivation period in the present model, longer interpulse intervals are needed to

attain a second response of comparable amplitude (cf. Fig. 9Ca). We believe that this simple explanation will be the key to understanding the dynamics of the auditory induction phenomena. Next, we consider the whole network model and focus on one iso-frequency column that is part of the interconnected network of excitatory and inhibitory neurons, with activity-dependent synaptic depression from thalamic output. In the network model, individual excitatory and inhibitory neural populations in the iso-frequency column receive input through the frequency-specific thalamic output. Fig. 10A shows how, in response to a tone-burst stimulus, excitatory and inhibitory populations in an iso-frequency column (Ch. 13 at BF = 8 kHz) in the A1 network model respond to the thalamocortical input with two different levels of excitatory intensity. Previous studies reported that in the A1 the excitatory and inhibitory components of the synaptic conductance are almost co-tuned, but that the initial excitatory conductance is consistently followed by


Fig. 8. (A) Waveforms in response to PGP, PBP, and PNP are shown for all fields: A1 (a) and AAF (b) of the core fields and VAF (c), PAF (d), and AVAF (e) of the belt fields. For all sound stimuli, the time between the two tone bursts was 160 ms (i.e., dS, dB, and dN = 160 ms). The durations of the two tone bursts are indicated by two bars, respectively. Note that for the PBP and PNP stimuli, the BN and NN were inserted between the tone bursts. (B) Amplitude ratios (R2/R1) of the second tone-burst responses (R2) to the first tone-burst responses (R1) for PGP, PBP, and PNP stimuli as a function of the time length (dS, dB, or dN) of the gap or noise of the five fields. Circles, squares, and triangles respectively represent ratios in response to PGP, PNP, and PBP. Data from two animals were averaged for a total of 90 trials per condition. Each error bar shows SEM. For PGP vs. PBP, PGP vs. PNP, and PNP vs. PBP with the gap or noise length over 100 ms, *, †, and ‡ respectively represent P < 0.05 by Bonferroni’s multiple comparison test.

inhibition with a delay of a few milliseconds (Wehr and Zador, 2003; Zhang et al., 2003; Las et al., 2005). Our model phenomenologically demonstrates these observations (last two panels from the bottom in Fig. 10Aa), because the population activity is triggered within the excitatory population and this activity subsequently recruits the inhibitory neurons of the same column. Moreover, in response to the same sound stimulation (e.g., 8-kHz tone-burst stimulus in Fig. 10Aa), the latency of the excitatory population activity after the stimulation onset depends on both the intensity of the thalamocortical input (i.e., sound intensity) and the sound frequency (Fig. 10Ab). Here, the latency was defined as the time interval from the stimulation onset to the time when the population activity crossed the threshold (60% of the maximum peak, Fig. 10Aa4,5). Furthermore, across the network, the balance between excitatory and inhibitory intensities of localized

thalamocortical input can determine three types of network states over activation threshold: locally evoked activity (LA, Fig. 10Ba1), perfectly PA (Fig. 10Ba2), and no activation (NA). In the two-parameter space of the excitatory and inhibitory intensities, Fig. 10Bb shows regions where we observed such PA in the model network. In the following analyses, we fixed the relationship between the tone-burst intensity (70 dB SPL) and the corresponding thalamocortical input intensity and used the resulting values as the default parameters indicated by “+” in Fig. 10Bb. In response to 80 repetitions of PGP and PBP stimulation, the output waveforms of the averaged energy measure (s(ei) for i = 1, ..., 32) are shown in Fig. 11Aa, Ba, respectively. In the stimuli, a sinusoidal signal at 8 kHz was a major component, so that large transient increases were seen twice around several columns centered at column 13 (BF of 8 kHz)


Fig. 9. (A) Simple block diagram of the computational model of three processing stages. The sound is processed from left to right. The first stage is based on the physical and psychophysical properties of the auditory periphery. After the sound signal is filtered by a bandpass filter emulating the outer and middle ears, the output is subsequently filtered by a bank of bandpass filters mimicking cochlear functions (i.e., GFs). Next, the individual energy measure of each filter is calculated as the output. This energy measure gives a spatial representation of the sound spectra in terms of the GF bank. After calculating the static compression of the energy measure for the k-th A1 column at a center (best) frequency f, weighted averages for the components at a center (best) frequency gi (i = 1, ..., n) calculated in the subcortical system are determined using weights Hfgi and Kfgi, where subscript i depends on the tuning curve bandwidth of cells in the A1 column (see Appendix A). As the result of thalamocortical projections, the weighted averages are obtained for the input into excitatory and inhibitory neural populations in the cortical iso-frequency column (see text for further detail). (B) In (a), schematic connections between excitatory and inhibitory neural populations within one iso-frequency column (column k) are shown. Subscripts E and I represent “excitatory” and “inhibitory,” respectively. In the connections within the excitatory and inhibitory populations, weights of the self-connections are represented by J0EE and J0II, respectively; across the individual populations, those of the counter-connections are J0EI and J0IE. In (b), intercolumnar connections between the nearest and second nearest excitatory and inhibitory neural populations are schematically illustrated. Individual populations in one iso-frequency column around column k are interconnected with those in the two nearest columns on both sides (columns k − 2, k − 1, k + 1, and k + 2).
(C) Dynamics of the uncoupled FHN model. In (a–c), temporal responses of an uncoupled neural population in one column are shown in response to step-like pulses (50-ms pulse width) with three different interpulse intervals (ΔI = 120, 90, and 60 ms), respectively. The intensity of the first pulse was large enough to induce suprathreshold activation. In (d), a phase plane explaining (a–c). Applying a small current step (I = 0.022) causes an upward shift of the v-nullcline and changes the refractoriness of the second response, because the inactivation variable (w) is increased. The fixed points F1 and F2 are stable nodes (the resting states). The points P, Q, and R correspond to the time points indicated by the arrows in (a), (b), and (c), respectively.
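The paired-pulse protocol of panel (C) can be illustrated with a generic FitzHugh–Nagumo unit. The following is a minimal sketch only: the constants `A`, `B`, `EPS`, the pulse amplitude and width, and the dimensionless time axis are textbook-style assumptions and are not the parameter values fitted in this study, and the interval is measured from pulse onset to pulse onset for simplicity.

```python
# Paired-pulse sketch with a generic FitzHugh-Nagumo unit (forward Euler).
# All constants are illustrative assumptions, NOT the values used in the paper.
# Time is dimensionless here; it is not calibrated to milliseconds.

A, B, EPS = 0.7, 0.8, 0.08        # assumed FHN constants
DT = 0.01                         # Euler step
T_FIRST = 20.0                    # onset time of the first pulse
PULSE_WIDTH, PULSE_AMP = 1.0, 1.0 # assumed step-like pulse


def simulate(interval, bias=0.0, t_end=200.0):
    """Return the fast variable v(t) for two identical pulses.

    `interval` is the onset-to-onset separation of the pulses and `bias`
    is a constant background current added throughout the run.
    """
    v, w = -1.2, -0.625           # close to the resting state for bias = 0
    trace = []
    for k in range(int(round(t_end / DT))):
        t = k * DT
        first = T_FIRST <= t < T_FIRST + PULSE_WIDTH
        second = T_FIRST + interval <= t < T_FIRST + interval + PULSE_WIDTH
        current = bias + (PULSE_AMP if (first or second) else 0.0)
        dv = v - v ** 3 / 3.0 - w + current   # fast, self-exciting variable
        dw = EPS * (v + A - B * w)            # slow inactivation variable
        v += DT * dv
        w += DT * dw
        trace.append(v)
    return trace


def second_to_first_ratio(interval, window=15.0, bias=0.0):
    """Peak of the second response divided by the peak of the first.

    Peaks are measured relative to the value of v just before the first pulse.
    """
    trace = simulate(interval, bias)
    rest = trace[int(round(T_FIRST / DT)) - 1]

    def peak(t0):
        i0 = int(round(t0 / DT))
        i1 = int(round((t0 + window) / DT))
        return max(trace[i0:i1]) - rest

    return peak(T_FIRST + interval) / peak(T_FIRST)


if __name__ == "__main__":
    for interval in (22.0, 60.0, 120.0):
        print(interval, round(second_to_first_ratio(interval), 3))
```

With these assumed constants, a second pulse delivered while the slow variable w is still elevated should evoke a much smaller response than one delivered after w has relaxed, which is the qualitative behavior the caption describes. Passing a small positive `bias` is one way to explore how a background current changes the recovery time in such a sketch.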

during two successive tone bursts. In addition, since other frequency components below 16 kHz were also included during intervals between the tone bursts in the PBP

stimulus, irregular increases of s(ei) were seen (Fig. 11Ba). In particular, because of the spectral characteristics of the two filters of the periphery (i.e., outer


Fig. 10. (A) Response properties of one column (or unit) in the network model, whose BF is 8 kHz. In (a1), an 8-kHz tone-burst with a short duration (50 ms) is illustrated, with the activity patterns propagating from the periphery (a2) to the A1 subsystem (a4 and a5). In (a2), in response to tone-burst stimuli with two different sound pressure levels, two waveforms of energy measures (e13) at unit 13 (BF = 8.0 kHz), corresponding to the two responses of different excitatory intensities to thalamocortical input (for IE = 10.0, thin line, and for IE = 12.0, thick line in a3), are shown. For all simulations, the inhibitory intensity value was fixed at II = 6.0. In an array of A1 columns, the intensity of thalamocortical input is represented as the latency of the responses in excitatory (thick lines) and inhibitory (thin lines) neural populations; in (a4) and (a5), responses are shown for IE = 12.0 and IE = 10.0, respectively. To calculate the latency of the activity, the activation time of each excitatory population was defined as the time when the population variable first reached a threshold (60% of the maximum peak in the spike-like activity). In (b), relationships between latency to the tone-burst input and the excitatory intensity of the thalamocortical input are shown for excitatory neural populations at columns 10 (BF = 6.1 kHz), 13 (BF = 8.0 kHz), and 17 (BF = 10.4 kHz). Because of activity propagation from column 13 to other columns, the response latency of other columns is longer than that of column 13. (B) Responses of the A1 network model and dependency on the two parameters from the thalamic input. In (a1,2), two raster plots illustrate LA and PA of the array of A1 columns, respectively, in response to thalamocortical input evoked by the 8-kHz tone burst. In (b), a two-parameter bifurcation diagram of local and global activity in the array is illustrated. The two parameters are the excitatory and inhibitory intensities of thalamocortical input.
White, light gray, and dark gray areas show NA, LA, and PA subregions, respectively. In addition, the parameter values of the thalamocortical input for (a1) and (a2) are respectively indicated as triangle (▲) and plus (+) marks in the diagram.
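The latency definition used for panel (A), the time from stimulation onset until the population variable first reaches 60% of its maximum peak, can be sketched as follows. The function name `onset_latency`, the list-based trace, and the handling of traces without a positive peak are hypothetical choices made for illustration.

```python
def onset_latency(trace, dt_ms, stim_onset_index, fraction=0.6):
    """Latency (ms) from stimulus onset to the first threshold crossing.

    The threshold is `fraction` times the maximum of the post-onset trace,
    following the 60%-of-peak criterion described in the caption.
    Returns None when the trace has no positive peak after onset.
    """
    post = trace[stim_onset_index:]
    if not post:
        return None
    peak = max(post)
    if peak <= 0:
        return None
    threshold = fraction * peak
    for k, value in enumerate(post):
        if value >= threshold:
            return k * dt_ms
    return None


# Example: samples every 1 ms, stimulus onset at index 1.
example = [0.0, 0.0, 0.1, 0.5, 0.7, 1.0, 0.4]
print(onset_latency(example, dt_ms=1.0, stim_onset_index=1))
```

Because the threshold is tied to the peak of each response, this measure depends on how steeply the activity rises after onset and not on its absolute size.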

and middle ears), higher frequency components (>8 kHz) were more dominant than lower frequency components. Moreover, during the interval between two successive tone bursts, the averaged energy measure of frequency components around 8 kHz to the PNP stimulus was smaller than that to the PBP stimulus, although higher frequency components to the PNP stimulus existed to some extent as well as the PGP stimulus (data not shown).

Similarly, in response to 80 repetitions of the PGP stimuli with different inter-burst intervals (dG = 100, 140, 180, and 220 ms), Fig. 11Ab illustrates color (gray) map plots of average A1 network activity, showing the spread of average A1 activity patterns over the array of the 32 columns. When dG was less than 100 ms, the network activity of the second response tended to be suppressed at several columns with lower BFs, a pattern that reflected the high-pass filter properties of the


periphery (Fig. 11Ab1). If dG was over 140 ms, however, PA was often seen (Fig. 11Ab2–4). In contrast, in response to a PBP stimulus with dB = 100 ms, the responses to the second tone-bursts were completely suppressed (Fig. 11Bb1). When the inter-burst interval

was increased to dB = 140 or 180 ms (Fig. 11Bb2,3), LA appeared. LA was often seen with dB = 180 ms, especially at columns with higher BFs, although PA occasionally occurred (Fig. 11Bb3). Furthermore, after prolonging the interval dB to 220 ms, PA over the whole


network was almost always observed (Fig. 11Bb4). In response to the PNP stimulus, LA mostly appeared in the network when dN was not less than 140 ms. On the other hand, when dN was over 160 ms in the corresponding stimulus, PA often occurred (data not shown). All the simulation results are summarized in Fig. 12 using local and global response ratios (Rlocal and Rglobal). Briefly, in response to PBP stimulation with conditions in which dB ≤ 180 ms, the local and global second responses were suppressed with high probability, so that the local and global response ratios were profoundly reduced, especially in the global response propagation (Rglobal < 40%). In contrast, PGP stimulation hardly induced the second response suppression (Rglobal ≥ 80%) over the network. In the locally evoked second responses during the middle range of interburst intervals (140 ≤ d ≤ 180 ms) of PNP and PBP stimuli, the columns with higher BFs over 8 kHz were more likely to be activated than the columns with lower BFs. Hence, relationships between global response ratios as a function of the interval were characterized by greater nonlinearity than local response ratios.
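A response ratio of this kind can be sketched as the activity integrated over a window after the second tone burst divided by the activity integrated over a window after the first. This is only an assumed form: the exact definitions of Rlocal and Rglobal are given in the Experimental procedures, and the window boundaries and function names below are hypothetical.

```python
def integrated_response(activity, dt_ms, start_ms, stop_ms):
    """Sum of the activity samples in [start_ms, stop_ms), scaled by the bin size."""
    i0 = int(round(start_ms / dt_ms))
    i1 = int(round(stop_ms / dt_ms))
    return sum(activity[i0:i1]) * dt_ms


def response_ratio(activity, dt_ms, first_window, second_window):
    """Ratio of the second integrated response to the first.

    Each window is a (start_ms, stop_ms) pair; the windows are assumptions
    of this sketch, not values taken from the paper.
    """
    r1 = integrated_response(activity, dt_ms, *first_window)
    r2 = integrated_response(activity, dt_ms, *second_window)
    if r1 == 0:
        raise ValueError("first response is zero; ratio is undefined")
    return r2 / r1


# Example: binned activity at 1 ms per bin with a smaller second response.
binned = [0, 2, 2, 0, 0, 1, 1, 0]
print(response_ratio(binned, 1.0, (0, 4), (4, 8)))
```

Applying the same function to the activity of a few columns around the BF column, or to the activity summed over all columns, would give local and global variants under this assumed definition.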

DISCUSSION

Tonotopic organization and auditory fields in rats

Using optical imaging of a voltage-sensitive dye, we recorded the spatiotemporal activity of the rat AC. On the basis of the combined criteria of tonotopic or nontonotopic organization and response latency, we identified five fields in the rat AC. However, this number is itself controversial, ranging from estimates of as low as two or three (Sally and Kelly, 1988) to as many as five (Hosokawa et al., 1998; Rutkowski et al., 2003; Kalatsky et al., 2005; Polley et al., 2007). Also, the exact locations of specific fields and their names differ among studies (for a review, see Kaas, 2011). Tonal receptive fields in core areas of the AC were characterized for two fields, the A1 and AAF (Rutkowski et al., 2003), or three fields, the A1, AAF, and PAF (Horikawa et al., 1988; Sally and Kelly, 1988; Doron et al., 2002). In this study, the criterion used to define a core area was the existence of clear tonotopic organization with respect to the fields. This is consistent with the approach of Sally and Kelly, who reported that electrophysiological recordings within the A1 described well-tuned, short-latency responses that exhibited an orderly

Fig. 12. Summary of simulation results for local and global response properties in the array of the cortical columns. (A) Local response ratios (Rlocal) are illustrated in response to PGP, PNP, and PBP stimulation with different interburst intervals (inserted gap or noise length). For A1 columns 12, 13, and 14, Rlocal is the integrated ratio between the response activity to the second tone-burst and the response activity to the first tone-burst. The bin size (Δt) was set at 3 ms. See Experimental procedures for more details. (B) Similarly, global response ratios (Rglobal) are illustrated. The bin size (Δt) was 25 ms. Nonlinearity in relationships between Rglobal and the interval was larger than that between Rlocal and the interval. In (A) and (B), *, †, and ‡ respectively represent P < 0.05 for PGP vs. PBP, PGP vs. PNP, and PNP vs. PBP by Bonferroni’s multiple comparison test.

tonotopic progression of characteristic frequency (CF) from low (1 kHz) to high (60 kHz) along a posterior-to-anterior gradient (Sally and Kelly, 1988). Our results regarding rat A1 tonotopic organization also agreed with those of another previous study using electrophysiological recording (Doron et al., 2002) and one that employed optical imaging (Kalatsky et al., 2005). In addition to the A1, the PAF and AAF have been mapped using microelectrode recordings (Doron et al., 2002; Rutkowski et al., 2003; Kalatsky et al., 2005;

Fig. 11. Simulation of evoked responses in the subcortical unit array and the cortical column array to PGP and PBP stimulation. (A) In (a), in response to 80 trials of PGP stimuli with an interburst interval (dG = 100 ms), the average energy measures (ei) over all 32 subcortical units are illustrated. The duration of tone bursts was 50 ms, and the tone frequency was 8 kHz. The BF of unit 13 was 8 kHz, so the largest changes are at unit 13 and there are smaller changes at the surrounding units during tone-burst stimuli. In (b1–4), for four interburst intervals (dG = 100, 140, 180, and 220 ms), upper panels are color (gray) map plots of activation-time histograms for 80 trials, while lower panels are the corresponding histograms that were obtained by summing count data across all A1 columns at each time point with a 1-ms bin width (i.e., Δt = 1 ms). The activation time is defined as the time point when an excitatory neural population variable (vE,j) at column j first reaches the threshold (i.e., 60% of the following peak value in the spike-like activity; see Experimental Procedures and Fig. 10Aa). (B) In (a), similarly, in response to PBP stimuli with an interburst interval (dB = 100 ms), average energy measures (ei) across all 32 subcortical units are shown for 80 trials. In (b1–4), as with the panels in Ab1–4, the upper and lower panels are color (gray) map plots, and the corresponding histograms are shown for four interburst intervals (dB = 100, 140, 180, and 220 ms).
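The pooled histograms in the lower panels can be sketched as a simple binning of activation times collected over trials and columns. The function name, the pooled input list, and the choice to discard times outside the analysis window are assumptions made for illustration.

```python
import math


def activation_time_histogram(activation_times_ms, t_stop_ms, bin_ms=1.0):
    """Count activation times per bin over [0, t_stop_ms).

    `activation_times_ms` is assumed to hold the activation times of all
    columns and all trials pooled into one list; times outside the window
    are ignored.
    """
    n_bins = int(math.ceil(t_stop_ms / bin_ms))
    counts = [0] * n_bins
    for t in activation_times_ms:
        if 0 <= t < t_stop_ms:
            index = min(int(t // bin_ms), n_bins - 1)  # guard against rounding
            counts[index] += 1
    return counts


# Example: five activation times, 1-ms bins, 4-ms analysis window.
print(activation_time_histogram([0.2, 0.7, 1.5, 3.0, 5.0], t_stop_ms=4.0))
```

Keeping the counts per column instead of pooling them would give the data for the map plots in the upper panels under the same assumptions.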


Polley et al., 2007) and optical imaging (Kalatsky et al., 2005). The tonotopic gradient generally reverses at the posterior and anterior borders of the A1 to form boundaries with the PAF and AAF, respectively. Thus, in one approach to field classification, these three fields at most lie within the architectonically defined AC core (Doron et al., 2002) and receive input from the vMGB into the middle cortical layers (Ryugo and Killacke, 1974; Horikawa et al., 1988; Roger and Arnault, 1989; Clerici and Coleman, 1990; Romanski and LeDoux, 1993). In the AAF, our imaging characterized the tonotopic organization in an anterior-to-posterior progression from low to high frequencies, creating a reversal of the tonotopic organization found in the A1. The result was consistent with previous optical imaging studies (Kalatsky et al., 2005). In the PAF, however, our optical imaging results showed less evidence of a tonotopic gradient, as did previous electrophysiological studies (Polley et al., 2007; Pandya et al., 2008). Based on fine electrophysiological recordings, Doron et al. in particular reported that the PAF was tonotopically organized in an anterior-to-posterior progression from low to high frequencies, creating a reversal of the tonotopic organization in the A1 (Doron et al., 2002). Also, they showed that a significant number of neurons in the PAF were unresponsive to PTs but responsive to BN, so that a limited tonotopicity could only be obtained by mapping the center frequency of a band-passed noise stimulus (Doron et al., 2002). Moreover, A1 and AAF responses had short latencies, and the latencies we found in these two fields by optical imaging were similar. In contrast, our results showed that the responses to PTs in the PAF had longer latencies than those in the A1 and AAF. These results agree with previous studies using electrophysiological recordings (Doron et al., 2002; Rutkowski et al., 2003) and optical imaging (Kalatsky et al., 2005).
Thus, these findings may confirm that rats have at least two primary fields, the A1 and AAF; the PAF has some features of a core area, but somewhat longer response latencies and less pronounced tonotopy (Kaas, 2011). As suggested by Lee et al. (2004), the A1 and AAF in the rat are likely to function as parallel processing streams within the AC. In contrast, the rat PAF is likely to function as the second stage of a serial processing stream, with the A1 and AAF comprising the first stage. A recent optical imaging study described the existence of two additional auditory areas ventral to the A1: the VAF and AVAF (Kalatsky et al., 2005). In an electrophysiological mapping study, Polley et al. (2007) also characterized the receptive field organization within the region described as the AVAF by Kalatsky et al. (2005), and renamed it the suprarhinal auditory field. Although the VAF and AVAF appear to be tonotopically organized, this organization was not clearly observed by our optical imaging. Additionally, we found that the response latency in the VAF and AVAF was longer than that in the A1 and AAF. Thus, latency measurements in the five fields support the following understanding of auditory processing streams. Long-latency responses in belt neurons may indicate serial processing from the core to the belt fields, whereas short-latency responses may indicate parallel

processing. That is, the activity first spreads into the core fields A1 and AAF, and then propagates into two different streams, one leading to the PAF and the other to the VAF and AVAF. Thus, our results suggest that the information from the core fields spreads into the surrounding belt areas via two different pathways, as observed in other species (Horikawa et al., 2001; Tsytsarev et al., 2004). In optical imaging, voltage-sensitive dye signals are mainly the result of dendritic membrane potential changes of populations of neuronal processes (Arieli et al., 1995; Shoham et al., 1999), and they reflect synaptic input and output activity. Therefore, the dye signals represent the whole distributed cortical network that is activated by a given stimulus (e.g., PT bursts). Uncolored areas between the fields in Fig. 2f show at least weak (below our threshold level) responses to PT bursts in the frequency range between 2 and 16 kHz. These areas might respond better to other frequencies or to other sounds like noise or complex sounds. Thus, there exists some discrepancy between the results obtained in response to PT bursts from electrophysiological recording and optical imaging, which is a drawback of the present study (Sally and Kelly, 1988; Kalatsky et al., 2005).

Neural activity related to auditory induction

A sound that is interrupted by a period of silence is perceived as discontinuous. However, when the silence is replaced by noise, the target sound may be heard as uninterrupted. Warren et al. (1972) investigated how the frequency (spectral) content of noise affected temporal induction in humans. The current view holds that the gap induces sudden energy changes in the frequency channel of the interrupted sound (target), and these changes are therefore perceptually interpreted as target off- and on-sets (Warren and Bashford, 1999).
When band-passed noise was introduced with a frequency range outside the frequency of a foreground tone, induction was less pronounced than when noise energy was present at the frequency of the tone (Petkov et al., 2003; Kubota et al., 2012). Similarly, when a noise contained a spectral notch that was centered at the frequency of the foreground tone, induction thresholds for the tone were reduced, and induction still occurred at some noise intensity (Petkov et al., 2003; Kubota et al., 2012). Therefore, sideband components centered at the tone frequency can contribute to auditory induction, because spectral integration between different frequency channels is performed across the sound spectrum. In the present study, PGP stimuli elicited no second responses in the rat core fields when the length of a silent gap was less than 100 ms. This result is consistent with previous studies on anesthetized cats with minimum early gap values of approximately 45 ms (Eggermont, 1999), on anesthetized guinea pigs with 60 ms (Kubota et al., 2012), and on awake Macaque monkeys with 26.5 ms (Petkov et al., 2003), although the minimum early gap values were different among these animal species. Moreover, because PBP stimuli were considered likely to cause auditory induction, in this study the stimuli prolonged the minimum gap length at which the second responses were elicited, and considerably reduced the


second responses in amplitude in all fields. In contrast, PNP stimuli were considered likely to cause a similar effect as PGP stimuli. In practice, when band-passed noise was introduced with a frequency range outside of the frequency of a foreground tone (i.e., NN), the induction was less pronounced than when noise energy was present at the frequency of the tone. Thus, although the PNP stimuli slightly decreased the strength of auditory induction and the second responses were reduced in amplitude, they elicited responses similar in shape to the responses to PGP stimuli. To the best of our knowledge, these results related to auditory induction are the first regarding rat neural properties. The present results are also consistent with previous studies in other animal species in which population recordings of neural activity were conducted in response to sound stimuli for auditory induction (Eggermont, 1999; Petkov et al., 2003; Kubota et al., 2012). In our study, the above results revealed that second responses were suppressed by the noise-inserted sound. Also, the degree of suppression increased as the spectral overlap in frequency between the target sounds and noise increased, or as inter-burst intervals of the foreground tone burst decreased. One of the cellular mechanisms involved in suppression of successive responses by a preceding stimulus is membrane hyperpolarization (Eggermont, 2000). However, membrane hyperpolarization following depolarization was not more pronounced when noise was inserted into the silent gap than when the silent gap was applied alone (Figs. 3A, 4A). Therefore, changes in membrane potentials in AC neurons appear to be insufficient to account for the large reduction in the second responses to noise-inserted tones.
Furthermore, other mechanisms of suppression such as synaptic depression (forward suppression) in (i) thalamocortical synaptic projections into the AC (Eggermont, 1999; Wehr and Zador, 2005) and/or in (ii) intracortical excitatory connections are likely to contribute to the reduction in the second responses when noise is inserted into the silent gap. In fact, in thalamocortical synapses onto layer 4 in the mouse visual cortex, the time constants of synaptic depression were respectively around 150 and 300 ms for fast-spiking and pyramidal neurons in response to a repetitive pulse train at 10 Hz (Kloc and Maffei, 2014). In synapses from regular-spiking cells to fast-spiking cells in layer 4 of the rat somatosensory cortex, the time constant of synaptic depression was around 108 ms in response to a repetitive action-potential train at 40 Hz (Beierlein et al., 2003). Also, in layer 5 pyramidal neurons from the rat somatosensory cortex, the time constant of synaptic depression was around 450 ms (Tsodyks and Markram, 1997). Wehr and Zador reported that in response to pairs of clicks (5-ms white-noise bursts) and a silent gap, although postsynaptic inhibition in the rat AC contributed to forward suppression (also known as forward masking) for only the first 50–100 ms after the first click, intracortical contributions to long-lasting (100–300 ms) suppression mainly involved synaptic depression (Wehr and Zador, 2005). Therefore, synaptic depression in thalamocortical and/or intracortical connections can profoundly


contribute to second response suppression induced by noise-inserted sounds in a frequency-component-dependent manner. However, the exact neuronal mechanisms underlying auditory induction are still unclear, especially regarding the relationship between auditory induction and sound frequency components at the BF of the corresponding column.

Possible mechanisms underlying the reduction of the second response

The AC is a complex circuit that dynamically integrates incoming thalamic signals with feedback from a recent history of activation through intracortical recurrent connections. In this study, using a simple computational model, we demonstrated that nonlinear dynamics of evoked response activity in synapses and neural populations may be responsible for auditory induction, depending on the sound spectrum. Here, with respect to auditory induction, we address how the integration of information about the stimulus spectrum from thalamocortical inputs and intracortical recurrent inputs is organized in time and space, especially across the synaptic populations extending over the tonotopic columns of cortical excitatory and inhibitory neurons. To do this, we proposed a simple computational model that includes phenomenological linear filters in the periphery and a nonlinear dynamical system in the AC. In addition, we hypothesized that suppression of the second responses induced by PBP stimuli was caused by two mechanisms: (i) synaptic depression in thalamocortical and/or intracortical excitatory connections and (ii) nonlinear dynamics of intracolumnar neural populations driven by synaptic input on the basis of sound spectral integration. The synaptic depression mechanism in the thalamocortical connections was explicitly incorporated into the model, whereas that in the intracolumnar synaptic connections in each neural population was considered to be implicitly expressed in the nonlinear dynamics of individual FHN models through w-variables.
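To illustrate how a depression time constant of the order quoted above translates into suppression of a second response, the following sketch uses a generic depression-only synapse in which each event consumes a fixed fraction of an available resource that then recovers exponentially. This is not necessarily the form implemented in the present model; the function name and the `use_fraction` value are hypothetical, and only the time constants echo the cited literature.

```python
import math


def paired_pulse_ratio(interval_ms, tau_rec_ms, use_fraction):
    """Second/first response ratio for a depression-only synapse.

    The first event reduces the available resource from 1 to 1 - U, where
    U is `use_fraction`. The resource then recovers toward 1 with time
    constant `tau_rec_ms`, so the ratio is 1 - U * exp(-interval / tau).
    Facilitation is ignored in this sketch.
    """
    return 1.0 - use_fraction * math.exp(-interval_ms / tau_rec_ms)


if __name__ == "__main__":
    # Time constants of the order reported in the cited studies (ms);
    # the use fraction of 0.5 is an arbitrary illustrative value.
    for tau in (150.0, 300.0, 450.0):
        ratios = [round(paired_pulse_ratio(d, tau, 0.5), 2)
                  for d in (100, 140, 180, 220)]
        print(tau, ratios)
```

In this simplified picture, longer recovery time constants keep the second response suppressed over a wider range of intervals, which is in line with the qualitative argument made in the text.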
In the medial geniculate body (MGB) and A1 in unanaesthetized guinea pigs, Creutzfeldt et al. analyzed synaptically connected neurons and reported that the onset responses of A1 neurons to PTs were much more phasic and frequently transitory than those of the corresponding neurons in the MGB (Creutzfeldt et al., 1980). In addition, Philibert et al. compared the responses of auditory thalamus neurons in the guinea pig and rat to animal vocalizations, and found that the thalamic neurons of the two animal species displayed similar responses (Philibert et al., 2005). Therefore, the synaptic input into the A1 from the MGB is shaped to some extent, the characteristic responses in the A1 are thereafter formed, and A1-specific properties are enhanced and extracted through intracortical connections. For this reason, we assumed that thalamocortical projections and the population dynamics of intracortical connections in the AC indicate that this area is one of the key locations related to auditory induction. Cortical activity generally holds either a desynchronized or synchronized state. Previous studies reported


that under anesthesia, the cortex usually operates in the synchronized state, although under some anesthetics such as urethane, desynchronized periods may occur spontaneously or be induced by electrical stimulation of some subcortical areas (Moruzzi and Magoun, 1949; Dringenberg et al., 2003). Moreover, Curto et al. reported that population responses to simple sound stimuli in the anesthetized rat AC could be approximated by a simple dynamical system model, using the FHN equations with parameter fitting from multiunit activity data for both desynchronized and synchronized states (Curto et al., 2009). Their results also indicated that collective cortical dynamics have an effective low-dimensional description for their data of auditory cortical population activity during the two states. The interpretation of their results can be easily extended to our own; that is, self-excitation arises from intracortical recurrent excitation, counterbalanced by a build-up of adaptive processes such as synaptic depression of intracortical excitatory connections and/or potassium channel activation, modeled by the w-variables in the coupled FHN model (Curto et al., 2009). Before sound stimulation, w-variables are small enough, and the stimulation triggers a rapid increase in v-variables because of self-excitation. After a while, this leads to prolonged activity, until w-variables have increased enough to damp down the network’s excitability. Furthermore, in response to the second tone-burst, w-variables of the column(s) at the BF for PT influence the corresponding v-variables according to the history of the sound components integrated over the surrounding columns during the time interval between the first and second tone-bursts. More specifically, BN between the tone-bursts (i.e., PBP stimuli) shifted the v-nullclines at the BF column(s) upward in the phase planes (Fig.
9Cd), so that the refractoriness of the response activity was more prolonged compared to PTs separated by complete silence (i.e., PGP stimuli). Thus, the simple network model including recurrent excitation and network adaptation through synaptic depression may explain some characteristic features of auditory induction in the population dynamics of A1. Recent modeling studies also indicated that models that include recurrent excitation and synaptic depression were able to reproduce some properties of the AC in response to sound stimuli (Loebel et al., 2007; Levy and Reyes, 2011).

The neural network model that we presented in this study does not attempt to capture all of the details involved in auditory processing. Our model focused particularly on layers 2/3 and 4 because these are the input layers in the A1 from the thalamus, and a good starting point for our computational study. In addition, dynamic changes in the strength of synapses, such as facilitation and depression, have not been characterized well enough to quantitatively model the system in the AC (Atzori et al., 2001). Experimental studies characterizing the temporal dynamics of AC synapses will be very valuable for understanding the mechanisms of auditory induction, and will be one of our future research targets. Additional work that is based on the current experimental results should also attempt to build a more realistic model where detailed dynamic changes in synapses are included.
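The interplay between the fast and slow variables described above can be illustrated with a minimal sketch of a single FitzHugh–Nagumo unit driven by two identical pulses. This is not the coupled column model of Eqs. (4a) and (4b): the parameter values (`A`, `B`, `EPS`), the pulse amplitude and width, the dimensionless time axis, and the helper names `simulate` and `second_response` are all assumptions chosen for illustration from the classical textbook form of the equations. The sketch only shows that a second pulse arriving while the slow variable w is still elevated evokes a much smaller rise in v than the same pulse after a long gap.

```python
# Classical FitzHugh-Nagumo parameters (textbook values, assumed for
# illustration; NOT the fitted parameters of the model in this paper).
A, B, EPS = 0.7, 0.8, 0.08


def simulate(gap, amp=0.5, width=2.0, t_on=20.0, t_end=200.0, dt=0.01):
    """Forward-Euler integration of one FHN unit driven by two equal pulses.

    gap is the silent interval between the offset of the first pulse and
    the onset of the second one (dimensionless model time).
    """
    second_on = t_on + width + gap
    n = int(round(t_end / dt))
    t, v, w = [0.0] * n, [0.0] * n, [0.0] * n
    v[0], w[0] = -1.2, -0.625  # close to the resting state without input
    for i in range(n - 1):
        ti = i * dt
        in_first = t_on <= ti < t_on + width
        in_second = second_on <= ti < second_on + width
        stim = amp if (in_first or in_second) else 0.0
        dv = v[i] - v[i] ** 3 / 3.0 - w[i] + stim  # fast, self-exciting variable
        dw = EPS * (v[i] + A - B * w[i])           # slow, adapting variable
        v[i + 1] = v[i] + dt * dv
        w[i + 1] = w[i] + dt * dw
        t[i + 1] = ti + dt
    return t, v, w, second_on


def second_response(gap, window=10.0, dt=0.01):
    """Return (rise of v after the second pulse onset, w at that onset)."""
    _, v, w, second_on = simulate(gap, dt=dt)
    i0 = int(round(second_on / dt))
    i1 = int(round((second_on + window) / dt))
    return max(v[i0:i1]) - v[i0], w[i0]


if __name__ == "__main__":
    for gap in (15.0, 100.0):
        rise, w_on = second_response(gap)
        print(f"gap={gap:5.1f}  w at onset={w_on:+.2f}  rise of v={rise:.2f}")
```

In this toy setting, the only thing that differs between the two runs is the value of w at the second pulse onset, which is the sense in which the slow variable carries the stimulus history.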

CONCLUSIONS

To the best of our knowledge, this is the first study to characterize neural population responses to sounds associated with auditory induction across auditory cortical fields in the rat AC. Our optical imaging results showed that first, tone-burst stimuli interrupted by a silent gap elicited early phasic responses to the first tone and similar or smaller responses to the second tone following the gap. Second, tone-burst stimuli interrupted by BN profoundly suppressed responses to the tone following the noise. Third, tone-burst stimuli interrupted by NN centered at the tone frequency partially restored the second responses from the suppression caused by BN. To explain the underlying mechanisms, we constructed a computational A1 model that included a nonlinear dynamical system consisting in part of relatively fast and slow variables, and the model was able to reproduce several of the experimentally obtained results. Our computational simulation suggests that sideband components of the sound signal centered at the tone frequency can contribute to activating the slow dynamics of the neural population at the A1 column with the BF, as well as its surroundings, so that the second responses may be suppressed by the BN-inserted stimuli through the hidden slow dynamics. Hence, understanding these mechanisms in animals can in turn offer insights into how human brains perceive sound illusions.

Acknowledgments—We thank Professor Wen-Jie Song (Kumamoto University) for his kind support. In this work, T.T. was supported by the Akiyama Life Science Foundation (Japan), Nakatani Foundation (Japan), Magnetic Health Science Foundation (Japan), Tateishi Science and Technology Foundation (Japan), and a Grant-in-Aid for Scientific Research (B) (No. 15H02772) and Exploratory Research (No. 15K12091) (Japan).

REFERENCES

Adesnik H, Scanziani M (2010) Lateral competition for cortical space by layer-specific horizontal circuits. Nature 464:1155–1160.
Arieli A, Shoham D, Hildesheim R, Grinvald A (1995) Coherent spatiotemporal patterns of ongoing activity revealed by real-time optical imaging coupled with single-unit recording in the cat visual cortex. J Neurophysiol 73:2072–2093.
Atzori M, Lei S, Evans DI, Kanold PO, Phillips-Tansey E, McIntyre O, McBain CJ (2001) Differential synaptic processing separates stationary from transient inputs to the auditory cortex. Nat Neurosci 4:1230–1237.
Bashford Jr JA, Warren RM (1987) Multiple phonemic restorations follow the rules for auditory induction. Percept Psychophys 42:114–121.
Beierlein M, Gibson JR, Connors BW (2003) Two dynamically distinct inhibitory networks in layer 4 of the neocortex. J Neurophysiol 90:2987–3000.
Braaten RE, Leary JC (1999) Temporal induction of missing birdsong segments in European starlings. Psychol Sci 10:162–166.
Calabrese A, Schumacher JW, Schneider DM, Paninski L, Woolley SM (2011) A generalized linear model for estimating spectrotemporal receptive fields from responses to natural sounds. PLoS One 6:e16104.
Christianson GB, Sahani M, Linden JF (2008) The consequences of response nonlinearities for interpretation of spectrotemporal receptive fields. J Neurosci 28:446–455.
Clerici WJ, Coleman JR (1990) Anatomy of the rat medial geniculate body. 1. Cytoarchitecture, myeloarchitecture, and neocortical connectivity. J Comp Neurol 297:14–31.
Cohen MA, Grossberg S, Wyse LL (1995) A spectral network model of pitch perception. J Acoust Soc Am 98:862–879.
Creutzfeldt O, Hellweg FC, Schreiner C (1980) Thalamocortical transformation of responses to complex auditory stimuli. Exp Brain Res 39:87–104.
Curto C, Sakata S, Marguet S, Itskov V, Harris KD (2009) A simple model of cortical dynamics explains variability and state dependence of sensory responses in urethane-anesthetized auditory cortex. J Neurosci 29:10600–10612.
de la Rocha J, Marchetti C, Schiff M, Reyes AD (2008) Linking the response properties of cells in auditory cortex with network architecture: co-tuning versus lateral inhibition. J Neurosci 28(37):9151–9163.
Destexhe A, Rudolph M, Fellous JM, Sejnowski TJ (2001) Fluctuating synaptic conductances recreate in vivo-like activity in neocortical neurons. Neuroscience 107:13–24.
Doron NN, Ledoux JE, Semple MN (2002) Redefining the tonotopic core of rat auditory cortex: physiological evidence for a posterior field. J Comp Neurol 453:345–360.
Dringenberg HC, Vanderwolf CH, Noseworthy PA (2003) Superior colliculus stimulation enhances neocortical serotonin release and electrocorticographic activation in the urethane-anesthetized rat. Brain Res 964:31–41.
Eggermont JJ (1999) Neural correlates of gap detection in three auditory cortical fields in the cat. J Neurophysiol 81:2570–2581.
Eggermont JJ (2000) Neural responses in primary auditory cortex mimic psychophysical, across-frequency-channel, gap-detection thresholds. J Neurophysiol 84:1453–1463.
Ermentrout GB, Terman DH (2010) Synaptic channels. In: Mathematical foundations of neuroscience. New York: Springer. p. 157–170.
FitzHugh R (1961) Impulses and physiological states in theoretical models of nerve membrane. Biophys J 1:445–466.
Froemke RC, Merzenich MM, Schreiner CE (2007) A synaptic memory trace for cortical receptive field plasticity. Nature 450:425–429.
Glasberg BR, Moore BC (1990) Derivation of auditory filter shapes from notched-noise data. Hear Res 47:103–138.
Gratton MA, Bateman K, Cannuscio JF, Saunders JC (2008) Outer- and middle-ear contributions to presbycusis in the Brown Norway rat. Audiol Neurootol 13:37–52. http://dx.doi.org/10.1159/000107551.
Grossberg S, Govindarajan KK, Wyse LL, Cohen MA (2004) ARTSTREAM: a neural network model of auditory scene analysis and source segregation. Neural Netw 17:511–536.
Happel MF, Jeschke M, Ohl FW (2010) Spectral integration in primary auditory cortex attributable to temporally precise convergence of thalamocortical and intracortical input. J Neurosci 30:11114–11127.
Horikawa J, Ito S, Hosokawa Y, Homma T, Murata K (1988) Tonotopic representation in the rat auditory-cortex. Proc Jpn Acad B Phys 64:260–263.
Horikawa J, Hess A, Nasu M, Hosokawa Y, Scheich H, Taniguchi I (2001) Optical imaging of neural activity in multiple auditory cortical fields of guinea pigs. NeuroReport 12:3335–3339.
Hoshino O (2007) Enhanced sound perception by widespread-onset neuronal responses in auditory cortex. Neural Comput 19:3310–3334.
Hosokawa Y, Horikawa J, Nasu M, Sugimoto S, Taniguchi I (1998) Anisotropic neural interaction in the primary auditory cortex of guinea pigs with sound stimulation. NeuroReport 9:3421–3425.
Inaoka T, Shintaku H, Nakagawa T, Kawano S, Ogita H, Sakamoto T, Hamanishi S, Wada H, Ito J (2011) Piezoelectric materials mimic the function of the cochlear sensory epithelium. Proc Natl Acad Sci U S A 108:18390–18395.
Izhikevich EM (2007) Neural excitability. In: Dynamical systems in neuroscience. Cambridge: MIT Press.
Kaas JH (2011) The evolution of auditory cortex: the core areas. In: Winer JA, Schreiner CE, editors. The auditory cortex. New York: Springer. p. 407–427.
Kalatsky VA, Polley DB, Merzenich MM, Schreiner CE, Stryker MP (2005) Fine functional organization of auditory cortex revealed by Fourier optical imaging. Proc Natl Acad Sci U S A 102:13325–13330.
Kaur S, Lazar R, Metherate R (2004) Intracortical pathways determine breadth of subthreshold frequency receptive fields in primary auditory cortex. J Neurophysiol 91:2551–2567.
Kilgard MP, Pandya PK, Vazquez J, Gehi A, Schreiner CE, Merzenich MM (2001) Sensory input directs spatial and temporal plasticity in primary auditory cortex. J Neurophysiol 86:326–338.
Kloc M, Maffei A (2014) Target-specific properties of thalamocortical synapses onto layer 4 of mouse primary visual cortex. J Neurosci 34:15455–15465.
Kloeden PE, Platen E (1992) Numerical solution of stochastic differential equations (Applications of Mathematics 23). Berlin: Springer-Verlag.
Kobayasi KI, Usami A, Riquimaroux H (2012) Behavioral evidence for auditory induction in a species of rodent: Mongolian gerbil (Meriones unguiculatus). J Acoust Soc Am 132:4063–4068.
Koch C (1999) Synaptic input. In: Biophysics of computation. New York: Oxford University Press. p. 85–116.
Kubota M, Sugimoto S, Horikawa J (2008) Dynamic spatiotemporal inhibition in the guinea pig auditory cortex. NeuroReport 19:1691–1694.
Kubota M, Miyamoto A, Hosokawa Y, Sugimoto S, Horikawa J (2012) Spatiotemporal dynamics of neural activity related to auditory induction in the core and belt fields of guinea-pig auditory cortex. NeuroReport 23:474–478.
Las L, Stern EA, Nelken I (2005) Representation of tone in fluctuating maskers in the ascending auditory system. J Neurosci 25:1503–1513.
Lee CC, Schreiner CE, Imaizumi K, Winer JA (2004) Tonotopic and heterotopic projection systems in physiologically defined auditory cortex. Neuroscience 128:871–887.
Levy RB, Reyes AD (2011) Coexistence of lateral and co-tuned inhibitory configurations in cortical networks. PLoS Comput Biol 7:e1002161.
Liu BH, Wu GK, Arbuckle R, Tao HW, Zhang LI (2007) Defining cortical frequency tuning with recurrent excitatory circuitry. Nat Neurosci 10:1594–1600.
Loebel A, Nelken I, Tsodyks M (2007) Processing of sounds by population spikes in a model of primary auditory cortex. Front Neurosci 1:197–209.
Meddis R, O’Mard LP, Lopez-Poveda EA (2001) A computational algorithm for computing nonlinear auditory frequency selectivity. J Acoust Soc Am 109:2852–2861.
Meyer GF, Greenlee M, Wuerger S (2011) Interactions between auditory and visual semantic stimulus classes: evidence for common processing networks for speech and body actions. J Cogn Neurosci 23:2291–2308.
Micheyl C, Carlyon RP, Shtyrov Y, Hauk O, Dodson T, Pulvermüller F (2003) The neurophysiological basis of the auditory continuity illusion: a mismatch negativity study. J Cogn Neurosci 15:747–758.
Miller CT, Dibble E, Hauser MD (2001) Amodal completion of acoustic signals by a nonhuman primate. Nat Neurosci 4:783–784.
Moore BC, Glasberg BR (1983) Suggested formulae for calculating auditory-filter bandwidths and excitation patterns. J Acoust Soc Am 74:750–753.
Moruzzi G, Magoun HW (1949) Brain stem reticular formation and activation of the EEG. Electroencephalogr Clin Neurophysiol 1:455–473.
Nagumo J, Arimoto S, Yoshizawa S (1962) Active pulse transmission line simulating nerve axon. Proc IRE 50:2061.


Oviedo HV, Bureau I, Svoboda K, Zador AM (2010) The functional asymmetry of auditory cortex is reflected in the organization of local cortical circuits. Nat Neurosci 13:1413–1420.
Pandya PK, Rathbun DL, Moucha R, Engineer ND, Kilgard MP (2008) Spectral and temporal processing in rat posterior auditory cortex. Cereb Cortex 18:301–314.
Patterson RD, Robinson K, Holdsworth J, McKeown D, Zhang C, Allerhand MH (1992) Complex sounds and auditory images. In: Cazals Y et al., editors. Auditory physiology and perception. Oxford: Pergamon. p. 429–446.
Pasquale V, Massobrio P, Bologna LL, Chiappalone M, Martinoia S (2008) Self-organization and neuronal avalanches in networks of dissociated cortical neurons. Neuroscience 153:1354–1369. http://dx.doi.org/10.1016/j.neuroscience.2008.03.050.
Petkov CI, O’Connor KN, Sutter ML (2003) Illusory sound perception in macaque monkeys. J Neurosci 23:9155–9161.
Petkov CI, O’Connor KN, Sutter ML (2007) Encoding of illusory continuity in primary auditory cortex. Neuron 54:153–165.
Philibert B, Laudanski J, Edeline JM (2005) Auditory thalamus responses to guinea-pig vocalizations: a comparison between rat and guinea-pig. Hear Res 209:97–103.
Polley DB, Read HL, Storace DA, Merzenich MM (2007) Multiparametric auditory receptive field organization across five cortical fields in the albino rat. J Neurophysiol 97:3621–3638.
Repp BH, Lin HB (1991) Effects of preceding context on the voice-onset-time category boundary. J Exp Psychol Hum Percept Perform 17:289–302.
Riecke L, van Opstal AJ, Goebel R, Formisano E (2007) Hearing illusory sounds in noise: sensory-perceptual transformations in primary auditory cortex. J Neurosci 27:12684–12689.
Roger M, Arnault P (1989) Anatomical study of the connections of the primary auditory area in the rat. J Comp Neurol 287:339–356.
Romanski LM, LeDoux JE (1993) Organization of rodent auditory cortex: anterograde transport of PHA-L from MGv to temporal neocortex. Cereb Cortex 3:499–514.
Rutkowski RG, Miasnikov AA, Weinberger NM (2003) Characterisation of multiple physiological fields within the anatomical core of rat auditory cortex. Hear Res 181:116–130.
Ryugo DK, Killackey HP (1974) Differential telencephalic projections of medial and ventral divisions of medial geniculate-body of rat. Brain Res 82:173–177.
Sally SL, Kelly JB (1988) Organization of auditory cortex in the albino rat: sound frequency. J Neurophysiol 59:1627–1638.
Schreiner CE, Read HL, Sutter ML (2000) Modular organization of frequency integration in primary auditory cortex. Annu Rev Neurosci 23:501–529.
Sharpee TO (2013) Computational identification of receptive fields. Annu Rev Neurosci 36:103–120.
Sharpee TO, Miller KD, Stryker MP (2008) On the importance of static nonlinearity in estimating spatiotemporal neural filters with natural stimuli. J Neurophysiol 99:2496–2509.
Shoham D, Glaser DE, Arieli A, Kenet T, Wijnbergen C, Toledo Y, Hildesheim R, Grinvald A (1999) Imaging cortical dynamics at high spatial and temporal resolution with novel blue voltage-sensitive dyes. Neuron 24:791–802.
Song WJ, Kawaguchi H, Totoki S, Inoue Y, Katura T, Maeda S, Inagaki S, Shirasawa H, Nishimura M (2006) Cortical intrinsic circuits can support activity propagation through an isofrequency strip of the guinea pig primary auditory cortex. Cereb Cortex 16:718–729.
Sugita Y (1997) Neuronal correlates of auditory induction in the cat cortex. NeuroReport 8:1155–1159.
Tan AY, Wehr M (2009) Balanced tone-evoked synaptic excitation and inhibition in mouse auditory cortex. Neuroscience 163:1302–1315.
Tateno T, Harsch A, Robinson HP (2004) Threshold firing frequency-current relationships of neurons in rat somatosensory cortex: type 1 and type 2 dynamics. J Neurophysiol 92:2283–2294.
Tateno T, Nishikawa J, Tsuchioka N, Shintaku H, Kawano S (2013) A hardware model of the auditory periphery to transduce acoustic signals into neural activity. Front Neuroeng 6:12.
Theunissen FE, Elie JE (2014) Neural processing of natural sounds. Nat Rev Neurosci 15:355–366.
Tsodyks MV, Markram H (1997) The neural code between neocortical pyramidal neurons depends on neurotransmitter release probability. Proc Natl Acad Sci U S A 94:719–723.
Tsytsarev V, Yamazaki T, Ribot J, Tanaka S (2004) Sound frequency representation in cat auditory cortex. NeuroImage 23:1246–1255.
Verbny YI, Merriam EB, Banks MI (2005) Modulation of gamma-aminobutyric acid type A receptor-mediated spontaneous inhibitory postsynaptic currents in auditory cortex by midazolam and isoflurane. Anesthesiology 102(5):962–969.
Warren RM, Bashford Jr JA (1999) Intelligibility of 1/3-octave speech: greater contribution of frequencies outside than inside the nominal passband. J Acoust Soc Am 106:L47–L52.
Warren RM, Obusek CJ, Ackroff JM (1972) Auditory induction: perceptual synthesis of absent sounds. Science 176:1149–1151.
Wehr M, Zador AM (2003) Balanced inhibition underlies tuning and sharpens spike timing in auditory cortex. Nature 426:442–446.
Wehr M, Zador AM (2005) Synaptic mechanisms of forward suppression in rat auditory cortex. Neuron 47:437–445.
Winer JA, Miller LM, Lee CC, Schreiner CE (2005) Auditory thalamocortical transformation: structure and function. Trends Neurosci 28:255–263.
Wrigley SN, Brown GJ (2004) A computational model of auditory selective attention. IEEE Trans Neural Netw 15:1151–1163.
Wu GK, Arbuckle R, Liu BH, Tao HW, Zhang LI (2008) Lateral sharpening of cortical frequency tuning by approximately balanced inhibition. Neuron 58:132–143.
Xu X, Roby KD, Callaway EM (2010) Immunochemical characterization of inhibitory mouse cortical neurons: three chemically distinct classes of inhibitory cells. J Comp Neurol 518:389–404.
Zhang LI, Tan AY, Schreiner CE, Merzenich MM (2003) Topography and synaptic shaping of direction selectivity in primary auditory cortex. Nature 424:201–205.
Zhang LI, Zhou Y, Tao HW (2011) Perspectives on: information and coding in mammalian sensory physiology: inhibitory synaptic mechanisms underlying functional diversity in auditory cortex. J Gen Physiol 138:311–320.

APPENDIX A

A.1. A general concept for the computational model of the auditory pathway

In this study, to mimic the activity of the rat AC, the spatial pitch network (SPINET) model (Cohen et al., 1995; Grossberg et al., 2004) was chosen, and some parts of the model were used as a preprocessing unit for the peripheral and subcortical functions. The SPINET model was originally developed to simulate psychophysical data concerning how the brain converts sound streams into frequency spectra that activate spatial representations of pitch, and it can successfully simulate psychophysical phenomena including auditory (or temporal) induction. However, because the SPINET model cannot reproduce the nonlinear phenomena observed in the optical imaging reported here, we added an excitable dynamical system with fast and slow variables as the final processing units in the AC. There are many families of excitable system models (Izhikevich, 2007), in which at least two stable equilibrium points are needed in the phase plane. Here, we have chosen the FHN equations (FitzHugh, 1961) because they are of simple form and sufficiently flexible to allow a wide range of nonlinear dynamics. Therefore, our computational model is composed of three main stages: peripheral, subcortical, and cortical.


The first stage of the model simulates peripheral auditory processing, including a cochlear filter bank. The second stage calculates an energy measure from the filter bank output on the basis of short-time window spectra, and extracts and integrates the output of different channels in the filter bank. The third stage consists of a one-dimensional array of cortical columns, each of which contains an excitable system. In the following, we describe some details of our model from the periphery to the central auditory system (also see Experimental procedures).

A.2. Peripheral auditory system

The outer and middle ears can act as a broad bandpass filter, linearly boosting frequencies between 100 and 20,000 Hz. The model is based on the reported result that the frequency transfer functions of the rat outer and middle ears have a slope of approximately 6.0 dB per octave in a range of 1.0–10 kHz, with single peaks located at around 20 kHz (Gratton et al., 2008). An approximation to this is to preemphasize the signal using a simple difference equation:

$$y(t) = x(t) - A\,x(t - \Delta t), \qquad \text{(A1)}$$

where A is the preemphasis parameter, and Δt is the sampling interval (Grossberg et al., 2004). In the simulations, A was set to 0.95 and Δt = 10 or 25 μs. The frequency selectivity of the basilar membrane is modeled by a bank of 32 GFs (Patterson et al., 1992) distributed in frequency between 1.0 and 32 kHz on the ERB scale (Glasberg and Moore, 1990). Each filter simulates the linear response of the basilar membrane at a specific position along its length. The GF of 4th order with a center frequency f0 in Hz is given as

$$g_{f_0}(t) = t^{3} \exp\left(-2\pi b(f_0)\,t\right) \cos\left(2\pi f_0 t + \varphi\right) H(t), \qquad \text{(A2)}$$

where φ is the phase, H(t) is the unit step (Heaviside) function, and b(f0) is related to the bandwidth parameter described as

$$b(f_0) = 1.02\,\mathrm{ERB}(f_0). \qquad \text{(A3)}$$

The ERB of a GF is the equivalent bandwidth that a rectangular filter would have if it passed the same power:

$$\mathrm{ERB}(f) = 6.23 \times 10^{-6} f^{2} + 93.39 \times 10^{-3} f + 28.52. \qquad \text{(A4)}$$

In addition, the frequency response of the GF is described as

$$G_{f_0}(f) = \left[1 + j(f - f_0)/b\right]^{-4}. \qquad \text{(A5)}$$

Hence, the output of each GF can be converted into an energy measure in both the time and frequency domains.

A.3. Energy measure

The output of each GF is converted into an energy measure, which calculates energy spectra during a short time window (Cohen et al., 1995):

$$e_f(t) = \frac{\Delta t}{W} \sum_{k=0}^{W/\Delta t} \left| g_f(t - k\Delta t) \right|^{2} e^{-\alpha \Delta t k}, \qquad \text{(A6)}$$

where e_f(t) is the energy measure of the output g_f(t) of the GF centered at frequency f at time t, W is the time-window length over which the energy measure is computed, and α represents the decay of the exponential window. In the simulations, we used the following values: α = 0.995 and W = 5 ms (Grossberg et al., 2004). The output of the energy measure feeds into the cortical layer of nonlinear dynamical systems after compression, extraction, and summation (see text).
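As a rough illustration of the peripheral stage, the sketch below implements the preemphasis step of Eq. (A1), the ERB formula of Eq. (A4), and a sampled version of the 4th-order gammatone impulse response of Eqs. (A2) and (A3). The function names, the truncation of the impulse response to a finite number of samples, the absence of gain normalization, and the direct-form convolution are assumptions made for this example; the original simulation code may differ.

```python
import math


def erb(f_hz):
    """Equivalent rectangular bandwidth in Hz, Eq. (A4), with f in Hz."""
    return 6.23e-6 * f_hz ** 2 + 93.39e-3 * f_hz + 28.52


def preemphasize(x, a=0.95):
    """First-order difference y[n] = x[n] - a * x[n-1], Eq. (A1)."""
    return [x[0]] + [x[n] - a * x[n - 1] for n in range(1, len(x))]


def gammatone_ir(f0_hz, dt_s, n_samples, phase=0.0):
    """Sampled, unnormalized 4th-order gammatone impulse response, Eq. (A2)."""
    b = 1.02 * erb(f0_hz)  # bandwidth parameter, Eq. (A3)
    out = []
    for n in range(n_samples):
        t = n * dt_s
        envelope = t ** 3 * math.exp(-2.0 * math.pi * b * t)
        out.append(envelope * math.cos(2.0 * math.pi * f0_hz * t + phase))
    return out


def apply_filter(x, ir):
    """Causal convolution of x with a truncated impulse response ir."""
    return [sum(ir[k] * x[n - k] for k in range(min(n + 1, len(ir))))
            for n in range(len(x))]


if __name__ == "__main__":
    dt = 25e-6  # sampling interval of 25 microseconds, as in the text
    tone = [math.sin(2.0 * math.pi * 4000.0 * n * dt) for n in range(800)]
    channel = apply_filter(preemphasize(tone), gammatone_ir(4000.0, dt, 200))
    print(f"ERB(4 kHz) = {erb(4000.0):.1f} Hz, peak output = {max(channel):.3e}")
```

A full filter bank would repeat `gammatone_ir` for each of the 32 center frequencies; the spacing of those frequencies on the ERB scale is not reproduced here.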

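The energy measure of Eq. (A6) is an exponentially weighted average of the squared filter output over a window of length W. A minimal sketch is given below. The function name `energy_measure`, the handling of the first samples (where fewer than W/Δt past values exist), and the choice to express Δt and W in milliseconds with α in inverse milliseconds are assumptions for this example, since the units of α are not stated in the text.

```python
import math


def energy_measure(g, dt, window, alpha):
    """Exponentially windowed short-time energy of a filter output, Eq. (A6).

    g is the sampled output of one gammatone channel; dt and window must be
    given in the same time unit (assumed here to be milliseconds).
    """
    n_win = int(round(window / dt))
    weights = [(dt / window) * math.exp(-alpha * dt * k)
               for k in range(n_win + 1)]
    power = [abs(sample) ** 2 for sample in g]
    out = []
    for n in range(len(power)):
        k_max = min(n, n_win)  # near the start, only past samples that exist
        out.append(sum(weights[k] * power[n - k] for k in range(k_max + 1)))
    return out


if __name__ == "__main__":
    dt_ms, window_ms = 0.025, 5.0
    burst = [math.sin(2.0 * math.pi * 4.0 * n * dt_ms) for n in range(600)]
    energy = energy_measure(burst, dt_ms, window_ms, alpha=0.995)
    print(f"energy at the last sample: {energy[-1]:.4f}")
```

With α = 0 the weights reduce to a plain moving average over W/Δt + 1 samples, which gives a convenient sanity check for an implementation.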
A.4. Compression, extraction, and summation before processing in the cortical layer

From the subcortical layer to the cortical layer, we considered thalamocortical projections, in which the signals are compressed, extracted, and integrated. In addition, in the intracortical connections (both intra- and inter-columnar connections) of the AC, the signals are recurrently processed. Before processing in the final AC layer, which is a column array consisting of nonlinear dynamical systems, the output signals of all channels are compressed following the model proposed by Meddis et al. (2001). The Meddis et al. model was based on data for the response of the chinchilla basilar membrane to a 10-kHz (BF) tone burst. To match the compressive function to our model system, we adjusted the model parameters. The compressive function was described as

$$s(x) = \begin{cases} 0.002\,x & \text{for } x \le 35.2\ \text{dB}, \\ 0.15\,x^{0.25} - 0.298 & \text{for } x > 35.2\ \text{dB}, \end{cases} \qquad \text{(A7)}$$

where x is the energy measure output. The function is linear at low signal levels, whereas it is nonlinear at higher signal levels.

A.5. Synaptic fluctuation model

Synaptic fluctuation as background noise is modeled by the last terms on the right-hand sides of Eqs. (4a) and (4b), where η_E and η_I are noise intensities and ξ_E(t) and ξ_I(t) are both Ornstein–Uhlenbeck processes with zero means (Destexhe et al., 2001). That is, the Ornstein–Uhlenbeck noise is described as

$$\frac{d\xi_i(t)}{dt} = -\frac{\xi_i(t)}{\tau_{OU}} + \zeta(t), \qquad \text{(A8)}$$

where the subscript i = E or I, τ_OU is a time constant, and ζ(t) is the standard white Gaussian noise. Unless otherwise described, all default parameter values listed in Tables 1–3 were used in our simulations. All numerical simulations of the model were performed by the Euler–Maruyama method or the Heun method with a fixed time step of 0.1 or 0.05 ms (Kloeden and Platen, 1992). The same results were obtained by the two numerical methods.
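For the Ornstein–Uhlenbeck noise of Eq. (A8), an Euler–Maruyama update adds a deterministic relaxation term proportional to the time step and a Gaussian increment proportional to the square root of the time step. The sketch below shows this for a single process. The function name `simulate_ou`, the time constant of 5 ms, and the seed are hypothetical choices for illustration; the values actually used are those of Tables 1–3, which are not reproduced here.

```python
import math
import random


def simulate_ou(tau, dt, n_steps, seed=0):
    """Euler-Maruyama path of d(xi)/dt = -xi/tau + zeta(t), Eq. (A8).

    zeta(t) is unit-intensity white Gaussian noise, so each step adds a
    normal increment with standard deviation sqrt(dt).
    """
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    xi = 0.0
    path = [xi]
    for _ in range(n_steps):
        xi += -xi / tau * dt + sqrt_dt * rng.gauss(0.0, 1.0)
        path.append(xi)
    return path


if __name__ == "__main__":
    tau_ms, dt_ms = 5.0, 0.05  # tau_ms is an assumed value for illustration
    path = simulate_ou(tau_ms, dt_ms, n_steps=200_000, seed=1)
    mean = sum(path) / len(path)
    var = sum((p - mean) ** 2 for p in path) / len(path)
    print(f"sample variance {var:.2f}, theoretical tau/2 = {tau_ms / 2:.2f}")
```

For unit-intensity noise the stationary variance of this process is τ_OU/2, so comparing the sample variance of a long path against that value is a simple check of the step-size scaling.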

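The compressive function of Eq. (A7) in Section A.4 can also be written as a short piecewise function. The sketch below assumes the input is the energy measure expressed in dB, as the breakpoint of 35.2 dB suggests; the function name `compress` is hypothetical.

```python
def compress(x_db):
    """Piecewise compressive function of Eq. (A7).

    Linear below the 35.2 dB breakpoint, fourth-root compression above it.
    """
    if x_db <= 35.2:
        return 0.002 * x_db
    return 0.15 * x_db ** 0.25 - 0.298


if __name__ == "__main__":
    for level in (10.0, 35.2, 60.0, 100.0):
        print(f"s({level:5.1f} dB) = {compress(level):.4f}")
```

With the rounded coefficients quoted in the text, the two branches differ by about 0.003 at the breakpoint, so the function is only approximately continuous there.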
(Accepted 31 December 2015) (Available online 7 January 2016)