Response characteristics of primary auditory cortex neurons underlying perceptual asymmetry of ramped and damped sounds

Neuroscience 256 (2014) 309–321

J. WANG a, L. QIN a,b, S. CHIMOTO a, S. TAZUNOKI a AND Y. SATO a,*

Key words: auditory cortex, response asymmetry, response duration, sound envelope, integration time, perceptual asymmetry.

a Department of Physiology, Interdisciplinary Graduate School of Medicine and Engineering, University of Yamanashi, Chuo, Yamanashi 409-3898, Japan
b Department of Physiology, China Medical University, Shenyang 110001, People’s Republic of China

INTRODUCTION

Because all sounds begin and end at some time point, amplitude must increase (referred to as attack) at the onset and decrease (decay) at the offset. Patterson (1994a,b) called a sound with an instantaneous attack followed by a slow exponential decay a damped sound, while the time-reversed sound, with a slow exponential attack followed by an instantaneous decay, was called a ramped sound. Ramped and damped sounds were shown to differ in terms of perception of timbre, loudness, and subjective duration: a ramped tone is stronger in tonal timbre (Patterson, 1994a,b), louder (Irino and Patterson, 1996), and longer in subjective duration (Schlauch et al., 2001) than a damped tone, even if the two tones are equal in physical duration and intensity. Attack is primarily determined by the forces that drive an acoustic medium to vibrate: plucking the acoustic medium produces a sound with a quick attack, while bowing the medium produces a sound with a slow attack (Cutting and Rosner, 1974; Rosen and Howel, 1981; Cutting, 1982). Decay, on the other hand, is determined by multiple factors, including both the physical characteristics of the acoustic medium (Lutfi and Stoelinga, 2010) and the reverberant environment. Stecker and Hafter (2000) proposed that a listener has the ability to parse a sound to recover information about the physical characteristics of the acoustic medium and the reverberant environment. The asymmetrical perception of loudness was interpreted as a phenomenon of perceptual constancy related to the parsing of auditory input into direct and reverberant sound (Stecker and Hafter, 2000). The listener attributes the slow decay portion to reverberation and excludes it from the judgment of loudness, so that a damped sound is perceived as softer.
For duration matching experiments of ramped and damped sounds in humans, Digiovanni and Schlauch (2007) instructed one group of participants to simply match the duration and the other group to include all aspects of the sounds: the former yielded a longer subjective duration for ramped sounds than for damped sounds, whereas the latter significantly reduced the size of the asymmetry in subjective duration. They

Abstract: Sound envelope plays a crucial role in perception: ramped sounds (slow attack and quick decay) are louder and longer in subjective duration than damped sounds (quick attack and slow decay) even if they are equal in intensity and physical duration. To explain the asymmetrical perception, the perceptual constancy hypothesis supposes that the listener eliminates the slow decay of damped sounds from the perceptual judgment, while the persistence of perception hypothesis supposes asymmetrical neural responses after the source has stopped. To understand the neural mechanisms underlying the perceptual asymmetry, we explored response properties of primary auditory cortex (A1) neurons during ramped and damped stimuli in awake cats. We found two distinct types of cells tuned to specific features of the sound envelope: edge cells, sensitive to temporal edges such as quick attack and decay, and slope cells, sensitive to slow attack and decay. The former need a short (<2.5 ms) period of stimulus duration for evoking maximal peak responses, while the latter need a long (20 ms) period, suggesting that the timescale of processing underlies the differential sensitivity between the cell types. The findings suggest that perceptual constancy is not yet executed at A1 because specific cells distinguishing the direction of amplitude change (attack or decay) are lacking in A1. On the other hand, there is evidence of persistence of perception: the overall response duration during ramped sound was up to 1.4 times that during damped sound, originating mainly from the response asymmetry of the edge cells (sensitive to the quick decay of ramped sounds but not to the slow decay of damped sounds), and neuronal persistence of excitation after the termination of ramped sounds was substantially longer than that of damped sounds, corresponding to the psychological evidence of persistence of perception. © 2013 IBRO. Published by Elsevier Ltd. All rights reserved.

*Corresponding author. Tel: +81-55-273-9482; fax: +81-55-2736731. E-mail address: [email protected] (Y. Sato). Abbreviations: A1, primary auditory cortex; BF, best frequency; PSTHs, peri-stimulus time histograms; RDI, response duration index; SCI, spike count index; SPL, sound pressure level.

0306-4522/13 $36.00 © 2013 IBRO. Published by Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.neuroscience.2013.10.042

suggested that the reduction in the size of the asymmetry is related to perceptual constancy and that the remaining perceptual difference is due to a sensory process of persistence of perception, which is thought of as “the continuation of a sound’s internal representation in the auditory nervous system after the source has stopped” (Digiovanni and Schlauch, 2007). They showed, by temporal masking experiments in humans, the asymmetrical time course of the internal representation of ramped and damped sounds (Digiovanni and Schlauch, 2007). Ries et al. (2008) showed that the persistence of perception hypothesis explained the loudness matching data. Thus, it is well known that sound envelope plays a crucial role in perception. Nevertheless, little is known about the neural response characteristics during ramped and damped stimuli. We investigated (1) the response characteristics of primary auditory cortex (A1) neurons of awake cats during ramped and damped stimuli, and (2) whether or not these response characteristics support the hypotheses of perceptual constancy and persistence of perception.

EXPERIMENTAL PROCEDURES

Experiments were performed in accordance with the Guidelines for Animal Experiments, University of Yamanashi, and the Guiding Principles for the Care and Use of Animals approved by the Council of the Physiological Society of Japan.

Animal preparation, recording, and histology

Three cats were chronically prepared for single-unit recordings from both hemispheres of the auditory cortex in a manner similar to that described previously (Qin et al., 2007, 2008a,b, 2009). Under pentobarbital sodium anesthesia (initial dose 40 mg/kg) and aseptic conditions, cats had an aluminum cylinder (inner diameter 12 mm) implanted into the bilateral temporal bone for microelectrode access, at an angle of 10–20° from the sagittal plane. A metal block was embedded in the dental acrylic cap to immobilize the head. After at least 1 week of postoperative recovery, the cats were acclimated to the experimental conditions. Each cat’s body was gently wrapped in a cloth bag and the head was restrained with holding bars for several minutes. In successive daily sessions, the period was lengthened, and the cats were familiarized with sitting in an electrically shielded, sound-attenuated chamber. The animals were given food and water during the sessions. The conditioning procedure lasted at least 2 weeks. When recording experiments began, they sat with no sign of discomfort or restlessness. One day before the recording session, the bone (diameter 1–2 mm) at the bottom of the cylinder was removed under ketamine anesthesia (initial dose 15 mg/kg), leaving the dura intact. The recording session began the following day. The dura was pierced with a sharpened probe, and an epoxylite-insulated tungsten microelectrode (impedance: 2–5 MΩ at 1 kHz; FHC Inc.) was advanced into A1 with a remote-controlled micromanipulator (MO-951;

Narishige). Extracellular single-unit activities were recorded and discriminated using a template-matching discriminator (ASD, Alpha-Omega Engineering). Search stimuli were tone bursts of variable single frequency and sound pressure level (SPL, in dB re: 20 µPa). The cat’s face, particularly the eyes, was continuously observed on a monitor connected to a camera in front of the cat. Saccadic eye movements and eye fixation were judged as signs of an awake state (Chimoto et al., 2002). Rapid eye movements of paradoxical sleep were easily identified by their characteristic appearance of half-opened eyelids, and were judged as signs of sleep (Chimoto et al., 2002). When drowsiness was suspected, the cat was alerted by gently tapping the body with a remote-controlled tapping tool, or by briefly opening and closing the door. The cats sometimes moved during recording sessions, producing artifacts in the recording. By carefully checking the monitors and the spike train, artifacts were marked in the computer in real time while recordings were in progress. Data with artifacts could therefore be excluded. Daily recording sessions lasted 3–5 h for 2–6 months for each animal. At the end of each session, the recording chamber was rinsed with sterile saline and antibiotic fluid (sulfamethoxazole, Taisho Pharmaceutical Co, Saitama, Japan), and sealed with Exafine (GC Corporation, Tokyo, Japan) and an aluminum cap. The animal was returned to its cage. The animals remained healthy throughout the experimental period. At the termination of the experiment, some of the recording sites were marked with electrolytic lesions by passing cathodal current (25 µA, 10 s). Each animal was deeply anesthetized with sodium pentobarbital and perfused with 10% formalin before the brain was removed. The brain surface was photographed. The cerebral cortex was cut into transverse sections and stained with neutral red. Each section was captured using a scanner.
On the basis of the lesion locations and electrode tracks, the recording sites were reconstructed on the image of the section.

Sound delivery

The sound delivery system was controlled by a custom-made program written in MATLAB (MathWorks). Digitally generated waveforms of sound stimuli were fed into a 16-bit digital-to-analog converter (PCI-6052E; National Instruments, Austin, USA) at a sampling frequency of 100 kHz and to an 8-pole Chebyshev filter (P-86; NF Electric Instruments, Yokohama, Japan) with a high cutoff frequency of 20 kHz. The outputs were sent to a low-output-impedance power amplifier (PMA2000III; Denon, Kawasaki, Japan) and played through a speaker (K1000; AKG, New York, USA) placed 2 cm away from the auricle contralateral to the recording site. We calibrated the sound delivery system between 128 and 16,000 Hz at frequency steps of 8 Hz, and the output varied by ±5 dB. Harmonic distortion was less than −60 dB.

Stimulus paradigm and data analysis

While recording a single neuron, we first presented tone bursts (5 ms in rise/fall time; 100 ms in stimulus duration)

with various frequencies (125 steps in the range 128–16,000 Hz) at 50-dB SPL. Spike activities were analyzed using user-written programs in a MATLAB (MathWorks) environment. The driven rate (firing rate during a given period minus the spontaneous firing rate during the prestimulus period) was calculated for each stimulus frequency to construct the iso-intensity frequency response function. By comparing the heights of the frequency response functions (128 Hz in resolution), we defined the best frequency (BF) as the frequency producing the maximum response amplitude. For the damped stimulus, with instantaneous attack and slow decay, the amplitude of the sound wave decreases exponentially with a time constant set to 1/5th of the stimulus duration and is given by $x(t) = e^{-5t/T}\,w(t)$, where $w$ is the rectangular gated sinusoid, $t$ is time in seconds, and $T$ is the duration of $w$ in seconds (Schlauch et al., 2001; Ries et al., 2008). The ramped stimulus, with slow attack and instantaneous decay, is simply the time-reversed version of the damped stimulus. The carrier frequency was set at the cell’s BF. We presented 20 repetitions of ramped and damped stimuli at six different durations (2.5, 5, 10, 20, 40, and 80 ms). In total, 240 stimuli (2 types × 6 durations × 20 repetitions) were randomly presented in one session. The peak amplitude of ramped and damped stimuli was 50-dB SPL (trough amplitude 6.6 dB SPL) unless noted otherwise. In this study, BF was in the range of 800–16,000 Hz, and recorded cells with BF < 800 Hz were discarded to ensure at least two cycles of the sinusoid during a 2.5-ms stimulus period. In some neurons we presented 20 repetitions of ramped and damped stimuli at three different amplitudes (30, 50, and 70-dB SPL) at 80-ms duration. We adopted the 80-ms duration because neural response patterns were extensively explored at this duration.
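The stimulus definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the carrier is taken to be a sine starting at zero phase, the 4-kHz carrier stands in for a cell's BF, and the function names are hypothetical rather than those of the authors' MATLAB program. The exponential envelope falls to e^{-5} of its peak at the end of the stimulus, which is about 43.4 dB below the peak and reproduces the stated trough of 6.6 dB SPL for a 50-dB SPL peak.

```python
import math

def damped_tone(carrier_hz, duration_s, sample_rate_hz=100_000):
    """Damped tone x(t) = exp(-5 t / T) * sin(2 pi f t) for 0 <= t < T.

    The sine phase is an assumption; the text only specifies a gated sinusoid.
    """
    n = round(duration_s * sample_rate_hz)
    samples = []
    for i in range(n):
        t = i / sample_rate_hz
        envelope = math.exp(-5.0 * t / duration_s)
        samples.append(envelope * math.sin(2.0 * math.pi * carrier_hz * t))
    return samples

def ramped_tone(carrier_hz, duration_s, sample_rate_hz=100_000):
    """Ramped tone: the time-reversed damped tone."""
    return damped_tone(carrier_hz, duration_s, sample_rate_hz)[::-1]

def trough_level_db_spl(peak_db_spl=50.0):
    """Envelope level at t = T, where the envelope has decayed to exp(-5)."""
    return peak_db_spl + 20.0 * math.log10(math.exp(-5.0))

# Hypothetical example: 80-ms stimuli on a 4-kHz carrier.
damped = damped_tone(carrier_hz=4000.0, duration_s=0.080)
ramped = ramped_tone(carrier_hz=4000.0, duration_s=0.080)
```

Scaling the waveform to an absolute level depends on the calibration of the sound delivery system and is left out of the sketch.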
Spike trains responding to ramped and damped stimuli (20 repetitions) were used to construct peristimulus time histograms in a 1-ms time bin (PSTHs) smoothed by Gaussian function with 5-ms SD. The height of PSTH was transformed into the driven rate by subtracting the mean of background firing rates (firing rate calculated from the 0.5-s period before each stimulus onset). The numerical data were subjected to statistical analyses using R version 2.14.2.
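As a rough sketch of the PSTH construction described above, the following Python code bins spike times at 1 ms, smooths the rate with a Gaussian of 5-ms SD, and subtracts a background rate estimated from the 0.5 s before stimulus onset. The function names, the truncation of the kernel at ±4 SD, and the handling of bins near the window edges are assumptions made for the example, not details given in the text.

```python
import math

def background_rate(trials, pre_s=0.5):
    """Mean firing rate (spikes/s) in the pre_s seconds before stimulus onset.

    trials: list of spike-time lists, in seconds relative to stimulus onset.
    """
    n_spikes = sum(1 for spikes in trials for t in spikes if -pre_s <= t < 0.0)
    return n_spikes / (len(trials) * pre_s)

def psth_driven_rate(trials, t_start, t_stop, background,
                     bin_s=0.001, sigma_s=0.005):
    """Gaussian-smoothed PSTH (spikes/s) minus the background rate."""
    n_bins = round((t_stop - t_start) / bin_s)
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            k = math.floor((t - t_start) / bin_s)
            if 0 <= k < n_bins:
                counts[k] += 1
    rate = [c / (len(trials) * bin_s) for c in counts]

    # Gaussian kernel truncated at +/- 4 SD (assumed) and normalised to sum to 1.
    half = round(4 * sigma_s / bin_s)
    kernel = [math.exp(-0.5 * (j * bin_s / sigma_s) ** 2)
              for j in range(-half, half + 1)]
    total = sum(kernel)
    kernel = [w / total for w in kernel]

    driven = []
    for i in range(n_bins):
        acc = 0.0
        for j, w in enumerate(kernel):
            k = i + j - half
            if 0 <= k < n_bins:
                acc += w * rate[k]
        driven.append(acc - background)
    return driven
```

Because the kernel sums to one, smoothing redistributes spikes across neighbouring bins without changing the total spike count, as long as the response lies well inside the analysis window.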

RESULTS

Cell types based on neural response pattern

We recorded 182 single-unit spike activities responding to ramped and damped stimuli in A1 of awake cats. Stimulus duration was systematically shifted at a stimulus amplitude of 50-dB SPL in 137 cells, and stimulus amplitude was shifted at a stimulus duration of 80 ms in the remaining 45 cells. Fig. 1 shows neural response patterns (driven rates >3SD of the background firing rates) of the 137 cells during ramped and damped stimuli of 80 ms in duration and 50-dB SPL, where the post-stimulus time periods were divided into onset (0–40 ms from the stimulus onset), middle (40–80 ms), and offset (80–120 ms)

periods. Fig. 1 clearly shows that the response pattern is heterogeneous. Individual cells have a response preference for specific periods. A majority of cells preferred the offset period after ramped stimuli and the onset period of damped stimuli (quick decay and quick attack), avoiding the middle period of ramped and damped stimuli (slow attack and slow decay), and some cells also preferred the onset period of ramped stimuli and the offset period of damped stimuli in addition to the offset period of ramped stimuli and the onset period of damped stimuli (Fig. 1, cells in upper region). Other cells preferred the ongoing stimulus periods (onset and middle periods) of ramped and damped stimuli, avoiding the offset period (Fig. 1, intermediate region). The remaining cells have a preference for all three periods when the response periods during ramped and damped stimuli are combined (Fig. 1, bottom region). Thus, based on the combination of the responsive periods during ramped and damped stimuli, we classified cells into three types: edge cells (cell Nos. 1–73) responding during the onset and offset periods but not during the middle period, slope cells (Nos. 74–103) responding during the ongoing stimuli (onset and middle periods) but not during the offset period, and edge-slope cells (Nos. 104–137) responding throughout the three periods. Neuronal responses that were merely decaying during the middle period were considered as originating from the onset period, and the decaying portion was ignored. Similarly, neuronal responses that were merely decaying during the offset period were considered as originating from the middle period, and the decaying portion was ignored. Short responsive periods (<7 successive bins) surrounded by unresponsive periods were ignored. Fig.
2A–D show some variation of response patterns of individual edge cells in the form of PSTHs: pattern 1 (A, n = 42) responding during the offset period of ramped stimuli and the onset period of damped stimuli (two-period response); pattern 2 (B, n = 11) responding during the onset period of ramped and damped stimuli (two-period response); pattern 3 (C, n = 14) responding during the onset period of ramped stimuli in addition to the pattern 1 response (three-period response); pattern 4 (D, n = 6) responding during the onset and offset periods of ramped and damped stimuli (four-period response). Response time courses of an individual slope cell and edge-slope cell are also shown (Fig. 2E, F, respectively), where the response occurs during the onset and middle periods of ramped and damped stimuli in the slope cell (E) and throughout the three periods in the edge-slope cell (F). We then investigated whether or not the response-type classification (edge, slope, edge-slope cells) is maintained when the stimulus amplitude is shifted by ±20 dB from the standard 50-dB SPL in the 45 cells. The majority (eight of eleven) of pattern 1 edge cells changed to pattern 3 (Fig. 3A) or pattern 4 when the intensity was increased. Many (four of ten) of the pattern 3 edge cells changed to pattern 4 (Table 1). On the other hand, many (four of ten) of the pattern 3 edge cells changed to pattern 1 when the stimulus intensity was decreased (Fig. 3B). Thus, in general, an increase in the stimulus intensity tends to

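[Fig. 1 panel labels: Ramp (left) and Damp (right) columns, each divided into onset, middle, and offset periods; rows grouped into edge, slope, and edge-slope cells; axes: Cell No. versus Time (ms).]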

Fig. 1. Neural response patterns of 137 cells during ramped (left column) and damped (right) stimuli. Sound envelopes (80 ms in duration) are shown at the top. Horizontal bars show responsive periods (driven rates >3SD of the background firing rates). Vertical bars are boundaries of the post-stimulus time periods (onset, middle, and offset periods). Based on the response pattern, individual cells were divided into the edge, slope, and edge-slope cells.
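The classification summarised in Fig. 1 can be sketched as a small rule set. The Python code below is a simplified illustration, not the authors' procedure: it assumes that each stimulus type is described by 120 booleans (one per 1-ms bin from stimulus onset, true where the driven rate exceeds the response criterion), it drops responsive runs shorter than 7 successive bins, and it combines the responsive periods of ramped and damped stimuli. Treating any cell without a middle-period response as an edge cell (so that pattern 2 cells, which respond only at onset, are included) is an assumption, and the rule that reassigns merely decaying responses to the preceding period is omitted. All names are hypothetical.

```python
# Period boundaries in ms from stimulus onset (80-ms stimulus).
PERIODS = {"onset": (0, 40), "middle": (40, 80), "offset": (80, 120)}

def drop_short_runs(responsive, min_run=7):
    """Clear runs of responsive bins shorter than min_run successive bins."""
    cleaned = list(responsive)
    start = None
    # A trailing False closes any run that reaches the last bin.
    for i, flag in enumerate(list(responsive) + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start < min_run:
                cleaned[start:i] = [False] * (i - start)
            start = None
    return cleaned

def periods_with_response(responsive):
    """Names of the periods that contain at least one retained responsive bin."""
    cleaned = drop_short_runs(responsive)
    return {name for name, (lo, hi) in PERIODS.items() if any(cleaned[lo:hi])}

def classify_cell(ramped_responsive, damped_responsive):
    """Classify a cell from the union of its responsive periods."""
    union = (periods_with_response(ramped_responsive)
             | periods_with_response(damped_responsive))
    if union == {"onset", "middle", "offset"}:
        return "edge-slope"
    if union == {"onset", "middle"}:
        return "slope"
    if union and "middle" not in union:
        return "edge"
    return "unclassified"
```

For example, a cell responding at 85–100 ms to the ramped stimulus and at 10–25 ms to the damped stimulus would be labelled an edge cell under these rules.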

[Fig. 2 panel labels: A, pattern 1 edge cell; B, pattern 2 edge cell; C, pattern 3 edge cell; D, pattern 4 edge cell; E, slope cell; F, edge-slope cell; axes: driven rate (spikes/s) versus time (s).]

Fig. 2. PSTH (bin width, 1 ms) of representative individual cells of edge (A–D), slope (E), and edge-slope (F) types during ramped (left column) and damped (right) stimuli. Sound envelopes (80 ms in duration) are shown at the top. Stimulus periods are shown as shaded areas. Horizontal dotted lines show 2SD of the background firing rate.

increase the number of response periods (rightward arrows in Table 1) during the onset and offset periods,

while a decrease in the intensity tends to decrease the number of response periods (leftward arrows) during the

[Fig. 3 panel labels: A and B, edge cells (patterns 1–3); C, slope cell; D, edge-slope cell; intensity shifts of +20, 0, and -20 dB; scale bars: 0.1 s and 100 spikes/s.]

Fig. 3. PSTH of individual cells during change of stimulus intensities (50 dB SPL ± 20 dB). Note that the amplitude shift does not change the cell’s response type. See Fig. 2 for other captions.

Table 1. Change of response patterns of edge cells by change of sound intensity

The order of the response patterns was purposefully arranged by the number of response periods, so that rightward arrows indicate an increase in the number of response periods while leftward arrows indicate a decrease. For example, five pattern 1 edge cells changed to pattern 3 with an increase in the stimulus intensity of 20 dB, and the number of response periods increased (rightward arrow) from 2 to 3 during ramped and damped stimuli. Note that the response type was maintained despite the change of sound intensity.

onset and offset periods in the edge cells. It is important to note that the responsive period changes only between the onset and offset periods, avoiding the middle period: the response type of the edge cell (defined by responses during the onset and offset periods but not during the middle period) was maintained even after the amplitude change. For the slope cells (n = 13), the response phase changed within the onset and middle periods, excluding the offset period. For example, an increase in the stimulus amplitude resulted in shorter peak-response latency for ramped stimuli and longer latency for damped stimuli in an example slope cell (Fig. 3C). The response type of the slope cell (defined by responses during the onset and middle periods but not during the offset period) was, thus, maintained even after the amplitude change. For the edge-slope cells (n = 6), response type

was also maintained (Table 1) and the results were intermediate between the edge cells and the slope cells: response amplitude changed during the onset and offset periods like the edge cells, and the response phase changed in the onset and middle periods like the slope cells (Fig. 3D). We conclude that our classification of cell types is independent of the stimulus intensity in the range of 30–70 dB SPL. Averaged population responses of the edge, the slope, and the edge-slope cells are shown in Fig. 4A–C, respectively. The edge cell has a strong response preference for the quick decay and the quick attack (offset period of ramped sounds and onset period of damped sounds), while it has a relatively weak preference for the slow attack and the slow decay (onset and middle periods of ramped stimuli and middle and

[Fig. 4 panel labels: A, edge (n = 73); B, slope (n = 30); C, edge-slope (n = 34); D, overall (n = 137); axes: driven rate (spikes/s) versus time (s).]

Fig. 4. Shift of stimulus durations (80, 40, 20, 10, 5, and 2.5 ms) resulting in change of the population response-time courses during ramped (left column) and damped (right) stimuli in the three indicated cell types (A–C) and overall mean responses (D). Mean (solid line) and standard error (dotted line) of PSTHs are shown. Sound envelopes (80 ms in duration) are shown at the top.

offset periods of damped stimuli), resulting in strong response asymmetry during ramped and damped stimuli (Fig. 4A, top traces). In contrast, the slope cell has a response preference for the ongoing stimuli during the onset and middle periods, resulting in weaker asymmetry during ramped and damped stimuli (Fig. 4B, top traces). Finally, the edge-slope cell has a response period throughout the onset, middle, and offset periods (Fig. 4C, top traces). These results suggest that A1 includes at least two types of functionally specific cells: the edge cell is sensitive to the quick attack and the quick decay, while the slope cell is sensitive to the ongoing stimuli (slow attack during ramped stimuli and slow decay during damped stimuli) at moderate stimulus amplitude.

Time window for evoking spike responses

To understand the neural mechanisms underlying the differential response types described above, we then investigated the time window for evoking spike responses. For this purpose, we systematically shifted the stimulus duration from 80 to 2.5 ms (Fig. 4A–C) in

the 137 cells. In the edge cells (n = 73), peak response height was maintained in spite of the shortening (Fig. 4A); in contrast, in the slope cells (n = 30), the peak response height was substantially decreased by the shortening (Fig. 4B). In the remaining edge-slope cells (n = 34), the results were intermediate between the edge cells and the slope cells: neural responses decreased a little with the shortening (Fig. 4C). These qualitative observations were quantitatively analyzed: the peak-driven rate of an individual cell was calculated and the average was plotted against the stimulus duration (Fig. 5). For statistical evaluation, the data were fit to a logistic regression model:

$$y = \frac{a}{1 + b\,e^{-cx}},$$

and a linear regression model: $y = d\,x + f$, where $y$ is the driven rate and $x$ is the stimulus duration. Coefficients (a, b, c, d, f), 95% confidence intervals, and P-values were calculated and are shown in Table 2. In the edge cells, because the best fit of the linear regression for the slope coefficient d is near 0 and the 95% confidence interval of the slope coefficient d

Fig. 5. Shift of stimulus durations resulting in change of peak response amplitude in the slope cells (C, D) and the edge-slope cells (E, F), but not in the edge cells (A, B), during ramped (A, C, E) and damped (B, D, F) stimuli. Mean of peak-driven rate (open circle) and standard error (vertical bar) is plotted against sound duration. Logistic regression line and its equation are shown.

includes 0 (Table 2), linear model was not significant (Fig. 5A, B, P > 0.05), indicating that the peak-driven rate is independent of the stimulus duration in the range of 2.5–80 ms. Because the peak-driven rate does not decrease by decreasing the stimulus duration to 2.5 ms, the minimum integration time window of the edge cells for evoking peak responses is shorter than 2.5 ms. In contrast, the peak-driven rates for the slope cells were well described by a logistic function (for example, p values are .00016, .03988, and .01818 for coefficients a, b, c, respectively, in Fig. 5C): the driven spike rate increased with increasing stimulus duration from 2.5 to 20 ms and was saturated at 20–80 ms (Fig. 5C–D,

P < 0.05). For the edge-slope cells, the peak-driven rate was also well fit by the logistic function (Fig. 5E, F, P < 0.05), but the difference is that the function was shifted upward (higher d value) and the driven rate at 2.5 ms in duration is rather high around the level of edge cells (compare Fig. 5E, F with A, B). These results suggest that the slope cells have a relatively longer integration time window (20 ms) to evoke maximal peak responses (Fig. 5C, D), and that the edge-slope cells also have a longer time window (20 ms) to evoke maximal peak activities, but a short stimulus (2.5 ms) is still sufficient to evoke spike activities around the level of the edge cells (Fig. 5E, F).
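To make the saturation at about 20 ms concrete, the logistic model can be evaluated at the six stimulus durations. The sketch below assumes the form y = a/(1 + b·exp(−c·x)), with the negative exponent inferred from the reported increase and saturation of the driven rate, and uses the point estimates listed in Table 2 for the slope cells during ramped stimuli (Fig. 5C); the function name is hypothetical.

```python
import math

def logistic(x, a, b, c):
    """Logistic model y = a / (1 + b * exp(-c * x)); a is the asymptote."""
    return a / (1.0 + b * math.exp(-c * x))

# Point estimates for the slope cells, ramped stimuli (Table 2, Fig. 5C).
a, b, c = 82.4, 4.26, 0.18

for duration_ms in (2.5, 5, 10, 20, 40, 80):
    predicted = logistic(duration_ms, a, b, c)
    # Print the predicted peak-driven rate and its fraction of the asymptote.
    print(duration_ms, round(predicted, 1), round(predicted / a, 2))
```

With these coefficients the predicted rate is roughly a quarter of the asymptote at 2.5 ms and about 90% of it at 20 ms, in line with the description of saturation between 20 and 80 ms.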

Table 2. Results of data fitting

Figs.   Number of data   Coeff.(conf. int.)P-value
5A      73               d .003(.08 to .08)        f 77.7(74.6–80.8)***
5B      73               d .03(.09 to .16)         f 74.1(69.3–78.9)***
5C      30               a 82.4(71.6–94.4)***      b 4.26(2.0–15.9)*      c .18(.09 to .40)*
5D      30               a 82.7(70.6–96.5)***      b 4.18(2.0–19.4)**     c .17(.08–.44)*
5E      34               a 115.3(109.2–121.9)***   b .97(.59–1.75)**      c 121.2(115.7–127.0)*
5F      34               a 121.2(115.7–127.0)***   b 2.13(1.28–4.29)*     c .32(.20–.53)**
7A      73               d .0017(.0003–.003)*      f .038 (.02 to .09)
7B      30               d .0003(.0057 to .0064)   f .017(.006 to .006)
7C      34               d .0006(.0001 to .0014)   f .023(.005 to .051)
7D      73               d .0008(.0003 to .0018)   f .053(.013–.093)*
7E      30               d .0003(.0002 to .0008)   f .020(.0002 to .0008)
7F      34               d .0002(.0009 to .0013)   f .021(.020 to .061)

Coefficients a–c for the logistic function y = a/(1 + b × exp(−c × x)), d and f for the linear function y = d × x + f, 95% confidence interval, and P-value (***p < 0.001, **p < 0.01, *p < 0.05, no mark p > 0.05) are shown. Corresponding data are illustrated in the indicated figures. Corresponding fitting functions are also shown in the indicated figures when significant (p < 0.05).

[Fig. 6 axes: correlation coefficient (0 to 1.0) versus sound duration (0 to 80 ms); curves: slope, edge-slope, and edge cells.]

Fig. 6. Shift of stimulus durations resulting in a change of the correlation coefficient between responses to ramped and damped stimuli. Mean of correlation coefficient and standard error (vertical bar) is plotted against sound duration.
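The correlation coefficient plotted in Fig. 6 is the cross correlation between a cell's smoothed PSTHs for ramped and damped stimuli at zero lag. A minimal Python sketch follows, assuming the measure is the mean-subtracted, normalised (Pearson) correlation over the same analysis window for both stimuli; the text does not state the normalisation, and the function name is hypothetical.

```python
import math

def correlation_at_zero_lag(ramped_psth, damped_psth):
    """Pearson correlation between two equally long PSTHs (lag = 0)."""
    n = len(ramped_psth)
    mean_r = sum(ramped_psth) / n
    mean_d = sum(damped_psth) / n
    cov = sum((r - mean_r) * (d - mean_d)
              for r, d in zip(ramped_psth, damped_psth))
    var_r = sum((r - mean_r) ** 2 for r in ramped_psth)
    var_d = sum((d - mean_d) ** 2 for d in damped_psth)
    if var_r == 0 or var_d == 0:
        return 0.0  # A flat PSTH carries no temporal pattern to correlate.
    return cov / math.sqrt(var_r * var_d)

# Toy edge-cell PSTHs (1-ms bins): offset response to the ramped stimulus,
# onset response to the damped stimulus.
ramped = [0.0] * 80 + [100.0] * 20 + [0.0] * 20
damped = [100.0] * 20 + [0.0] * 100
similarity = correlation_at_zero_lag(ramped, damped)
```

A value near 1 means the two time courses are nearly identical, so the cell cannot separate the stimuli by its response pattern; values near or below zero indicate clearly different time courses.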

Time resolution for distinguishing between ramped and damped stimuli

By shifting the stimulus duration from 80 to 2.5 ms (return to Fig. 4A–C), we also investigated the time resolution of each cell type for distinguishing between ramped and damped sounds. In Fig. 4A–C, all types of cell may distinguish between ramped and damped stimuli of 80 ms in duration based on the differential time courses (i.e. the edge cells have two response peaks at the onset and offset periods of ramped stimuli; in contrast, they have only a single peak at the onset period of damped stimuli), while they may lose this ability when the stimulus is shortened to 2.5 ms because of the similarity of response time courses during ramped and damped stimuli (i.e. the edge cells have only a single response peak for both ramped and damped stimuli). These qualitative observations were quantitatively analyzed: the cross correlation between each cell’s responses (smoothed PSTHs during stimulus duration plus 80 ms from the stimulus onset) to ramped and damped stimuli was calculated and the average (at tau equal to zero) was plotted against the stimulus duration (Fig. 6). The correlation coefficients of the edge cells and the edge-slope cells were relatively high (range 0.67–0.80) at short stimulus durations (2.5–10 ms) and decreased to near zero (0.05–0.07) with increasing stimulus duration (20–80 ms), indicating that the ability to discriminate the temporal edge is relatively low for short stimuli (2.5–10 ms) and becomes high with increasing stimulus duration (20–80 ms) in edge cells and edge-slope cells. On the other hand, in the slope cells, the response itself was lost or decreased during the short stimuli (2.5–10 ms) and the discrimination ability remained relatively low at longer stimuli (20–80 ms).

Comparison between responses to ramped and damped stimuli

To obtain physiological evidence for the asymmetrical perception reported by psychological studies, we

compared the responses of individual cells to ramped and damped stimuli. Supposing that the driven spike count underlies loudness perception, we calculated the spike count index (SCI) by $\mathrm{SCI} = (R_s - D_s)/(R_s + D_s)$, where $R_s$ and $D_s$ are the driven spike counts during ramped and damped stimuli, respectively. SCI = 0 means $R_s = D_s$, SCI = 0.33 indicates $R_s/D_s = 2$, and SCI = −0.33 means $D_s/R_s = 2$. The SCI was plotted against stimulus duration in the edge cells (Fig. 7A), where the SCI was fit by a linear model (P < 0.05, Table 2), indicating that SCI increases with increasing sound duration. At a stimulus duration of 80 ms, SCI was 0.18 ($R_s/D_s = 1.4$), indicating that the ramped stimuli evoked 1.4 times the driven spike count of the damped stimuli. In contrast, in the slope (Fig. 7B) and edge-slope cells (C), neither the logistic nor the linear model was significant (P > 0.05, Table 2), indicating that SCI is independent of the stimulus duration in the range of 2.5–80 ms. There was a tendency toward a preference for ramped stimuli: mean SCI values were 0.03 and 0.04 ($R_s/D_s = 1.1$ and 1.1) for the slope and edge-slope cells, respectively. Supposing that the response duration underlies duration perception, we calculated the response duration index (RDI) by $\mathrm{RDI} = (R_d - D_d)/(R_d + D_d)$, where $R_d$ and $D_d$ are the response durations during ramped and damped stimuli, respectively. The response duration was defined as the number of bins (bin width = 1 ms) of an individual cell’s PSTH where the driven rate was >2SD of the background firing rate. The RDI was plotted against stimulus duration in the edge (Fig. 7D), slope (E), and edge-slope cells (F), where neither the logistic nor the linear model was significant (P > 0.05, Table 2) in all three types, indicating that RDI is independent of the stimulus duration in the range of 2.5–80 ms. There was a tendency toward a preference for ramped stimuli: mean RDI values were 0.07, 0.03, and 0.03 ($R_d/D_d = 1.2$, 1.1, and 1.1) for the edge, slope, and edge-slope cells, respectively.
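The two indices share one form, (R − D)/(R + D), which can be inverted to recover the ramped-to-damped ratio as R/D = (1 + I)/(1 − I). The short Python sketch below illustrates this relation and the bin-counting definition of response duration; the function names are hypothetical, and the minus signs follow from the worked values in the text (an index of 0.33 corresponding to a ratio of 2).

```python
def contrast_index(ramped, damped):
    """(R - D) / (R + D); positive values favour the ramped stimulus."""
    return (ramped - damped) / (ramped + damped)

def index_to_ratio(index):
    """Invert the index to the ratio R / D = (1 + I) / (1 - I)."""
    return (1.0 + index) / (1.0 - index)

def response_duration_bins(driven_rate, background_sd, k=2.0):
    """Number of 1-ms bins whose driven rate exceeds k SD of the background."""
    return sum(1 for r in driven_rate if r > k * background_sd)

# Reported mean indices converted to ratios.
print(round(index_to_ratio(0.18), 2))  # SCI of edge cells at 80 ms
print(round(index_to_ratio(0.07), 2))  # mean RDI of edge cells
```

The same `contrast_index` function serves for both SCI (with driven spike counts as inputs) and RDI (with response durations, for example from `response_duration_bins`, as inputs).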
We then compared overall responses to ramped and damped stimuli (return to Fig. 4D, sum of mean PSTH of all types of cell). Overall response duration was defined as the number of bins (bin width = 1 ms) of overall PSTH where the response height was >mean + 2SD of the background firing rate. The overall response duration was plotted against the stimulus duration in ramped and damped stimuli (Fig. 8A). The response duration in the ramped stimuli was always longer than that in the damped stimuli, and the response duration increased with increasing stimulus duration in both ramped and damped stimuli. The response duration ratio (response duration for ramped stimuli divided by response duration for damped stimuli) is shown in Fig. 8B. There was a tendency for a preference to ramped stimuli: it was around 1.1 at relatively short (2.5–20 ms) stimulus duration, while it increased to 1.4 at a stimulus duration of 80 ms. The driven spike count was calculated in the overall response PSTH and was plotted against the stimulus duration in ramped and damped stimuli (Fig. 8C). The spike count in the ramped stimuli was always larger than that in the damped stimuli, and the spike count increased with increasing stimulus duration in both ramped and damped stimuli. The driven spike ratio

[Fig. 7 plot: panels A–C show the spike-count index and panels D–F the response-duration index (y-axis, −0.4 to 0.4) for the edge, slope, and edge-slope cells, plotted against sound duration (ms, 0–80). Regression line in A: y = 0.0017x + 0.038.]

Fig. 7. Comparison between responses to ramped and damped stimuli in individual cells. Mean of spike-count index (open circle) and SD (vertical bar) is plotted against sound duration for the edge (A), slope (B), and edge-slope (C) cells. Linear regression line and its equation are shown in A. Mean of response-duration index (open circle) and SD (vertical bar) is plotted against sound duration for the edge (D), slope (E), and edge-slope (F) cells. Horizontal dotted lines show the level of index = 0.

(driven spike count for ramped stimuli divided by driven spike count for damped stimuli) is shown in Fig. 8D: it increased with increasing stimulus duration from 10 to 80 ms and reached 1.2 at a stimulus duration of 80 ms. We noted that the overall response duration increased with increasing stimulus duration and was longer than the physical sound duration (see Figs. 4D and 8A). That is, the excitatory neuronal responses persisted even after the termination of the sound stimuli (persistence of excitation). The duration of this persistence of excitation was measured as the number of bins (bin width = 1 ms) of the overall PSTH after the end of the sound stimulus where the driven rate was >2SD of the background firing rate, and it was plotted against the stimulus duration (Fig. 9). For the ramped stimuli, the duration of persistence of excitation was very stable (range 47–53 ms) regardless of the sound duration (2.5–80 ms), suggesting that the sound offset triggered a persistence of excitation of stable duration. In contrast, for the damped stimuli, the persistence of excitation was slightly shorter (42–43 ms) than that for ramped sounds at short stimulus durations (2.5–10 ms) and decreased proportionally to around 10 ms at the longest (80 ms) stimulus duration, suggesting that the main response was evoked at the stimulus onset and that the sound offset no longer evoked a prominent persistence of excitation.
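The bin-counting measures described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the PSTH is taken to be a list of firing rates in 1-ms bins starting at stimulus onset, "driven rate >2SD" is read as raw rate > background mean + 2SD (the form used for the overall PSTH), and `stimulus_end_bin` is a hypothetical parameter giving the first bin after sound offset, with rounding of non-integer durations such as 2.5 ms left to the caller.

```python
from typing import Sequence


def bins_above_threshold(psth: Sequence[float], background_mean: float,
                         background_sd: float, n_sd: float = 2.0) -> int:
    """Count bins whose rate exceeds background_mean + n_sd * background_sd.

    With 1-ms bins, the count equals the response duration in ms.
    """
    threshold = background_mean + n_sd * background_sd
    return sum(1 for rate in psth if rate > threshold)


def persistence_of_excitation(psth: Sequence[float], stimulus_end_bin: int,
                              background_mean: float, background_sd: float,
                              n_sd: float = 2.0) -> int:
    """Count supra-threshold bins at or after stimulus_end_bin (after sound offset)."""
    return bins_above_threshold(psth[stimulus_end_bin:], background_mean,
                                background_sd, n_sd)


if __name__ == "__main__":
    # Toy PSTH (arbitrary units): an onset response, a gap, then an offset response.
    toy_psth = [1, 9, 9, 9, 1, 8, 8, 1, 1]
    print(bins_above_threshold(toy_psth, background_mean=1.0, background_sd=1.0))
    print(persistence_of_excitation(toy_psth, stimulus_end_bin=4,
                                    background_mean=1.0, background_sd=1.0))
```

Counting bins rather than measuring the interval from first to last supra-threshold bin means that gaps within the response are excluded from the duration, which matches the definition given in the text.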

DISCUSSION

Heterogeneity of the response type

We explored the response properties of A1 neurons during the ramped and damped sound stimuli (Fig. 1, top). We found two contrasting types of cell: edge cells and slope cells (Figs. 1–4). The majority of the former are sensitive to a temporal edge, such as the quick decay of the ramped sounds and the quick attack of the damped sounds. In contrast, the latter are sensitive to the slow attack of ramped sounds and the slow decay of damped sounds. Interestingly, the edge cells have a short (<2.5 ms) integration time for evoking the maximal peak activities, while the slope cells have a long (20 ms) integration time (Fig. 5). This difference in integration time may underlie the observed neural response specificity. A short integration time is appropriate for the edge cell to signal the precise timing of the temporal edge, and a long integration time is appropriate for the slope cell to compute the slow slope of the sound envelope precisely (integrating the amplitude change per unit time). Based on the cross correlation study of the response time course, we further showed that the time resolution of the edge cell for distinguishing the time difference between the quick decay and the quick attack is 20 ms (Fig. 6, r < 0.4). By measuring fMRI activity in human subjects listening to a real-life story, Lerner et al. (2011) found a hierarchical topography of integration time windows at the timescales of words (1 s), sentences (8 s), and paragraphs (38 s) in auditory cortex areas, in which A1 was sensitive to the momentary features (<1 s) of the auditory input regardless of whether the story was played forward or backward. Lerner et al. (2011) suggested that "the timescale of processing is a functional property that may provide a general organizing principle for the human cerebral cortex". We provided physiological evidence supporting

[Fig. 8 plot: A, response duration (ms) for ramped and damped stimuli; B, response duration ratio (ramp/damp, 1.0–1.5); C, spike count for ramped and damped stimuli; D, spike count ratio (ramp/damp, 1.0–1.5). x-axis in all panels: sound duration (ms, 0–80).]

Fig. 8. Comparison between responses to ramped and damped stimuli in overall responses. Overall response durations during ramped and damped stimuli are plotted against sound duration (A), and the ratio of response duration (ramp/damp) is plotted (B). Similarly, overall spike counts during ramped and damped stimuli are plotted against sound duration (C), and the ratio of spike counts (ramp/damp) is plotted (D).

[Fig. 9 plot: persistence of excitation (ms) for ramped and damped stimuli, plotted against sound duration (ms).]

Fig. 9. Duration of persistence of excitation after the termination of ramped and damped sounds plotted against sound duration. Persistence of excitation zero means that there is no neuronal activation above 2SD of background firing rate after the end of the sound stimuli.

the interpretation that the timescale of processing (<1 s) underlies the response sensitivity of A1 neurons.

The heterogeneity of the response type during ramped and damped stimuli is not surprising for A1 cells in awake preparations. Although previous studies using pentobarbital anesthesia reported a single response type at the attack of the sound envelope (Phillips, 1988; Heil, 1997), recent studies using awake preparations reported heterogeneity of the response type during sound stimuli such as tone bursts (Chimoto et al., 2002; Qin et al., 2007), vocalizations (Qin et al., 2008b), frequency modulation (Whitefield and Evans, 1965; Qin et al., 2008a), and click trains (Lu et al., 2001a).

The present study adopted the ramped and damped stimuli used by Ries et al. (2008) and Schlauch et al. (2001), in which the instantaneous attack of damped stimuli and the instantaneous decay of ramped stimuli would result in a click at onset and offset, respectively. The spectral splatter at the onset of the instantaneous attack was shown in Fig. 1 of our previous study (Qin et al., 2003), where the spectral splatter increased with increasing stimulus amplitude and with decreasing analysis period and carrier frequency when expressed in octaves. The spectral splatter at half decay SPL (stimuli of 4 kHz in frequency and 50 dB SPL) was 0.9, 1.3, 1.9, and >5.7 octaves at analysis periods of 8, 6, 4, and 2 ms, respectively (Qin et al., 2003). It has been reported that a click train evokes two contrasting types of neuronal responses in A1: synchronized responses and non-synchronized rate responses (Lu et al., 2001a). The former synchronize to each click, and the latter show sustained responses at short interclick intervals (Lu et al., 2001a). Thus, there is a possibility that our edge cell corresponds to the synchronized response cells of Lu et al. (2001a). This possibility should be clarified in future studies.

Neural responses executing persistence of perception

This study provided physiological evidence that, at a stimulus duration of 80 ms and 50 dB SPL, the overall response duration during ramped sound was 1.4 times longer, and the overall spike count 1.2 times larger, than that during damped sound (Fig. 8). These physiological results in cats may correspond to the psychological results in humans that ramped sounds were 1.5–2.0 times longer in subjective duration and 2–3 dB louder than damped sounds (Schlauch et al., 2001; Ries et al., 2008).


To explain the asymmetrical perception, two hypotheses have been proposed: perceptual constancy (Stecker and Hafter, 2000) and persistence of perception (DiGiovanni and Schlauch, 2007). The former supposes that the listener eliminates the slow decay of damped sounds from the judgment of loudness (Stecker and Hafter, 2000), while the latter supposes that the sound’s internal representation in the nervous system continues after the source has stopped (DiGiovanni and Schlauch, 2007; Ries et al., 2008). In this study, we found slope cells sensitive to both the slow attack of ramped sounds and the slow decay of damped sounds, and edge cells sensitive to both the quick attack of damped sounds and the quick decay of ramped sounds. We found neither cells sensitive to only the attack nor cells sensitive to only the decay. Thus, the majority of A1 neurons are tuned to the velocity (slow or quick) but not to the direction (attack or decay) of the amplitude change. Because direction-specific cells distinguishing the attack from the decay are lacking in A1, the neural mechanism for perceptual constancy may not yet be executed at the level of A1. These negative results in A1 do not refute the perceptual constancy hypothesis. It is possible that the higher auditory cortex contains neurons sensitive to only the attack or only the decay of ramped/damped sounds, which would distinguish between the attack and the decay and thereby execute the perceptual constancy.

On the other hand, this study provided physiological evidence supporting the persistence of perception for subjective sound duration. That is, ramped stimuli but not damped stimuli evoked strong offset responses (Fig. 4D top). This asymmetry may originate mainly from the offset responses of the edge cells after the instantaneous decay of the ramped stimuli (Fig. 4A top). The persistence of neuronal excitation after the end of the stimulus (persistence of excitation) was directly demonstrated (Fig. 9), suggesting that the offset of ramped sound elicited a strong persistence of excitation regardless of the stimulus duration, whereas the offset of damped sound did not. This asymmetry of the persistence of excitation may play a role in the asymmetrical subjective perception of loudness and sound duration. This physiological finding is in good agreement with the psychological finding of DiGiovanni and Schlauch (2007), who reported an asymmetry of persistence of perception of ramped and damped sounds in temporal masking experiments in humans. Rectangular gated sounds, which also end at the sound’s peak SPL, are judged to be equal in duration to ramped sounds of the same physical duration (DiGiovanni and Schlauch, 2007). Presumably, a rectangular gated sound would produce this persistence as well. The data from this study support the persistence of perception explanation, and they might also contribute to the debate about the mechanisms underlying forward masking.

Relationship of this study to previous physiological studies

There have been only a few physiological studies on the neural responses to stimuli with temporal asymmetry. In


the ventral cochlear nucleus of the guinea pig, primary-like units, assumed to reflect the activity of the peripheral auditory nerve fibers, were shown to produce temporal firing patterns similar to the stimulus temporal envelopes: the firing rate quickly increased to the maximum firing level followed by a gradual decrease to the background firing level during damped stimuli, while it gradually increased to the maximum followed by a quick decrease during ramped tones, showing little response asymmetry comparable to the human perceptual asymmetry (Pressnitzer et al., 2000). Prominent response asymmetry was found in the onset units of the ventral cochlear nucleus, which showed larger phasic responses at the beginning of damped stimuli than at the beginning of ramped stimuli, while chopper units showed larger total spike counts during ramped stimuli than during damped stimuli (Pressnitzer et al., 2000). Previously, only one physiological study (Lu et al., 2001b) focused on A1 neuron responses to ramped and damped sounds; it showed that, in monkey, a greater population of neurons fired at higher rates to concatenated ramped sinusoids than to damped sinusoids. Lu et al. (2001b) suggested that this contributes to the perceived loudness asymmetry. Although their stimulus condition differs from ours (concatenated versus single ramped/damped sounds), we also showed that the overall spike count for the ramped sounds is 1.2 times larger than that for the damped sounds (Fig. 8D), which may originate mainly from the edge cells (Fig. 7A). The prior study (Lu et al., 2001b) presented trains of concatenated ramped and damped sounds and as such was unable to examine the time course of the firing pattern to these sounds, including the period after the sound terminates.
This is the first physiological study of cortical neurons to use single presentations of ramped and damped sounds, which allows us to test a persistence explanation for the observed perceptual differences.

Relations between neural responses and perception

As mentioned just above, Lu et al. (2001b) and the present study suggested that louder perception of sounds depends on increased neuronal firing in A1. One could object that "higher sound intensity (so presumably inducing louder perception) evokes a decreased firing rate in non-monotonic neurons, so being in an inverse proportional relationship between response and perception, and, therefore, assignment of neural responses to asymmetrical perception on the basis of firing rates alone seems to be less convincing". This objection might hold if we had compared neuronal responses to ramped and damped stimuli of different intensities. We, however, used the same intensity for the two sounds when comparing neural firing rates. The non-monotonic neurons (and also the monotonic neurons) should have the same response heights to two sounds of the same intensity regardless of the neuronal response type. We found differences in neuronal spike counts between the ramped and damped sounds even when the physical intensity is the same


(Fig. 8C, D). It may be reasonable to suggest that this firing asymmetry contributes to the loudness asymmetry.

There is another argument about the firing asymmetry. Consider the temporal development of the intensity of the ramped and damped sound envelopes: a ramped sound starts at a lower intensity and ends at a higher one, and vice versa for a damped sound. Such an envelope asymmetry is likely to affect the firing magnitude of non-monotonic neurons, but probably not that of monotonic neurons. For non-monotonic neurons, this temporal asymmetry between the two sound types would change the response symmetry when the sound intensity is changed, that is, it would shift from a ramped-equals-damped response symmetry to a ramped-greater-than-damped asymmetry when the sound level is changed from low to high. However, this may not be the case for monotonic neurons. Therefore, for non-monotonic neurons, attributing the asymmetrical loudness perception to a firing asymmetry observed at one sound level does not establish it over a wide range of sound intensities. For this reason, the neurons’ rate-level functions should also be considered.

Similarly, one might ask us to provide evidence that perception of the absolute duration is unrelated to the asymmetrical perception of ramped and damped sounds. However, no one has reported that humans can perceive the absolute physical duration of ramped and damped sounds, which makes neuronal mechanisms measuring the absolute physical duration of these sounds less likely. We provided physiological evidence that (1) the overall neuronal response duration is longer than the physical sound duration regardless of the sound duration (Fig. 8), (2) neuronal excitation persisted after the termination of sounds (Fig. 9), and (3) the duration of the persistence of excitation is longer for ramped sounds than for damped sounds even when the physical sound duration is the same (Fig. 9), which directly corresponds to the psychological finding of persistence of perception after the termination of sounds. It is beyond the scope of this paper to provide evidence for perception of absolute sound duration; this should be explored in higher auditory cortices in future studies.

In this study, to explore the neuronal response duration, we adopted the mean + 2SD of the background firing rate as the threshold. One could object that "this is quite arbitrary and seems to be underestimated if considering the prominent response peak", and recommend that "the threshold should be set at the half peak level". We do not think that the mean + 2SD threshold is arbitrary. The mean + 2SD includes 95.5% of the firing rate modulation during the non-stimulation period, so we can statistically estimate that a firing rate above it is significantly (p < 0.05) higher than the background level, indicating that the sound stimuli surely elicited neuronal responses. We defined the neuronal response duration as the period during which neuronal responses exist. The half peak level is quite arbitrary as the threshold of neuronal responses because (1) the threshold level changes substantially with the peak response level, (2) there is no statistical evaluation for the half peak level, and (3) it is unreasonable to suppose that only the firing rate above the half peak level (but not below it) contributes to loudness perception.
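The coverage figure quoted for the mean + 2SD threshold can be checked with a short calculation. The sketch below assumes that background firing rates are approximately normally distributed, an assumption that is not stated in the text; the function names are hypothetical. Under that assumption, the interval mean ± 2SD contains roughly 95.45% of background values, and the one-sided probability of exceeding mean + 2SD by chance is roughly 2.3%, which is below 0.05.

```python
import math


def fraction_within(k_sd: float) -> float:
    """Fraction of a normal distribution within k_sd SD of the mean (two-sided)."""
    return math.erf(k_sd / math.sqrt(2.0))


def fraction_above(k_sd: float) -> float:
    """Fraction of a normal distribution above mean + k_sd SD (one-sided)."""
    return 0.5 * math.erfc(k_sd / math.sqrt(2.0))


if __name__ == "__main__":
    print(f"within mean +/- 2SD: {fraction_within(2.0):.4f}")
    print(f"above mean + 2SD:    {fraction_above(2.0):.4f}")
```

Unlike a half-peak criterion, this threshold depends only on the background activity, so it does not move when the peak response changes.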

CONCLUSION

It is known that humans hear ramped (slow attack and quick decay) sounds as louder and longer than damped (quick attack and slow decay) sounds. We found that A1 includes edge cells sensitive to the quick attack and quick decay, and slope cells sensitive to the slow attack and slow decay. The integration time for evoking full responses is short for edge cells and long for slope cells, which underlies this sensitivity. Persistence of excitation after ramped sounds is longer than that after damped sounds, a difference arising mainly from edge cells. This physiologically supports psychological reports that persistence of perception is longer after ramped sounds than after damped sounds.

Acknowledgements

This study was supported by the Strategic Research Program for Brain Sciences (SRPBS #08038015) from the Japanese Ministry of Education, Culture, Sports, Science and Technology (MECSST) to Y.S. and L.Q., a Grant-in-Aid for Scientific Research (B) (No. 20300076) from the Japan Society for the Promotion of Sciences (JSPS) to Y.S., and grants from the National Nature Science Foundation of China (Nos. 30700938, 30970979) to L.Q. We thank Prof. N. Nakamoto for advice on data processing procedures. The authors declare no competing financial interests.

REFERENCES

Chimoto S, Kitama T, Qin L, Sakayori S, Sato Y (2002) Tonal response patterns of primary auditory cortex neurons in alert cats. Brain Res 934:34–42.
Cutting J (1982) Plucks and bows are categorically perceived, sometimes. Percept Psychophys 31:462–476.
Cutting JE, Rosner BS (1974) Categories and boundaries in speech and music. Percept Psychophys 16:564–570.
DiGiovanni JJ, Schlauch RS (2007) Mechanisms responsible for differences in perceived duration for rising-intensity and falling-intensity sounds. Ecol Psychol 19:239–264.
Heil P (1997) Auditory cortical onset responses revisited. II. Response strength. J Neurophysiol 77:2642–2660.
Irino T, Patterson RD (1996) Temporal asymmetry in the auditory system. J Acoust Soc Am 99:2316–2331.
Lerner Y, Honey CJ, Silbert LJ, Hasson U (2011) Topographic mapping of a hierarchy of temporal receptive windows using a narrated story. J Neurosci 31:2906–2915.
Lu T, Liang L, Wang X (2001a) Temporal and rate representations of time-varying signals in the auditory cortex of awake primates. Nat Neurosci 4:1131–1138.
Lu T, Liang L, Wang X (2001b) Neural representations of temporally asymmetric stimuli in the auditory cortex of awake primates. J Neurophysiol 85:2364–2380.
Lutfi RA, Stoelinga CN (2010) Sensory constraints on auditory identification of the material and geometric properties of struck bars. J Acoust Soc Am 127:350–360.
Patterson RD (1994a) The sound of a sinusoid: spectral models. J Acoust Soc Am 96:1409–1418.
Patterson RD (1994b) The sound of a sinusoid: time-interval models. J Acoust Soc Am 96:1419–1428.
Phillips DP (1988) Effect of tone-pulse rise time on rate-level functions of cat auditory cortex neurons: excitatory and inhibitory processes shaping responses to tone onset. J Neurophysiol 59:1524–1539.
Pressnitzer D, Winter IM, Patterson RD (2000) The responses of single units in the ventral cochlear nucleus of the guinea pig to damped and ramped sinusoids. Hear Res 149:155–166.
Qin L, Kitama T, Chimoto S, Sakayori S, Sato Y (2003) Time course of tonal frequency-response area of primary auditory cortex neurons in alert cats. Neuroscience Res 46:145–152.
Qin L, Chimoto S, Sakai M, Wang J, Sato Y (2007) Comparison between offset and onset responses of primary auditory cortex on-off neurons in awake cats. J Neurophysiol 97:3421–3431.
Qin L, Wang J, Sato Y (2008a) Heterogeneous neuronal responses to frequency-modulated tones in the primary auditory cortex of awake cats. J Neurophysiol 100:1622–1634.
Qin L, Wang J, Sato Y (2008b) Representations of cat meows and human vowels in the primary auditory cortex of awake cats. J Neurophysiol 99:2305–2319.
Qin L, Liu Y, Wang J, Li S, Sato Y (2009) Neural and behavioral discrimination of sound duration by cats. J Neurosci 29:15650–15659.
Ries DT, Schlauch RS, DiGiovanni JJ (2008) The role of temporal-masking patterns in the determination of subjective duration and loudness for ramped and damped sounds. J Acoust Soc Am 124:3772–3783.
Rosen SM, Howel P (1981) Plucks and bows are not categorically perceived. Percept Psychophys 30:156–168.
Schlauch RS, Ries DT, DiGiovanni JJ (2001) Duration discrimination and subjective duration for ramped and damped sounds. J Acoust Soc Am 109:2880–2887.
Stecker GC, Hafter ER (2000) An effect of temporal asymmetry on loudness. J Acoust Soc Am 107:3358–3368.
Whitefield IC, Evans EF (1965) Responses of auditory cortical neurons to stimuli of changing frequency. J Neurophysiol 28:655–672.

(Accepted 22 October 2013) (Available online 28 October 2013)