Mathematical Biosciences 197 (2005) 15–34 www.elsevier.com/locate/mbs
Intensity-difference limens predicted from the click-evoked peripheral N1: The mid-level hump and its implications for intensity encoding Lance Nizami
*
Center for Hearing Research, Boys Town National Research Hospital, 555 North 30th Street, Omaha, NE 68131, United States Received 12 April 2004; accepted 14 April 2005 Available online 14 July 2005
Abstract
The intensity-difference limen (DL) for an acoustic click rises at moderate click levels, a feature called the ‘mid-level hump’. It has long been hypothesized that, because a click does not evoke sustained firing in any primary afferent, the DL must therefore originate from the initial burst of synchronized spikes in the eighth nerve. That burst causes the N1 component of the peripheral compound action potential (CAP). It should therefore be possible to predict click DLs from N1 potentials. Here, a Signal Detection model, using a series expansion, was used to derive equations in N1 for the level-dependence of the DL. The first-order equation predicts a dependence on the standard deviation of N1, and an inverse dependence on the rate-of-growth of the mean N1. The second-order equation is more complicated. Both approximations were applied to N1s from the cat. Both produced a mid-level hump; at its peak, the DLs from the second-order approximation were the smaller ones, and were of the same order of magnitude as the empirical DLs. Overall, the computations show that the rate-of-growth of the mean N1, not the standard deviation of N1, determines the hump in the empirical DL. © 2005 Elsevier Inc. All rights reserved.
Keywords: Intensity-difference limen; N1; Compound action potential; Signal Detection Theory
*
Tel.: +1 404 299 5530. E-mail address:
[email protected]
0025-5564/$ - see front matter © 2005 Elsevier Inc. All rights reserved. doi:10.1016/j.mbs.2005.04.006
L. Nizami / Mathematical Biosciences 197 (2005) 15–34
1. Introduction
The present investigation concerns the auditory intensity-difference limen, henceforth called the DL. Intriguingly, the level-dependence of the DL differs by stimulus duration. For impulses of 0.3 ms [1] or for clicks of 0.1 ms [2], presented to listeners as bursts of differing amplitude, the DL rises at moderate levels, the ‘mid-level hump’. Fig. 1 illustrates this behavior. In contrast, the DL for broadband noises of 1.5 s presented in a background of broadband noise is constant with level [4], called ‘Weber’s law’, and the DL for 500 ms pure tones presented as bursts gradually improves with level for a broad range of levels and frequencies, the ‘near-miss to Weber’s law’ (e.g., [5]).
The level-dependence of the DL for clicks must represent a different physiological response than that for longer tones or noises. As Radionova explains: “in response to a brief sound signal, even when it has large intensity, the neurons of the auditory system can produce only several spikes (no more than 4–5), i.e., individual neurons can only achieve a very coarse measurement of the intensity of a brief signal, and the auditory system must use a different mechanism to achieve a more exact measurement – the mechanism based on the number of elements that respond synchronously to the signal” ([6, p. 350]). The neuron’s response to a click is one or more voltage spikes of successively declining probability of generation. When enough neurons fire initial spikes with sufficient synchrony across neurons, the result is the N1 component of the peripheral compound action potential (CAP) (e.g., [7–9]). The actual potential that each afferent contributes to the N1 is a biphasic waveform that resembles perhaps 1.5 periods of a lightly damped sinusoid [10–12], a waveform that is equal in size within-species, regardless of the neuron’s characteristic frequency (CF) or its rate of discharge [8,13].
Fig. 1. Measured DLs in decibel units of $10\log_{10}(1 + \Delta I/I)$. Shown are the just-detectable intensity-increments for clicks for the two subjects of Raab and Taub ([3], reported as Weber fractions in [2]), and just-detectable intensity-decrements for clicks for the two subjects of Avakyan and Radionova [1].

The N1 potentials were used by Radionova [6] to interpret the mid-level hump seen by Avakyan and Radionova [1]. A brief review of the method serves as an introduction to the present computations. Radionova [6] estimated “the number of elements that respond synchronously to the signal” by assuming that the number of primary afferents that respond to a click was proportional to the magnitude of the N1 potential, a value simply called ‘N1’. (This assumption was apparently derived from Frishkopf [14], who had obtained N1s in cats.) Another necessary component of Radionova’s calculations was the variability of the responding afferent population, taken as the square of the standard deviation of the distribution of N1 that resulted from repetition of a given click. She further assumed that the latter standard deviation did not change when the click was made just-detectably higher in level. That is, she made a Signal Detection assumption, that a small change in the level of the stimulus produces a negligible change in the standard deviation of the response. Radionova’s computed DLs peaked at about 60 dB SL. There, her empirical plot of mean N1 vs. level showed a ‘knee’, a now well-known decline in slope down to a horizontal line. The knee may even become a dip [14,15]. Fig. 2 illustrates the mean N1 curve with the knee and the mean N1 curve with the dip.
Taub and Raab [16] also computed DLs from physiological responses to clicks, using Guinea pigs. Like Radionova [6] they assumed that the distributions of N1 had different means, but the same standard deviation, for neighboring click levels. The DLs, plotted vs. click level, peaked at a level about 38–48 dB above a human listener’s click-detection threshold. The standard deviations of the N1 distributions also peaked within the same range of levels, leading Taub and Raab to conclude that N1 variability accounted for the mid-level hump in human DLs for clicks. N1 variability reflects the variability of the count of synchronized click-evoked spikes. A dependence of the DL on variability is at odds with the common wisdom that the DL is approximately inversely proportional to the derivative of the sensory variable, that is, to the rate-of-growth of the synchronous spike count.
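The decibel units used for the DLs in Fig. 1 relate to the Weber fraction by a one-line conversion. The following is a minimal sketch of that conversion; the function names are illustrative only and do not come from the cited studies.

```python
import math

def weber_fraction_to_db(weber_fraction):
    """Convert a Weber fraction (delta I / I) to a DL in dB: 10*log10(1 + delta I / I)."""
    return 10.0 * math.log10(1.0 + weber_fraction)

def db_to_weber_fraction(dl_db):
    """Inverse conversion: recover delta I / I from a DL expressed in dB."""
    return 10.0 ** (dl_db / 10.0) - 1.0

# A just-detectable doubling of intensity (delta I / I = 1) is about 3 dB.
print(round(weber_fraction_to_db(1.0), 2))
```

The conversion is monotonic, so a hump in the Weber fraction appears as a hump in the decibel DL at the same click level.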
The empirical findings of Avakyan and Radionova [1] and Raab and Taub [2] are still cited in the literature as defining the DLs for the click. However, the computations of Radionova [6] and of Taub and Raab [16] can be considered incomplete. Although all employed the equal-variance assumption of Signal Detection Theory, the Signal Detection model was not used in full. For a full treatment, other physiological data are required besides that of Radionova [6] or that of Taub and Raab [16]. Those studies had, in fact, collected N1 means and standard deviations, necessary for a full Signal Detection treatment. Radionova did not publish the standard deviations. Taub
and Raab published the means and standard deviations, but produced a comprehensive plot of predicted DLs for only one animal. Frishkopf [14], fortunately, systematically plotted N1 means and standard deviations for several cats, but he did not infer DLs. That is done here for the acoustic click, by first describing the Signal Detection model and how it allows equations for the DL. Some proofs have been omitted for brevity, and are available from the author on request. Early computations appeared elsewhere [17].

Fig. 2. Curves of mean N1. The ordinate represents a linear scale. (Left) The curve with the mid-range knee. Circles show the inflection points at which the slope of the curve changes from increasing to decreasing. (Right) The mid-range dip. The edges of the dip are identified as the hill and the valley.

2. Method: predicting click DLs from the N1 potential
2.1. The Signal Detection model
2.1.1. The background assumptions
The computations start with the assumption that when a click of level $x$ is repeatedly presented, the resulting spike counts $N(x)$ across the eighth nerve have a Gaussian probability density function with mean value $\bar{N}(x)$ and standard deviation $\sigma_N(x)$. Here $x$ is always expressed in a particular decibel scale that will be explained shortly. The Gaussian assumption is justified by the generally smooth and bell-shaped [18,19] empirical count distributions for individual primary afferents from all three of the spontaneous-rate classifications recognized by Liberman [20]. If each primary afferent’s spike count is a Gaussian-distributed random variable, then the sum of those counts is also Gaussian distributed [21]. This assumes that each spike is infinitely short, so that no reduction of spike count due to mutual interference takes place.
The difference limen, $\Delta x$, is expressed as $10\log_{10}(1 + \Delta I/I)$. Incrementing $x$ by $\Delta x$ causes the Gaussian probability density function to shift so that its mean spike count is a higher value, $\bar{N}(x + \Delta x)$. Assume now that $\sigma_N(x) = \sigma_N(x + \Delta x)$. Then a fixed level of performance on a discrimination task corresponds to a fixed value of the detectability index $d'$, where
$$d' = \frac{\Delta\bar{N}(x)}{\sigma_N(x)} = \frac{\bar{N}(x + \Delta x) - \bar{N}(x)}{\sigma_N(x)}. \tag{1}$$
Fig. 3 illustrates the Signal Detection model in terms of spike count. Note well that Eq. (1) does not return a number for $\Delta x$, because the term $d'$ has to have a value assigned by the user, besides which $\bar{N}(x + \Delta x)$, $\bar{N}(x)$, and $\sigma_N(x)$ are not known a priori. As noted earlier, it is not presently possible to count the initial, synchronous voltage spikes that are evoked across the eighth nerve by a click. The eighth-nerve spikes do, however, result in a measurable epiphenomenon, the N1 component of the peripheral compound action potential (CAP).
2.1.2. Development of the equations
Each of the elementary potentials that contributes to N1 contributes equally [8,13]. Eq. (1) can thus be expressed in terms of N1 by assuming that N1 is proportional to the click-evoked count of synchronous initial spikes, $N_1(x) = kN(x)$. A value need not be specified for $k$, as will be seen. Under $N_1(x) = kN(x)$, $N_1(x)$ is also Gaussian distributed with mean value $\bar{N}_1(x) = k\bar{N}(x)$ and variance $\sigma_{N_1}^2(x) = k^2\sigma_N^2(x)$. Substituting into Eq. (1),
$$\Delta\bar{N}_1(x) = \bar{N}_1(x + \Delta x) - \bar{N}_1(x) = d'\,\sigma_{N_1}(x). \tag{2}$$
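The reason that no value is needed for k can be checked numerically: scaling the mean and the standard deviation by the same constant leaves the detectability index unchanged. The sketch below assumes made-up spike-count statistics and a made-up scale factor; none of the numbers are taken from the recordings analysed here.

```python
def d_prime(mean_at_x, mean_at_x_plus_dx, sigma):
    """Detectability index under the equal-variance assumption, as in Eq. (1)."""
    return (mean_at_x_plus_dx - mean_at_x) / sigma

# Hypothetical spike-count statistics (illustration only).
mean_count_low, mean_count_high, sigma_count = 1000.0, 1030.0, 30.0

# Hypothetical proportionality constant k relating spike count to N1 amplitude.
k = 0.002

d_from_counts = d_prime(mean_count_low, mean_count_high, sigma_count)
d_from_n1 = d_prime(k * mean_count_low, k * mean_count_high, k * sigma_count)

# Both values agree (up to rounding), because k cancels in the ratio.
print(d_from_counts, d_from_n1)
```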
Fig. 3. The Signal Detection model of discriminability of intensity change. The internal response, here spike count $N$, can differ from stimulus presentation to stimulus presentation, according to a Gaussian distribution. Incrementing the stimulus level from $x$ decibels to $x + \Delta x$ decibels causes a shift in the distribution, such that the mean spike count changes from $\bar{N}(x)$ to $\bar{N}(x + \Delta x)$, while the standard deviation $\sigma_N$ (spikes) remains the same. When spike count is not Gaussian distributed, or when the two Gaussians have different variances, it may be best to assume other distributions, such as the Poisson (e.g., [22]). The appropriate algebra is detailed elsewhere [21].
The change in $\bar{N}_1(x)$ over the level change $\Delta x$ is
$$\Delta\bar{N}_1(x) = \bar{N}_1(x + \Delta x) - \bar{N}_1(x) = \frac{d\bar{N}_1(x)}{dx}\,\Delta x + \frac{1}{2}\,\frac{d^2\bar{N}_1(x)}{dx^2}\,(\Delta x)^2 + O((\Delta x)^3) \tag{3}$$
where $O((\Delta x)^3)$ refers to an infinity of terms of decreasing magnitude. The expansion method is taken from Hellman and Hellman [23] who used it in a different context. From Eqs. (2) and (3),
$$d'\,\sigma_{N_1}(x) = \frac{d\bar{N}_1(x)}{dx}\,\Delta x + \frac{1}{2}\,\frac{d^2\bar{N}_1(x)}{dx^2}\,(\Delta x)^2 + O((\Delta x)^3). \tag{4}$$
Eq. (4) allows the DL, $\Delta x$, to be determined to any desired order of approximation. The derivation of the first-order and second-order approximations is trivial. Although the slope $d\bar{N}_1(x)/dx$ in Eq. (4) can be obtained using nearest-neighbors approximations from the published plots of mean N1 vs. $x$, $d^2\bar{N}_1(x)/dx^2$ is harder to infer. The values of $d\bar{N}_1(x)/dx$ and of $d^2\bar{N}_1(x)/dx^2$ for any given $x$ are therefore obtained first by fitting some smooth curve $\bar{N}_1(x)$ by least-squares regression to each of Frishkopf’s plots of mean N1 vs. $x$, as will be seen, then by evaluating the derivatives $d\bar{N}_1(x)/dx$ and $d^2\bar{N}_1(x)/dx^2$ for the given $x$. $\Delta x$ itself could be expressed as a smooth function if $\sigma_{N_1}(x)$, the standard deviation as a function of level, could be described by an equation, but the level-dependencies plotted by Frishkopf [14] are too choppy, and vary idiosyncratically from cat to cat and from recording location to recording location. Consequently, standard deviations were digitized point-by-point, allowing the DL to be evaluated for whatever click level the $\sigma_{N_1}$ was available. The $\sigma_{N_1}$s appear in the upcoming illustrations.
2.2. The DL: the first-order and second-order approximations
From Eq. (4), to first order,
$$\Delta x\ (\text{decibels}) \approx d'\,\frac{\sigma_{N_1}(x)}{d\bar{N}_1(x)/dx}. \tag{5}$$
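Eq. (5) can be sketched in a few lines. The example below is only an illustration: the logistic mean-N1 curve, its parameter values, and the value of the standard deviation are all hypothetical stand-ins, not the curves fitted to Frishkopf's data.

```python
import math

def first_order_dl(sigma_n1, slope, d_prime=1.0):
    """First-order DL in dB, Eq. (5): d' * sigma / slope, for a rising mean-N1 curve."""
    if slope <= 0.0:
        raise ValueError("Eq. (5) as written applies to positive slopes only")
    return d_prime * sigma_n1 / slope

def logistic(x, top=10.0, midpoint=40.0, scale=8.0):
    """Hypothetical smooth mean-N1 curve (arbitrary units) versus click level x (dB)."""
    return top / (1.0 + math.exp(-(x - midpoint) / scale))

def logistic_slope(x, top=10.0, midpoint=40.0, scale=8.0):
    """Analytic derivative of the hypothetical curve with respect to x."""
    p = logistic(x, 1.0, midpoint, scale)
    return top * p * (1.0 - p) / scale

# The DL is smallest where the curve is steepest (its inflection point at 40 dB)
# and grows where the curve flattens (here, 20 dB).
dl_mid = first_order_dl(0.5, logistic_slope(40.0))
dl_low = first_order_dl(0.5, logistic_slope(20.0))
print(dl_mid, dl_low)
```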
$d' = 1$ for all of the DLs subsequently illustrated. Eq. (5) was derived by truncation of an infinite series, and it therefore seems wise to compare the predictions of the second-order approximation to those of the first-order approximation. The second-order approximation turns out to provide some theoretical restrictions that are not obvious from Eq. (5). For plots of mean N1 that have positive slopes, the second-order approximation gives two possible solutions:
$$\Delta x\ (\text{decibels}) \approx \frac{-\dfrac{d\bar{N}_1(x)}{dx} + \sqrt{\left(\dfrac{d\bar{N}_1(x)}{dx}\right)^{2} + 2d'\,\sigma_{N_1}(x)\,\dfrac{d^2\bar{N}_1(x)}{dx^2}}}{\dfrac{d^2\bar{N}_1(x)}{dx^2}}, \tag{6a}$$
$$\Delta x\ (\text{decibels}) \approx \frac{-\dfrac{d\bar{N}_1(x)}{dx} - \sqrt{\left(\dfrac{d\bar{N}_1(x)}{dx}\right)^{2} + 2d'\,\sigma_{N_1}(x)\,\dfrac{d^2\bar{N}_1(x)}{dx^2}}}{\dfrac{d^2\bar{N}_1(x)}{dx^2}}. \tag{6b}$$
The steepest point of each of the empirical rising segments of $\bar{N}_1(x)$ (see Fig. 2) is the inflection point at which $\bar{N}_1(x)$ changes from positive to negative acceleration with rising level, i.e., where $d^2\bar{N}_1(x)/dx^2 = 0$ but $d\bar{N}_1(x)/dx \neq 0$. (This is not the knee, where, if flat, $d^2\bar{N}_1(x)/dx^2 = 0$ and $d\bar{N}_1(x)/dx = 0$.) At each inflection point, Eq. (6b) produces a discontinuity but Eq. (6a) simplifies to Eq. (5). These $\Delta x$s provide the lowest computed DLs.
Eqs. (6a) and (6b) apply to the segments of the plot of mean N1 for which the N1 is increasing in magnitude with increase in click level. Special attention is required, however, when the mean N1 shows a dip, i.e., where $d\bar{N}_1(x)/dx < 0$. Within a dip, a just-detectable-increase in click level $x$ corresponds to a just-detectable-decrement in mean N1 and, presumably, to a decrease in mean spike count. To maintain a positive $d'$, the order of the $\bar{N}_1$ terms in Eq. (2) must be reversed:
$$d' = \frac{\bar{N}_1(x) - \bar{N}_1(x + \Delta x)}{\sigma_{N_1}(x)}. \tag{7}$$
The resulting first-order approximation to $\Delta x$ yields the negative of the expression to the right of the equals sign in Eq. (5); as the slope in the denominator is negative within the dip, the resulting DLs are positive. The second-order approximation has two solutions, neither of which can be excluded. These are the same as Eqs. (6a) and (6b) except that the plus sign preceding the second term within the square root is replaced by a minus sign.¹

¹ At the top of the hill (see Fig. 2) marking the left-hand-side of the dip, or the bottom of the valley marking the right-hand-side of the dip (Fig. 2), the slope of $\bar{N}_1(x)$ approaches zero. There, the first-order approximations to the DL approach infinity, but the respective second-order approximations continue to provide finite values for $\Delta x$. There are some circumstances, however, in which the first-order approximation is the sole solution. For example, $\bar{N}_1(x)$ must have an inflection point within the dip (right panel of Fig. 2), at which $d^2\bar{N}_1(x)/dx^2 = 0$. There the second-order approximation with the negative sign in front of the square root becomes discontinuous, and the other second-order approximation reverts to the first-order approximation. For a true knee, $d\bar{N}_1(x)/dx$ and $d^2\bar{N}_1(x)/dx^2$ will both be zero. Eq. (5) yields $\Delta x \to \infty$, and neither Eq. (6a) nor Eq. (6b) yields finite positive values of $\Delta x$; the DL is either infinite or undefined. The lack of a finite DL for a flat knee is a regrettable artifact of the model, but one that can be safely ignored; the author has yet to see a truly flat knee in the mean N1.
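The second-order solutions described in Section 2.2 are the roots of a quadratic in the DL, obtained by truncating Eq. (4) after the squared term (with the sign reversal of Eq. (7) inside a dip). The following sketch solves that quadratic and keeps only the positive real roots. The function name and the choice to return an empty list when no finite positive root exists are illustrative conventions, not the author's code.

```python
import math

def second_order_dl(sigma_n1, slope, curvature, d_prime=1.0):
    """Positive real second-order DLs (dB), sorted in ascending order.

    Rising segment: 0.5*curvature*dx**2 + slope*dx - d'*sigma = 0.
    Falling segment (dip): 0.5*curvature*dx**2 + slope*dx + d'*sigma = 0.
    """
    target = d_prime * sigma_n1
    if slope == 0.0 and curvature == 0.0:
        return []  # a perfectly flat knee: no finite DL
    # Where the slope is exactly zero, the curvature says which way N1 moves next.
    direction = slope if slope != 0.0 else curvature
    const = -target if direction > 0.0 else target
    if curvature == 0.0:
        return [-const / slope]  # inflection point: reduces to the first-order DL
    disc = slope * slope - 2.0 * curvature * const
    if disc < 0.0:
        return []  # no real root; the first-order DL would have to be used instead
    root = math.sqrt(disc)
    candidates = [(-slope + root) / curvature, (-slope - root) / curvature]
    return sorted(dx for dx in candidates if dx > 0.0)

# Hypothetical values: rising segment, lower half of a sigmoid (curvature > 0).
print(second_order_dl(0.5, 0.3125, 0.01))
# Upper half of a sigmoid (curvature < 0): two positive roots can occur.
print(second_order_dl(0.5, 0.3125, -0.05))
```

For rising segments the smallest returned value is the root with the plus sign in front of the square root, i.e., the form of Eq. (6a).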
3. Results: predicted DLs for the click
In the plots of mean N1 produced by Frishkopf [14] the units of the abscissa were ‘dB below maximum’, where the maximum was set by Frishkopf. Frishkopf’s units are preserved here for the sake of continuity. The particular form of $\bar{N}_1(x)$ is unimportant; $\bar{N}_1(x)$ need only provide a satisfactory fit to the data. The details of the function and fit are relegated to a footnote.² For positive values of $\bar{N}_1(x)$, DLs were obtained from Eqs. (6a) and (6b). For the lower half of each sigmoidal segment, $d^2\bar{N}_1(x)/dx^2 > 0$ so that Eq. (6a) always yields $\Delta x > 0$ and Eq. (6b) always yields $\Delta x < 0$, so that only Eq. (6a) applies. However, Eqs. (6a) and (6b) can both potentially yield $\Delta x > 0$ for the upper half of any sigmoid, where $d^2\bar{N}_1(x)/dx^2 < 0$, as long as the term under the square-root sign is positive. That situation must be evaluated point-by-point.³ It always
² The mean N1s for Cat 341 Location 3 were fitted to
$$\bar{N}_1(x) = \bar{N}_1(x; a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9) = a_1\,\bar{N}_{1,A}(x; a_2, a_3, a_4, a_5) + (1 - a_1)\,\bar{N}_{1,B}(x; a_2, a_3, a_6, a_7) + a_8 x^{a_9},$$
where
$$\bar{N}_{1,A}(x) = \frac{a_2 - a_3}{1 + 49^{\,1 - 2\,(x + 92 - a_4)/a_5}} + a_3, \qquad \bar{N}_{1,B}(x) = \frac{a_2 - a_3}{1 + 49^{\,1 - 2\,(x + 92 - a_6)/a_7}} + a_3,$$
(after [24,25]). The number 92 appears because the units of ‘dB below maximum’ used by Frishkopf are negative numbers, which made it difficult to obtain fitted smooth functions $\bar{N}_1(x)$. Hence, the click level $x$ for each data point $\{x_i, \bar{N}_{1,i}\}$, where $\bar{N}_{1,i}$ is the $i$th mean N1 value of $m$ values in total, was converted to a positive value in the equation, by adding 92. Fitted next were the mean N1s for Cat 349 Location 1. No equation could be found that fit well to the mean N1s at 20 and 10 dB; a premature plateau always resulted. Consequently, those mean N1s were omitted from all calculations. The remaining values fitted well to
$$\bar{N}_1(x) = \bar{N}_1(x; a_1, a_2, a_3, a_4, a_5, a_6, a_7, a_8, a_9) = a_1\,\bar{N}_{1,A}(x; a_2, a_3, a_4, a_5) + (1 - a_6)\,\bar{N}_{1,B}(x; a_2, a_3, a_4, a_7) + (a_6 - a_1)\,\bar{N}_{1,C}(x; a_2, a_3, a_8, a_9),$$
where
$$\bar{N}_{1,A}(x) = \frac{a_2 - a_3}{1 + 49^{\,1 - 2\,(x + 92 - a_4)/a_5}} + a_3, \qquad \bar{N}_{1,B}(x) = \frac{a_2 - a_3}{1 + 49^{\,1 - 2\,(x + 92 - a_4)/a_7}} + a_3, \qquad \bar{N}_{1,C}(x) = \frac{a_2 - a_3}{1 + 49^{\,1 - 2\,(x + 92 - a_8)/a_9}} + a_3.$$
For both sets of mean N1s, the expression minimized in the regression routine was
$$\sum_{i=1}^{m} \left( \frac{\bar{N}_1(x_i) - \bar{N}_{1,i}}{\bar{N}_{1,i}} \right)^{2},$$
where $\bar{N}_1(x_i)$ is the value of the fitted curve for $\{x_i, \bar{N}_{1,i}\}$.

³ The term within the square-root sign in Eqs. (6a) and (6b) eventually becomes negative as a plateau is approached from either above or below, level-wise. This discontinuity occurs for 64 and 60 dB in Fig. 4, and for 55, 45, and 20 dB in Fig. 5. In these cases, the first-order approximation continues to provide the DL.
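The building blocks of the fit in footnote 2 can be sketched as follows. This is an assumption-laden illustration: the exponent of 49 is read here as 1 − 2(x + 92 − a4)/a5, which is one interpretation of the typeset formula, and the function and argument names (`upper`, `lower`, `shift`, `width`) are hypothetical labels for a2, a3, a4 and a5. The objective function is the sum of squared relative deviations stated in the footnote; no particular regression routine is implied.

```python
def sigmoid_component(x, upper, lower, shift, width, offset=92.0):
    """One sigmoidal component of the fitted mean-N1 curve (assumed form).

    x is the click level in 'dB below maximum' (a negative number); adding the
    offset of 92 makes the level positive, as described in footnote 2.
    """
    u = (x + offset - shift) / width
    return (upper - lower) / (1.0 + 49.0 ** (1.0 - 2.0 * u)) + lower

def relative_squared_error(model, levels, mean_n1s):
    """Sum over data points of ((fitted - measured) / measured)**2."""
    return sum(((model(x) - n) / n) ** 2 for x, n in zip(levels, mean_n1s))

# With this form the component rises from 2% to 98% of its range as
# (x + 92) goes from `shift` to `shift + width`.
low = sigmoid_component(-62.0, upper=10.0, lower=0.0, shift=30.0, width=20.0)
high = sigmoid_component(-42.0, upper=10.0, lower=0.0, shift=30.0, width=20.0)
print(low, high)
```

Dividing each residual by the measured value weights small and large mean N1s equally in relative terms, which is the design choice implied by the minimized expression.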
Fig. 4. Application of the Signal Detection model to measurements of mean N1 and $\sigma_{N_1}$ from cat 341 of Frishkopf [14, Fig. 30]. (Left) Mean N1 potentials. (Middle) The standard deviations associated with the mean N1 potentials. (Right) The predicted DLs, $\Delta x$.
Fig. 5. Application of the Signal Detection model to measurements of mean N1 and $\sigma_{N_1}$ from cat 349 of Frishkopf [14, Fig. 32]. (Left) Mean N1 potentials. (Middle) The standard deviations associated with the mean N1 potentials. (Right) The predicted DLs, $\Delta x$.
turns out to be Eq. (6a) whose estimates of $\Delta x$ are the most compatible with actual human performance. Figs. 4 and 5 show mean N1 and $\sigma_{N_1}$, and the DLs predicted from Eq. (6a). At moderate click levels, the first-order approximation can produce larger DLs than the second-order approximation. The second-order DLs are not incompatible with the empirical DLs at the mid-level hump. Fig. 6 compares empirical and computed DLs. Predicted DLs were adjusted to an SL scale according to the statement made by Frishkopf and Rosenblith [26] that the human detection threshold for the click used by Frishkopf [14] was ‘about 95 dB’. Note that the peak in the Avakyan and Radionova [1] DLs appears at a higher SL than that for the Raab and Taub [2] DLs, and that the peaks of the predicted DLs fall between these two extremes, and are of the correct order of magnitude. The peaks of the predicted DLs would be lower if the predictions were based on $\bar{N}_1$s whose rate-of-growth merely slowed at mid-levels rather than dropping right down to zero (a
knee) or becoming negative (a dip). This slowing is hard to perceive in plots of mean N1 for man unless $\bar{N}_1$ is plotted in logarithmic scales, where the slowing can be seen at least for local components of the N1 (e.g., [27]).⁴

Fig. 6. Comparison of measured and predicted DLs for clicks. The dotted line joins the measured DLs averaged over the two subjects of Avakyan and Radionova [1, see Fig. 1]. The dashed line joins the measured DLs averaged over the two subjects of Raab and Taub [2, see Fig. 1]. The solid lines show computed DLs, the second-order approximations from Figs. 4 and 5, respectively.
4. Discussion
4.1. The alleged frequency bias of the N1
There are many who believe that N1 only represents the activity of neurons in the basal, high frequency part of the cochlea. Such a bias, if true, would restrict the present model to only stimuli composed of high-frequency components. This important issue deserves an explanation, which is followed by a rebuttal. First, note that the N1 can be separated experimentally into contributions from contiguous portions of the basilar membrane whose afferents respond to the click, the ‘derived APs’ [29]. The existence of these differing localized contributions to the CAP is well established. It is also well established that they differ in latency, that is, the delay between stimulus arrival (at the eardrum, for instance) and the appearance of the N1 component at the recording site along the cochlea (e.g., [30–35]). Latencies of N1 components are always found to be inversely
dependent on CF, i.e., directly dependent on distance from the base of the cochlea, thanks to the traveling wave on the basilar membrane and the tonotopic layout of CF from apex to base. Lower latency means greater synchrony of the unitary waveforms that add to build the N1, which is usually recorded at the round window; hence there is an increasingly bigger contribution to N1 with center frequency of the contributing frequency band [36]. That is, based on latency, there should be a monotonic relation of derived N1 amplitudes to CF. This is the alleged high-frequency bias of the N1.

⁴ The N1 varies notoriously across placement and across species in ways still not well defined. Behavioral DLs are crucial for comparison, but the author could find no mention of click DLs in Guinea pigs, and found only a single study of the click DL in cats. There, four carefully trained animals produced DLs that ranged from 4.1 to 4.9 dB across animals at a click level equivalent to 44 dB SPL (SL unknown, [28]; Saunders, personal communication). Those DLs are not incompatible with the DLs computed here.

The data, however, do not support such a bias. The click-evoked N1 is composed of contributions from regions of the basilar membrane that represent the frequency composition of the waveform that reaches the cochlea (e.g., [37]). An audiogram for the CAP can be constructed from the thresholds to tone-pips a few milliseconds long, which evoke localized N1s (e.g., [38,39]). Palmer and Harrison [40] note that in both man and in Guinea pig, it is the lowest portion of this CAP audiogram that provides the frequency range that dominates the click-evoked N1, regardless of click level. For man, the CAP audiogram is minimal near 4 kHz [40], and in fact Gaussian-shaped tone-pips of 3 ms produce bigger N1s for 4 kHz than for 2.140 kHz [41]. The dependence of the derived AP on the CAP audiogram can be explained in terms of two sources that increase the number of afferents contributing to the CAP with rise in click level: a widening of the contributing CF range (spread of excitation), and the accessing of higher-threshold afferents within any band (e.g., [24]).
4.2. The role of N1 variance
Figs. 4 and 5 show that the empirical standard deviations do not vary as much as do the rates of growth of the mean N1, i.e., the rate of change of the sensory variable dominates the DL, contrary to Taub and Raab [16], who proposed that variability dominates.
The level-dependence of $\sigma_{N_1}$ is not congruent to the level-dependence of mean N1; indeed, the relation of $\sigma_{N_1}$ to $x$ appears to be unique to each of the recordings provided by Frishkopf [14]. Consider three sets of data that have been omitted from the present computations. A glance at Frishkopf’s illustrations reveals that for Cat 341, location 1, $\sigma_{N_1}$ appears to be roughly constant, varying back and forth between 0.5 and 0.9 ‘arbitrary units’ (the plotting units used by Frishkopf [14]) over a click-intensity range of about 90 dB. For location 2 in the same cat, $\sigma_{N_1}$ forms a lobe that starts as low as 0.5 units but that peaks at 2 units at a click level that is in the middle of the employed level range of 48 dB. For another location in another cat showing a mid-range knee in the N1 (Cat 34, location 3), $\sigma_{N_1}$ peaks at the mid-range knee, then falls away on either side, decreasing faster, however, at higher click levels than at lower click levels. Even here, $\sigma_{N_1}$ again falls well within one order of magnitude, ranging from 0.6 units to 1.4 units, a relatively narrow range. The computed level-dependences suggest that the empirical variability of $\sigma_{N_1}$ with $x$ is insufficient to explain the overall pattern of discriminability. Indeed, it may be sufficient, for computations, to assume that $\sigma_{N_1}$, expressed in decibels, is constant for the cat. The same applies to the Guinea pig, for which $\sigma_{N_1}$ increases only by about 3 dB over a click-level range of 60 dB [42]. If $\sigma_{N_1}$ is assumed constant, then according to Eq. (5) it influences the sheer size of the hump, but not its occurrence; it is primarily the slope of the mean N1 plot that determines the shape of the level-dependence of the DL. Thus, despite the correlation between standard deviation and DL found by Taub and Raab [16] in the Guinea pig, $\sigma_{N_1}$ cannot be the source of the mid-level hump in the Guinea pig, or in the cat.
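The argument that a constant standard deviation scales the hump without moving it can be illustrated numerically. The sketch below uses a made-up mean-N1 curve (two logistic rises separated by a nearly flat knee) and the first-order approximation of Eq. (5); the curve, its parameters, and the two values of the standard deviation are hypothetical and chosen only for illustration.

```python
import math

def mean_n1(x):
    """Hypothetical mean-N1 curve: two logistic rises with a near-flat knee near 40 dB."""
    first_rise = 1.0 / (1.0 + math.exp(-(x - 20.0) / 4.0))
    second_rise = 1.0 / (1.0 + math.exp(-(x - 60.0) / 4.0))
    return first_rise + second_rise

def slope(x, h=1e-3):
    """Central-difference estimate of the rate-of-growth of the mean N1."""
    return (mean_n1(x + h) - mean_n1(x - h)) / (2.0 * h)

def hump_peak(sigma, d_prime=1.0):
    """Return (level of the largest first-order DL, that DL) for a constant sigma."""
    levels = list(range(20, 61, 5))
    dls = [d_prime * sigma / slope(x) for x in levels]
    biggest = max(dls)
    return levels[dls.index(biggest)], biggest

# Doubling sigma doubles the height of the hump but leaves its position
# at the knee of the mean-N1 curve.
for sigma in (0.05, 0.10):
    print(sigma, hump_peak(sigma))
```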
4.3. The knee in N1 and the mid-level hump
There is empirical evidence that the mid-level hump does indeed involve the initial burst underlying the N1. Some of that evidence was provided by investigators already referred to in the Introduction, and the details were omitted there for the sake of brevity. The evidence will not be held back further, however, because it is germane to all models of humps in the DL for very brief stimuli, and thus provides an introduction to the succeeding section on mid-level humps. The evidence for the involvement of the initial spike burst is as follows. When a click was presented in a continuous background of white noise, and when the noise level was progressively increased to 65 dB SPL, the position of the peak of the mid-level hump progressively moved upwards in level, and the peak value of the DL fell, rather than rising as might have been expected [2]. This white-noise paradigm had already been used in recording N1 from the cat [14] and from the Guinea pig [16], revealing that N1 amplitude changed in parallel to that of the click DL, that is, as noise level rose, absolute thresholds for the appearance of N1 rose, and N1 amplitudes at suprathreshold click levels fell. Presumably, the comparatively lower amplitude of the click-evoked N1 under noise reflects the lower probability of evoking the initial synchronous spike in any given neuron. Thus, that initial high-probability spike must be involved in the mid-level hump for the click.
The pattern of behavior of the N1 evoked by a brief tone-pip in a continuous background noise is similar to that of the click. Spoor et al. [43] used a trapezoidal pulse composed of two cycles of the sinusoid for each of the ramps and six cycles for the plateau, for a total of 1.667 ms at 6 kHz (for Guinea pigs) and 2.5 ms at 4 kHz (for human subjects).
In both species, an increase in masking noise required an increase in the level of the tone-pip in order to maintain any chosen N1 amplitude that was within the normal range of amplitudes of the non-masked pip. That is, as noise level rose, N1 amplitudes at suprathreshold tone-pip levels fell, as for clicks. Thus, the initial high-probability spike that creates the N1 must be involved in the mid-level hump for very brief tone-pips as well as for clicks.
We may expect a mid-level hump for any stimulus that generates the spike activity that underlies a mean N1 curve having a mid-range knee. This conclusion follows from the present computations, which suggest that the mean N1 (not the N1 variance) is the principal determinant of the DL for clicks, and that the knee in the mean N1 curve corresponds to the peak of the hump in the DL. The origin of the knee itself has been explained by Antoli-Candela and Kiang [9], who made elegant illustrations showing how the spikes represented in post-stimulus-time histograms (PSTHs) add to produce the CAP, especially as higher-threshold primary afferents are successively activated with rise in click level. Antoli-Candela and Kiang [9] noted that at moderate click levels, spikes from click-activated neurons can add in such a way as to split the N1 in two, reducing the amplitude of the measurement called N1. N1 can level off or even decrease with increase in click level, while the number of neurons contributing elemental waveforms to the overall CAP continues to increase. (As Charlet de Sauvage et al. [42] concluded from systematic measurements of click-evoked N1 means and variances, “at each intensity level the same number of fibers is newly recruited”.) The phenomenon observed by Antoli-Candela and Kiang [9] depends on both the relative sizes of the observed histogram peaks across histograms, and on the relative latencies of those features across histograms.
Latency is intimately tied to N1, so that any discussion of N1 must address latency. When N1 levels off, latency should level off. In fact, when the curve of mean
N1 amplitude vs. click level shows a knee or dip, so does the curve of mean N1 latency vs. click level (e.g., [15], cat; [29], Guinea pig). The latency of the overall N1 depends on the latencies of the derived APs, latencies which change with level as well as with frequency ([44, Fig. 3]; [45, Fig. 3]). Conceivably the latencies of the derived APs offset each other with rising click level, causing overall N1 latency to remain constant or even fall before climbing once again.
A very brief tone-pip is click-like, in that its fast rise time evokes a brief burst of synchronized spikes across some population of neurons, in this case, those centered at the cochlear location of the neuron whose CF equals the tone’s frequency [12,31,46]. The N1 evoked by a tone-pip mimics the N1 evoked by a click, in terms of its shape and growth as a function of stimulus level, and its construction from local components across frequencies, except that a narrower frequency range is used than for a click [12,31,46]. Consequently, the mean N1 curves for tone-bursts should have the same shape as that for the click. The latter shows a knee, hence so should the mean N1 curve evoked by the tone bursts, a congruency confirmed for cat [10] and monkey [47]. As noted above, any stimulus that produces a mean N1 curve that shows a knee should produce DLs showing a mid-level hump, hence the DLs for very brief tone-pips should display a mid-level hump. The next section confirms that inference.
4.4. DL humps and the models for them
Shortening a tone generally broadens its effective spectrum, making it more click-like. Thus, some might argue, it would hardly be surprising if the mean N1 curve for a tone-burst mimicked that for the click. Indeed, the two curves would be expected to merge as the tone-burst was shortened and its effective spectrum therefore widened; for a tone burst with a broad effective spectrum to produce DLs that follow a mid-level hump would prove nothing new.
A mid-level hump could only be revelatory for the tone-burst having the narrowest possible effective energy spectrum. The spectrum can be restricted to a single significant lobe centered at the tone’s frequency by making the tone Gaussian-shaped. Also, a Gaussian envelope minimizes the product of the effective spectral width and the effective duration, two quantities that trade off against each other under any ramping scheme [48]. The word ‘effective’ is used because a true Gaussian curve, like some other smooth envelopes, reaches zero only at infinity, which makes the stimulus’ duration hard to specify. This problem has prompted the use of a common duration measure, the equivalent rectangular duration ‘D’, defined as the duration of a rectangular envelope that encloses the same area as the smooth envelope (e.g., [49]). For Gaussian-shaped smooth envelopes with standard deviation $\sigma$, $D = \sigma\sqrt{2\pi}$. Gaussian envelopes were used by Nizami and coworkers [49–52], and those envelopes were reduced to zero at ±4 standard deviations, giving durations of $8\sigma = 8D/\sqrt{2\pi} \approx 3.19D$ [49].

DLs have been reported by Nizami and colleagues as a function of level for the 2 kHz Gaussian-shaped tone-pip of D = 1.25 ms, when each tone-pip was preceded by a weak 2 kHz forward-masker [50], and for the control condition, without a forward masker [51]. In both cases, the DLs resemble those found for 0.1 ms clicks [2], i.e., the forward masker made no difference. Later, Nizami et al. [49] obtained DLs for Gaussian-shaped 2 kHz tone-pips presented in quiet for D = 1.25, 2.51, or 10.03 ms. For Ds of 1.25 ms and 2.51 ms there were mid-level humps; the DLs for 10.03 ms followed the near-miss to Weber’s law. Later yet, Nizami et al. [52] once again obtained DLs at 2 kHz in quiet, but only for D = 2.51 ms. They again found the mid-level hump.
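The relation between the equivalent rectangular duration D, the Gaussian standard deviation, and the total duration of an envelope truncated at ±4 standard deviations can be checked numerically. The following is a minimal sketch only; the function names are hypothetical and chosen for illustration, and a unit-peak Gaussian envelope is assumed.

```python
import math


def equivalent_rectangular_duration(sigma_ms):
    """Return D (ms) for a unit-peak Gaussian envelope with standard deviation sigma_ms.

    The area under exp(-t^2 / (2 sigma^2)) is sigma * sqrt(2 pi); a unit-height
    rectangle with that area has duration D = sigma * sqrt(2 pi).
    """
    return sigma_ms * math.sqrt(2.0 * math.pi)


def truncated_duration(d_ms, n_sigma=4.0):
    """Return the total duration (ms) of a Gaussian envelope cut off at +/- n_sigma.

    n_sigma = 4 follows the text; other values are an illustrative assumption.
    """
    sigma_ms = d_ms / math.sqrt(2.0 * math.pi)
    return 2.0 * n_sigma * sigma_ms


if __name__ == "__main__":
    for d_ms in (1.25, 2.51, 10.03):
        print(f"D = {d_ms} ms, total duration = {truncated_duration(d_ms):.2f} ms")
```

Under these assumptions, the tone-pips of D = 1.25 ms and D = 2.51 ms have total durations of about 4.0 ms and 8.0 ms, respectively.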
Nizami et al. [49] also obtained DLs for Gaussian-shaped packets of broadband random noise, of D = 0.63–10.03 ms. The rationale for using noise was that the DLs for those stimuli should be determined by duration alone, without the significant contribution to the energy spectrum from the increase in effective bandwidth that occurs as the 2 kHz tone-pip is progressively shortened, even when it is Gaussian-shaped. The DLs for the noise-pip of D = 0.63 ms essentially replicated the DLs for clicks [2]. A smaller hump in DL vs. SPL occurred for D = 1.25 ms, and a yet smaller hump was seen for D = 2.51 ms. For Ds of 5.02 and 10.03 ms, the plot of DL vs. SPL followed Weber’s law, that is, had a slope statistically indistinguishable from zero, as known for longer noise durations [4]. Thus duration, not bandwidth, appears to be the key to the mid-level hump. This notion can be integrated with Radionova’s statement [6] that the DL is determined by “the mechanism based on the number of elements that respond synchronously to the signal” (see Section 1), as follows. Nizami et al. [49] emphasized the importance of stimulus duration in a conceptual model of their findings for both noises and tones. They first assumed that level is encoded by spike count, regardless of stimulus duration. They then postulated that the neuron’s spike train has two functional parts. The first functional part, following Radionova [6], is the brief initial burst of spikes, which, when coincident across-neurons, evokes the N1 potential [7,8]. Human subjects do produce an N1, whose amplitude generally increases with stimulus level, in response to Gaussian-shaped tone-pips of a frequency and duration similar to those used by Nizami et al. [49,52] (2.140 kHz, 3 ms actual duration [41]). Nizami et al. [49] then postulated, again following Radionova [6], that the observed mid-level humps derive from the use of the initial coordinated spike burst as a level code. Nizami et al.
then departed from Radionova by noting that when the stimulus is longer than a click, there is a second possible cue to level, viz., those spikes that, excluding the initial burst, last as long as the stimulus itself. That sustained spike train, by hypothesis, underlay the near-miss to Weber’s law for the 2 kHz tone-pip, and underlay Weber’s law for the noise-pip. That is, the observed psychophysical behavior, whether a mid-level hump, Weber’s law, or some departure from it, is determined by the relative contributions of the initial spike burst and the sustained spike train [49,52].

Notwithstanding the notion that duration, not bandwidth, is crucial to the mid-level hump, tone frequency still plays a role. That is, the near-miss to Weber’s law is not universal for D > 10 ms. For tones of 6.5 kHz having 26 ms plateaus and 5 ms raised-cosine ramps, or 16 ms plateaus and 5 ms raised-cosine ramps [53], there is a rise in the DL at moderate levels, dubbed the ‘severe departure from Weber’s law’ [53]. The ‘severe departure’ was later confirmed for 16 ms plateaus and 5 ms raised-cosine ramps [54] and for 25 ms plateaus and 2 ms raised-cosine ramps [55]. For a ‘severe departure’, D need not even be as short as 26 ms; a mid-level rise in the DL also occurs for 500 ms tones in the 8–10 kHz range [56]. Nizami et al. chose the expression ‘mid-level hump’ to distinguish their findings from the ‘severe departure’, because the latter was found for greater durations and a much higher frequency. Notwithstanding, the two sorts of humps can be comparable in size, which at first implies that the Nizami et al. model must be modified to be frequency-dependent (see footnote 5).

Footnote 5: For the Carlyon and Moore tones of 16 ms steady-state duration and 5 ms onset- and offset-ramps [53], the peak of the hump averaged 5.21 dB at 6.5 kHz and 55 dB SPL (these values are in 10 log(1 + [ΔI/I]), the DL scale used here, converted from 10 log(ΔI/I) [53]). These numbers compare well to the peak value of roughly 5.5 dB at 50 dB SPL for the 2 kHz Gaussian-shaped tone-pip of D = 2.51 ms [49].
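The conversion between the two DL scales mentioned in footnote 5 follows directly from their definitions. The sketch below is illustrative only; the function names are hypothetical, and base-10 logarithms are assumed for both scales.

```python
import math


def weber_db_to_dl_db(weber_db):
    """Convert 10*log10(dI/I) to 10*log10(1 + dI/I), both in dB."""
    weber_fraction = 10.0 ** (weber_db / 10.0)  # recover dI/I
    return 10.0 * math.log10(1.0 + weber_fraction)


def dl_db_to_weber_db(dl_db):
    """Convert 10*log10(1 + dI/I) back to 10*log10(dI/I); requires dl_db > 0."""
    weber_fraction = 10.0 ** (dl_db / 10.0) - 1.0
    return 10.0 * math.log10(weber_fraction)


if __name__ == "__main__":
    # A Weber fraction of 1 (0 dB on the first scale) means the intensity doubles.
    print(f"{weber_db_to_dl_db(0.0):.2f} dB")
```

A Weber fraction of 1, i.e., 0 dB in 10 log(ΔI/I), corresponds to 10 log(2), or about 3.01 dB on the scale used here.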
So how does the Nizami et al. model deal with the Carlyon et al. data? Carlyon et al. used stimuli having 5 ms onset- and offset-ramps for most of their measurements. According to Goldstein and Kiang [7], the 5 ms ramps are long enough to eliminate the N1 potential and, by inference, any cues that might arise from the coordinated across-neurons burst represented by the N1. Caution is advised because Goldstein and Kiang [7] employed broadband noise, not pure tones; for pure tones, ramps of fixed duration are effectively more gradual at higher frequencies, so that the influence of the initial burst might have been even weaker than expected, which would explain why the DLs found by Carlyon and colleagues were not bigger than observed. According to the Nizami et al. model, little or no initial-burst coding means that the ‘severe departure from Weber’s law’ found by Carlyon and colleagues likely relies on the sustained spike train, rather than on the initial spike burst presently proposed to account for the 2 kHz mid-level hump [49]. But, again according to the model [49], shortening the duration must eventually affect the DL, regardless of the frequency composition of the stimulus. That is, as the 6.5 kHz tone becomes short enough to evoke the synchronous spike burst that underlies the N1, the influence of the coordinated initial burst will gradually overpower that of the sustained spike train, and the ‘severe departure’ will become even more severe. And that is what happens; the ‘severe departure’ does indeed enlarge significantly as the Gaussian-shaped 6.5 kHz tone-pip is shortened from D = 30 ms to D = 0.632 ms [57,58]. Thus, the Nizami et al. model must include the notion that the sustained spike train affects the DL in a frequency-dependent fashion. Carlyon et al. [53–55] found the ‘severe departure’ for D ≈ 16–26 ms at 6.5 kHz. Zeng et al.
[59], however, noted that the DL will rise profoundly at moderate levels for tones of D ≈ 25 ms at the much lower frequency of 1 kHz, when the tone is preceded by a stimulus (the ‘forward masker’) whose frequency spectrum contains the frequency of the tone (confirmed in [60,61]). Zeng et al. presumed an explanation in the observation that tone-evoked afferent firing is diminished by the forward masker compared to that evoked by a tone in quiet [62–64]. Post-masker, the spikes evoked by the tone gradually recover to what is seen in quiet, but recovery proceeds at a different pace for afferents from different spontaneous rate groups [64]. Zeng et al. [59] used probe-tone delays of 100 ms, and others have followed suit. Zeng et al. noted that at 100 ms post-masker, high-spontaneous-rate neurons would have recovered from forward masking, but would fire at their saturation firing rates in response to a mid-level probe tone. In contrast, low-spontaneous-rate neurons would not be recovered, their useful encoding ranges still shifted upwards in the intensity dimension. Consequently, there would be less sensitivity to changes in probe level at moderate probe levels than at other probe levels, thus raising the DL. However, the Zeng et al. model has a special problem that has important implications for the interpretation of mid-level humps. When the masker follows the probe (‘backward masking’), the DL rises profoundly [65], an observation confirmed for contralateral (as well as ipsilateral) backward-maskers [61]. A peripheral interaction such as that proposed by Zeng et al. is unlikely in backward masking. Rather, backward masking implies a central interaction, within some time window, of the internal representations of the probe-tone and of the backward-masker.
A central interaction is also implied by inflation of the DL when the forward masker appears contralaterally to the probe (e.g., [61,66,67]), as the left- and right-ear inputs do not connect anatomically below the brainstem [68]. The hump in the DL under forward- or backward-masking appears to have little in common with the hump seen by Avakyan and Radionova [1] and later by Raab and Taub [2], but there
is more to the story. In general, it is assumed that intensity discrimination depends on comparison of memory traces [55]. If the forward or backward masker interferes with the memory trace of the probe tone, then the ‘central effects’ on the DL are explained [55]. Durlach and Braida [69] described an alternative coding strategy based on context coding, which would generally be less accurate than comparison of probe-evoked memory traces, but which would work better when the memory traces were weak. Context coding is presumably better in the proximity of perceptual anchors [70], such as the top or bottom of the stimulus range, so that the interference with the probe-evoked memory trace, i.e., the interference caused by forward or backward masking, would be lessened for probe tones presented at low or at high levels, where the subjects could rely on context coding. However, the DL would be elevated at moderate levels. Thus central factors may play a role in the level-dependence of the DL. This raises the possibility that there is a central factor common to all mid-level humps. The mid-level rise in the DL in the absence of forward or backward masking has been consistently credited to peripheral factors, whereas only central factors can explain the hump seen at 1 kHz under forward or backward masking. Both types of elevation of the DL could involve summing of spikes at some central point, in order that encoding of intensity and intensity change can be based on spike counts from more than one neuron. Spike pooling seems necessary, because the discriminability offered by single primary afferent neurons is insufficient to account for psychophysical performance (e.g., [24,71]). Convergence of neuronal firing at higher loci is also implied by the existence of critical bands. A variety of putative collector neurons have been identified (reviewed in [72]).
If the mid-level hump due to shortening a tone is peripheral in origin, and if the mid-level hump due to forward masking is central in origin, implying mutual independence, then the imposition of a forward masker should cause further inflation in the size of the DL hump known for the tone-pip in quiet, a hump such as the one consistently seen for the 2 kHz Gaussian-shaped tone-pip of D = 2.51 ms [49,52]. To investigate, Nizami et al. [52] used a 200 ms 2 kHz sinusoid to forward-mask that tone-pip. The masker-level/probe-delay conditions used were 50 dB SPL/10 ms, 50 dB SPL/100 ms, and 70 dB SPL/100 ms. Analysis of variance confirmed that the DLs inflated significantly under all conditions. In sum, there are three postulated mechanisms of a mid-level rise in the stimulus-level-dependence of the DL: a peripheral effect due to shortening of any stimulus, a frequency effect that is presumed to also be peripheral in origin, and a central interference in the memory trace, in forward- or backward-masking. These mechanisms appear to be mutually independent and complementary.

4.5. The possible role of the cochlear non-linearity

For moderate levels of a pure tone, the RMS vibration amplitude in dB at the cochlear locus of the CF does not increase in proportion to the RMS stimulus amplitude in dB SPL. Rather, vibration grows at <1 dB per dB of level [73,74], a phenomenon called the cochlear non-linearity. The actual relation can be as low as 0.2 dB/dB in the basal cochlear turn of chinchilla, Guinea pig, and cat. Below moderate tone levels, the relation is 1 dB/dB. Above moderate tone levels, the relation is unclear; in some cases, the relation grows as SPL rises above 80 dB SPL, approaching 1 dB/dB at 90–100 dB SPL. That ‘linearity’ is predicted by theory [73,74]. However, vibration has been observed to grow at <1 dB/dB right up to 100 dB SPL. At low SPLs, it is unclear where the ‘compressive’ zone of <1 dB/dB actually starts, the starting point varying from roughly 30 dB
SPL to 50 dB SPL, perhaps according to species. One thing is generally agreed upon: that non-linearity is only present at or near the cochlear place of the CF. How the cochlea behaves off-CF is also agreed upon, but only for high tone frequencies, for which vibration follows 1 dB/dB for cochlear loci corresponding to <0.7CF. Cochlear non-linearity was central to a model of Heinz et al. [72], who assumed that the cochlea’s compressive zone is 30–120 dB SPL, the actual relation of vibration amplitude to stimulus amplitude, in dB/dB, assumed to be tone-frequency-dependent. Empirically, when cochlear displacement is non-linear, so is the level-dependence of the displacement phase. The possible role of non-linear phase changes in determining the DL was the focus of the Heinz et al. [72] model (see footnote 6). The Heinz et al. model is worthy of attention because it starts with cochlear vibration characteristics and ends with the near-miss and the severe departure, as follows. Heinz et al. proposed that phase cues are decoded using monaural, across-frequency coincidence detectors, each innervated by two afferents. When the two afferents both contribute spikes within some fixed time delay, the detector fires a spike. The output spike count of the detector must be affected by the average firing rates, the phases, and the degree of interspike synchrony of each input spike train. The entire population of auditory primary afferents was simulated using 120 model CFs, with four coincidence detectors for each CF. Each afferent is assumed to innervate only one detector, which simplifies the calculations by allowing the afferents to be treated as mutually independent. Besides coincidence counting, two other computations were done, an ideal-detection computation which made optimal use of all average-rate, phase, and synchrony cues in the spike train, and another computation that only used firing rates, but in an optimal fashion. According to Heinz et al.
[72], their model does not account for the decline in spike rate during exposure to the tone and is thus only appropriate for long tones. They obtained predicted DLs for 500 ms tones of 0.996 and 9.874 kHz. All three kinds of computations produced the ‘near-miss to Weber’s law’ for 0.996 kHz, and the ‘severe departure from Weber’s law’ for 9.874 kHz. The latter DLs compared well to those of Florentine et al. [56], but only for the coincidence detector, whose DLs were 1–2 orders of magnitude larger than those from the other two kinds of computations. Heinz et al. concluded that the ‘severe departure’ results from cochlear compression at high frequencies, and that the ‘near-miss’ reflects a lack of compression at low frequencies. The prediction of the ‘severe departure’ for tone DLs from consideration of cochlear motion suggests, at first, that a similar treatment for clicks might produce the mid-level hump. In the squirrel monkey, Guinea pig, and chinchilla, the features that characterize steady-state vibrations to tones at the base of the cochlea have also been seen in the vibrations evoked by clicks [73,74]. In view of that gross similarity, the Heinz et al. model’s prediction of the ‘severe departure’ for 9.874 kHz suggests that when all places along the cochlea are vibrating as at CF, as they presumably do for the click, then the compressive non-linearity believed to exist for most CFs might
Footnote 6: According to Heinz et al. [72], non-linear phase changes are seen in the vibration of the BM at high frequencies, and in inner-hair-cell responses and single-neuron responses at low frequencies, and have a wide dynamic range, such that they continue to encode changes in stimulus level even at high levels. Phase changes differ in any two afferents for adjacent CFs, yielding a relative phase difference which changes with level regardless of stimulus-tone phase. That is, with increase in level, the response phase lags up to 90° for frequencies below the CF, the response phase leads up to 90° for frequencies above the CF, and the response phase does not change at CF. The maximum phase shift occurs about halfway into the non-linear frequency range around the CF.
cause the hump in the plot of DL vs. click level. However, Heinz et al. ignored an upturn towards linearity at high SPLs that is seen for both Guinea pigs and chinchillas [74] and that is predicted by theory [75]. Theory also predicts a near-plateau in the growth of cochlear vibration between 63 and 75 dB SPL [75]. Cochlear recording continues to involve technical challenges that prevent such predictions from being ruled out (see footnote 7). It is not at all clear how a plateau or an upturn would affect the predictions of the Heinz et al. model. The present model infers the DL from the N1 (Eqs. (5)–(7)). The N1 depends upon the spikes evoked by the click or the very brief tone-pip, and so inherently accounts for the non-linear cochlear mechanical response to the click or the very brief tone-pip. The graphical model of Antoli-Candela and Kiang [9], that explains the knee in the mean N1 curve, is based on actual afferent firing and hence it, too, inherently takes any cochlear non-linearity into account. A flattening of the cochlear vibration-amplitude curve as predicted by theory [75] must reduce the rate of increase of the number of evoked spikes, hence slowing the growth of the N1 curve. But according to Antoli-Candela and Kiang [9], not all of the initial synchronous click-evoked spikes are represented in the N1 curve at and above the knee, thanks to interference of one spike train with another. Thus the slowdown in the growth of N1 should be more profound than expected, perhaps being ‘compressive’ enough to create the observed knee (e.g., [6,10,14,47]), or even the observed dip [14,15], in the mean N1 curve.
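The growth rates quoted in Section 4.5 (1 dB/dB at low levels, as little as 0.2 dB/dB in the compressive zone, and a possible return towards 1 dB/dB at high levels) can be pictured with a piecewise-linear input-output function. This is a schematic sketch only, not a model from the literature cited here: the function names are hypothetical, the knee positions of 40 and 90 dB SPL are assumed values picked from within the ranges mentioned in the text, and the output is in dB relative to an arbitrary reference.

```python
def vibration_level_db(spl_db, lower_knee=40.0, upper_knee=90.0, compressive_slope=0.2):
    """Schematic cochlear input-output function (output in dB, arbitrary reference).

    Assumed shape: 1 dB/dB below lower_knee, compressive_slope dB/dB between the
    knees, and 1 dB/dB again above upper_knee. All parameter values are assumptions.
    """
    if spl_db <= lower_knee:
        return spl_db
    if spl_db <= upper_knee:
        return lower_knee + compressive_slope * (spl_db - lower_knee)
    compressed_span = compressive_slope * (upper_knee - lower_knee)
    return lower_knee + compressed_span + (spl_db - upper_knee)


def growth_rate_db_per_db(spl_db, step=1.0, **kwargs):
    """Finite-difference growth rate (dB/dB) of the schematic function at spl_db."""
    upper = vibration_level_db(spl_db + step, **kwargs)
    lower = vibration_level_db(spl_db, **kwargs)
    return (upper - lower) / step


if __name__ == "__main__":
    for spl in (20.0, 60.0, 95.0):
        print(f"{spl:.0f} dB SPL: {growth_rate_db_per_db(spl):.2f} dB/dB")
```

In such a sketch, a shallower slope at moderate levels means a given level increment produces a smaller change in vibration, which is the qualitative link to a slower-growing N1 discussed above.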
Finally, any model that hopes to use cochlear non-linearity to infer the mid-level hump in the DL, with or without reference to N1, will have to explain two statistically significant observations: first, the duration-dependence of the size of the hump for very brief Gaussian-shaped tone-pips and noise-pips [49,57,58], and second, the existence of a hump for Gaussian-shaped 500 Hz tone-pips of durations in the range D = 5.02–8.78 ms [57,58], that is, a mid-level hump at a frequency at which cochlear vibration is assumed to be linear rather than compressive.
5. Summary and conclusions

The intensity-difference limens (DLs) for clicks show a mid-level hump not seen for longer stimuli. Radionova [6] and Taub and Raab [16] had related that hump to the N1 potential. Here, the N1 is related to the DL through Signal Detection Theory, which produces an infinite series in DL and N1. Truncation produces the DL as a first-order or second-order equation of N1, into which the empirical click-evoked N1s of Frishkopf [14] were substituted. The resulting computed DLs showed a mid-level hump; at its peak, the hump was smaller for the second-order equation,
Footnote 7: Measurement of cochlear vibration is dogged by technical problems both general and particular [73,74]. In general, opening of the cochlea causes an immediate loss of sensitivity, and it may be impossible to record without trauma. In particular, in recordings from the cochlear base (i.e., the majority of cochlear recordings), the cochlea is approached from the basilar membrane side rather than the reticular lamina side. But as noted in one review, “As the active properties of the cochlea are generally considered to originate at the level of the outer hair cells, it follows that these features of the response will be most clearly observed when measurements are performed at the reticular lamina as compared to the basilar membrane”, the latter being “structurally (and functionally) relatively far from the mechanoelectrical transduction sites at the level of the reticular lamina” [74]. Even for recordings from the reticular lamina, the amplitude of response varies by many dB depending on the radial location of the recording site [74].
and of the same order of magnitude as the actual DLs. Computationally, the rate-of-growth of the mean N1, not its variability as previously claimed [16], was the main determinant of the mid-level hump.

Acknowledgment

Dr. Walt Jesteadt provided partial support (R01 DC00136). Richard Tychansky of the University of Toronto in Mississauga (Erindale College) painstakingly digitized the data of Frishkopf [14]. My special thanks to the anonymous reviewers for forcing clarity from obfuscation.

References

[1] R.V. Avakyan, E.A. Radionova, The special features of differential intensity thresholds for a brief sound signal, Sov. Phys. Acoust. 8 (1963) 320.
[2] D.H. Raab, H.B. Taub, Click-intensity discrimination with and without a background masking noise, J. Acoust. Soc. Amer. 46 (1969) 965.
[3] H.B. Taub, Click-intensity discrimination in relation to the statistics of the N1 response, Doctoral dissertation, City University of New York, 1969.
[4] G.A. Miller, Sensitivity to changes in the intensity of white noise and its relation to masking and loudness, J. Acoust. Soc. Amer. 19 (1947) 609.
[5] W. Jesteadt, C.C. Wier, D.M. Green, Intensity discrimination as a function of frequency and sensation level, J. Acoust. Soc. Amer. 61 (1977) 169.
[6] E.A. Radionova, Measuring the intensity of a brief sound signal at the first neuron level of the auditory system, Sov. Phys. Acoust. 8 (1963) 350.
[7] M.H. Goldstein, N.Y.S. Kiang, Synchrony of neural activity in electric responses evoked by transient acoustic stimuli, J. Acoust. Soc. Amer. 30 (1958) 107.
[8] N.Y.S. Kiang, E.C. Moxon, A.R. Kahn, The relationship of gross potentials recorded from the cochlea to single unit activity in the auditory nerve, in: R.J. Ruben, C. Elberling, G. Salomon (Eds.), Electrocochleography, University Park, Baltimore, 1976, p. 95.
[9] F. Antoli-Candela, N.Y.S. Kiang, Unit activity underlying the N1 potential, in: R.F. Naunton, C. Fernandez (Eds.), Evoked Electrical Activity in the Auditory Nervous System, Academic, New York, 1978, p. 165.
[10] E. de Boer, Synthetic whole-nerve action potentials for the cat, J. Acoust. Soc. Amer. 58 (1975) 1030.
[11] C. Elberling, Simulation of cochlear action potentials recorded from the ear canal in man, in: R.J. Ruben, C. Elberling, G. Salomon (Eds.), Electrocochleography, University Park, Baltimore, 1976, p. 151.
[12] R. Charlet de Sauvage, J.-M. Aran, J.-P. Erre, Mathematical analysis of VIIIth nerve cap with a linearly-fitted experimental unit response, Hear. Res. 29 (1987) 105.
[13] V.F. Prijs, Single-unit response at the round window of the Guinea pig, Hear. Res. 21 (1986) 127.
[14] L.S. Frishkopf, A probability approach to certain neuroelectric phenomena, M.I.T. Res. Lab. Electron. Tech. Rep. 307 (1956) 1.
[15] W.T. Peake, N.Y.S. Kiang, Cochlear responses to condensation and rarefaction clicks, Biophys. J. 2 (1962) 23.
[16] H.B. Taub, D.H. Raab, Fluctuations of N1 amplitude in relation to click-intensity discrimination, J. Acoust. Soc. Amer. 46 (1969) 969.
[17] L. Nizami, The level-dependence of intensity-difference limens for very brief stimuli, inferred from the N1 potential in the cat, Abs. ARO 23 (2000) 27.
[18] M.C. Teich, S.M. Khanna, Pulse-number distribution for the neural spike train in the cat’s auditory nerve, J. Acoust. Soc. Amer. 77 (1985) 1110.
[19] E.M. Relkin, D.G. Pelli, Probe tone thresholds in the auditory nerve measured by two-interval forced-choice procedures, J. Acoust. Soc. Amer. 82 (1987) 1679.
[20] M.C. Liberman, Auditory-nerve response from cats raised in a low-noise chamber, J. Acoust. Soc. Amer. 63 (1978) 442.
[21] J.P. Egan, Signal Detection Theory and ROC Analysis, Academic, New York, 1975.
[22] C. Kaernbach, Poisson signal-detection theory: link between threshold models and the Gaussian assumption, Percept. Psychophys. 50 (1991) 498.
[23] W.S. Hellman, R.P. Hellman, Intensity discrimination as the driving force for loudness. Application to pure tones in quiet, J. Acoust. Soc. Amer. 87 (1990) 1255.
[24] L. Nizami, B.A. Schneider, Auditory dynamic range derived from the mean rate-intensity function in the cat, Math. Biosci. 141 (1997) 1.
[25] L. Nizami, Estimating auditory neuronal dynamic range using a fitted function, Hear. Res. 167 (2002) 13.
[26] L.S. Frishkopf, W.A. Rosenblith, Fluctuations in neural thresholds, in: H.P. Yockey, R.L. Platzman, H. Quastler (Eds.), Symposium on Information Theory in Biology, Pergamon, New York, 1958, p. 153.
[27] J.J. Eggermont, D.W. Odenthal, Action potentials and summating potentials in the normal human cochlea, Acta Otolaryngol. Supp. 316 (1974) 39.
[28] J.C. Saunders, Behavioral discrimination of click intensity in cat, J. Exp. Anal. Behav. 12 (1969) 951.
[29] D.C. Teas, D.H. Eldredge, H. Davis, Cochlear responses to acoustic transients: an interpretation of whole-nerve action potentials, J. Acoust. Soc. Amer. 34 (1962) 1438.
[30] D.E. Crowley, V.L. Schramm, R.E. Swain, S.N. Swanson, Analysis of age-related changes in electric responses from the inner ear of rats, Ann. Otol. 81 (1972) 739.
[31] J.J. Eggermont, Analysis of compound action potential responses to tone bursts in the human and Guinea pig cochlea, J. Acoust. Soc. Amer. 60 (1976) 1132.
[32] S. Zerlin, R.F. Naunton, Whole-nerve response to third-octave audiometric clicks at moderate sensation level, in: R.J. Ruben, C. Elberling, G. Salomon (Eds.), Electrocochleography, University Park, Baltimore, 1976, p. 199.
[33] J.-M. Aran, Y. Cazals, Electrocochleography: animal studies, in: R.F. Naunton, C. Fernandez (Eds.), Evoked Electrical Activity in the Auditory Nervous System, Academic, New York, 1978, p. 239.
[34] G.R. Price, Action potentials in the cat at low sound intensities: thresholds, latencies, and rates of change, J. Acoust. Soc. Amer. 64 (1978) 1400.
[35] E.F. Evans, C. Elberling, Location-specific components of the gross cochlear action potential, Audiology 21 (1982) 204.
[36] D.F. Dolan, D.C. Teas, J.P. Walton, Relation between discharges in auditory nerve fibers and the whole-nerve response shown by forward masking: an empirical model for the AP, J. Acoust. Soc. Amer. 73 (1983) 580.
[37] J.R. Johnstone, Origin of the Guinea pig cochlear action potential produced by a click, Hear. Res. (1981) 347.
[38] P. Dallos, D. Harris, O. Ozdamar, A. Ryan, Behavioral, compound action potential, and single unit thresholds: relationship in normal and abnormal ears, J. Acoust. Soc. Amer. 64 (1978) 151.
[39] J.R. Johnstone, V.A. Alder, B.M. Johnstone, D. Robertson, G.K. Yates, Cochlear action potential threshold and single unit thresholds, J. Acoust. Soc. Amer. 65 (1979) 254.
[40] A.R. Palmer, R.V. Harrison, Suppression by tones of the click evoked compound action potential in the normal and pathological Guinea-pig cochlea and in man, Scand. Audiol. 14 (1985) 67.
[41] M. Hoke, Influence of certain stimulus parameters on the compound action potential, as demonstrated at normal subjects and in some pathological cases, Rev. Laryngol. (Bordeaux) 95 (1974) 508.
[42] R. Charlet de Sauvage, Y. Cazals, J.-M. Aran, The variability of single click evoked CAPs in the Guinea pig as a function of stimulus intensity and polarity, Scand. Audiol. Supp. 9 (1979) 167.
[43] A. Spoor, J.J. Eggermont, D.W. Odenthal, Comparison of human and animal data concerning adaptation and masking of eighth nerve compound action potential, in: R.J. Ruben, C. Elberling, G. Salomon (Eds.), Electrocochleography, University Park, Baltimore, 1976, p. 183.
[44] R.F. Naunton, S. Zerlin, Human whole-nerve response to clicks of various frequency, Audiology 15 (1976) 1.
[45] V.F. Prijs, J.J. Eggermont, Narrow-band analysis of compound action potentials for several stimulus conditions in the Guinea pig, Hear. Res. 4 (1981) 23.
[46] O. Ozdamar, P. Dallos, Synchronous responses of the primary auditory fibers to the onset of tone burst and their relation to compound action potentials, Brain Res. 155 (1978) 169.
[47] J.E. Pugh Jr., M.R. Horwitz, D.J. Anderson, Cochlear electrical activity in noise-induced hearing loss, Arch. Otolaryngol. 100 (1974) 36.
[48] D. Gabor, Theory of communication, J. Inst. Elec. Eng. (Part III) 93 (1946) 429.
[49] L. Nizami, J.F. Reimer, W. Jesteadt, The intensity-difference limen for Gaussian-enveloped stimuli as a function of level: tones and broadband noise, J. Acoust. Soc. Amer. 110 (2001) 2505.
[50] L. Nizami, B.A. Schneider, Forward-masked intensity increment thresholds at two recovery times, J. Acoust. Soc. Amer. 96 (1994) 3280.
[51] L. Nizami, On auditory dynamic range, Doctoral dissertation, University of Toronto, 1999.
[52] L. Nizami, J.F. Reimer, W. Jesteadt, The mid-level hump at 2 kHz, J. Acoust. Soc. Amer. 112 (2002) 642.
[53] R.P. Carlyon, B.C.J. Moore, Intensity discrimination: a severe departure from Weber’s law, J. Acoust. Soc. Amer. 76 (1984) 1369.
[54] R.P. Carlyon, B.C.J. Moore, Continuous versus gated pedestals and the ‘severe departure’ from Weber’s law, J. Acoust. Soc. Amer. 79 (1986) 453.
[55] R.P. Carlyon, H.A. Beveridge, Effects of forward masking on intensity discrimination, frequency discrimination, and the detection of tones in noise, J. Acoust. Soc. Amer. 93 (1993) 2886.
[56] M. Florentine, S. Buus, C.R. Mason, Level discrimination as a function of level for tones from 0.25 to 16 kHz, J. Acoust. Soc. Amer. 81 (1987) 1528.
[57] L. Nizami, W. Jesteadt, Intensity-difference limens for very brief Gaussian-shaped tones at 500 Hz and 6.5 kHz, J. Acoust. Soc. Amer. 111 (2002) 2338.
[58] L. Nizami, W. Jesteadt, The mid-duration hump in the intensity-difference limen as a function of frequency: further evidence for the frequency-time listening window of the human ear, J. Acoust. Soc. Amer. 113 (2003) 2197.
[59] F.-G. Zeng, C.W. Turner, E.M. Relkin, Recovery from prior stimulation II: effects upon intensity discrimination, Hear. Res. 55 (1991) 223.
[60] F.-G. Zeng, C.W. Turner, Intensity discrimination in forward masking, J. Acoust. Soc. Amer. 92 (1992) 782.
[61] C.J. Plack, R.P. Carlyon, N.F. Viemeister, Intensity discrimination under forward and backward masking: role of referential coding, J. Acoust. Soc. Amer. 97 (1995) 1141.
[62] D.M. Harris, P. Dallos, Forward-masking of auditory nerve fiber responses, J. Neurophysiol. 42 (1979) 1083.
[63] R.L. Smith, Short-term adaptation in single auditory nerve fibers: some poststimulatory effects, J. Neurophysiol. 40 (1977) 1098.
[64] E.M. Relkin, J.R. Doucet, Recovery from prior stimulation. I: Relationship to spontaneous firing rates of primary auditory neurons, Hear. Res. 55 (1991) 215.
[65] C.J. Plack, N.F. Viemeister, Intensity discrimination under backward masking, J. Acoust. Soc. Amer. 92 (1992) 3097.
[66] F.-G. Zeng, R.V. Shannon, Possible origins of the non-monotonic intensity discrimination function in forward masking, Hear. Res. 82 (1995) 216.
[67] R.S. Schlauch, B.R. Clement, D.T. Ries, J.J. DiGiovanni, Masker laterality and cueing in forward-masked intensity discrimination, J. Acoust. Soc. Amer. 105 (1999) 822.
[68] D.R.F. Irvine, The auditory brainstem. Chapter 5. Superior olivary complex: anatomy and physiology, Prog. Sens. Physiol. 7 (1986) 79.
[69] N.I. Durlach, L.D. Braida, Intensity perception. I. Preliminary theory of intensity resolution, J. Acoust. Soc. Amer. 46 (1969) 372.
[70] L.D. Braida, J.S. Lim, J.E. Berliner, N.I. Durlach, W.M. Rabinowitz, S.R. Purks, Intensity perception. XIII. Perceptual anchor model of context coding, J. Acoust. Soc. Amer. 76 (1984) 722.
[71] L. Nizami, Dynamic ranges of auditory afferents: little difference between sloping-saturating and sigmoidal units, Soc. Neurosci. Abs. 24 (1998) 901.
[72] M.G. Heinz, H.S. Colburn, L.H. Carney, Rate and timing cues associated with the cochlear amplifier: level discrimination based on monaural cross-frequency coincidence detection, J. Acoust. Soc. Amer. 110 (2001) 2065.
[73] L. Robles, M.A. Ruggero, Mechanics of the mammalian cochlea, Physiol. Rev. 81 (2001) 1305.
[74] M. Ulfendahl, Mechanical responses of the mammalian cochlea, Prog. Neurobiol. 53 (1997) 331.
[75] R.S. Chadwick, Compression, gain, and nonlinear distortion in an active cochlear model with subpartitions, Proc. Natl. Acad. Sci. USA 95 (1998) 14594.