Perception of pitch by goldfish


Hearing Research 205 (2005) 7–20 www.elsevier.com/locate/heares

Perception of pitch by goldfish

Richard R. Fay *

Parmly Hearing Institute and Department of Psychology, Loyola University Chicago, 6525 N. Sheridan Road, Chicago, IL 60626, USA Received 16 November 2004; accepted 15 February 2005 Available online 11 April 2005

Abstract

Classical conditioning and stimulus generalization methods have revealed much about the sense of hearing in non-human animals, and are used here to investigate how goldfish perceive a variety of complex sounds, including multi-harmonic complexes and rippled noise (RN). In several experiments, animals were conditioned to respond to one type of complex sound, and were then tested for generalization to other sounds differing along one or more acoustic dimensions from the conditioning sounds. Overall, generalization occurred only to the extent that the conditioning and test sounds were essentially similar in spectral range and, in most cases, waveform periodicity. For example, goldfish showed inverted V-shaped generalization gradients to harmonic complexes varying in fundamental frequency after conditioning to complexes having a fundamental frequency of 100 Hz. In several cases, similar gradients were observed whether the fundamental frequency component was present or absent in conditioning and testing complexes, indicating that goldfish, like other vertebrate listeners, do not “miss the fundamental” when it is missing. This generalization pattern tended to disappear when harmonic complexes were used that had random phase relations among the components, or slight mistuning of all components. In a few cases, patterns of generalization were determined by as yet unidentified acoustic features. Goldfish did not generalize to RN or harmonic complexes after conditioning to tones, and vice versa, in spite of the three signal types having fundamental frequency components and periodicity in common. Moreover, goldfish did not generalize robustly to infinitely iterated rippled noise after conditioning to harmonic complexes with a prominent periodic envelope, and vice versa, in spite of the two signal types having similar spectra and pitches as judged by human listeners. These and other results suggest that the pitch of harmonic complexes is prominent in goldfish generalization behavior and that this pitch-like dimension arises primarily from the signal’s periodicity. The perceptions of single tones, RNs, and harmonic complexes having the same fundamental frequency are fundamentally different. It is concluded that the different perceptions of these signals arise in part from differences in periodic envelope prominence and spectral envelope, and possibly in the stochastic versus deterministic natures of their respective waveforms. © 2005 Elsevier B.V. All rights reserved.

Keywords: Fish; Harmonic complex; Periodicity pitch; Residue pitch

1. Introduction

Abbreviations: a.c., alternating current; AM, amplitude modulated; cm, centimeters; d.c., direct current; dB, decibel; EXP., experiment; EXPS., experiments; f0, fundamental frequency; Hz, hertz; IIRN, infinitely iterated rippled noise; Inc., incorporated; kHz, kilohertz; medC, median of suppression ratio for conditioned stimulus; medT, median of suppression ratio for generalization test stimuli; MFHS, missing-fundamental harmonic series; ms, milliseconds; RN, rippled noise, one iteration; sec, seconds; SR, suppression ratio; T, time delay (s) in synthesizing rippled noise

* Tel.: +1 773 508 2714; fax: +1 773 508 2719. E-mail address: [email protected].

0378-5955/$ - see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.heares.2005.02.006

Recent behavioral, neurophysiological, and neuroanatomical studies have shown that goldfish have an auditory system with structures and functions analogous to those of most other vertebrates (e.g., Fay, 2000; Lu and Fay, 1993; McCormick, 1992). My lab is now engaged in a series of experiments to help determine what, if any, aspects of auditory perception in fishes fundamentally differ from those of more recently derived vertebrate groups generally having larger nervous systems and non-homologous auditory brain nuclei and


auditory receptor organs. In the present experiments, we have investigated the pitch-like and other perceptions evoked by tones, harmonic complexes, and rippled noise (RN) using stimulus generalization methods. Stimulus generalization paradigms can reveal aspects of perception that are probably more complex than those underlying detection and discrimination (e.g., Heffner and Whitfield, 1976; Tomlinson and Schwartz, 1988). In a generalization experiment, animals are conditioned to a given stimulus, and then tested for responses to novel stimuli that may have acoustic elements or dimensions in common with the conditioning stimulus. If the animal does not generalize to the new stimuli, a strong conclusion is that the compared stimuli produce perceptions that fundamentally differ in at least one respect. If generalization among stimuli does occur, we can conclude that at least one salient dimension of the evoked perceptions is substantially equivalent. Guttman (1963) has argued that a continuous gradient of generalization over one acoustic dimension is evidence for a corresponding perceptual dimension. Our previous experiments using stimulus generalization demonstrated that goldfish conditioned to tones, tone complexes, amplitude-modulated tones, and pulse trains generalized to some test stimuli differing in frequency, spectral region, and temporal pattern but failed to generalize to others (Fay, 1970, 1972, 1992, 1994, 1995, 1998a,b, 2000; Fay et al., 1996). This conditioning does not explicitly train the animal to discriminate among specific stimuli. Therefore, we believe that the degree of generalization observed indicates the natural criteria and perceptual dimensions that arise during conditioning and that are used by organisms in evaluating the test stimuli.
So far, these experiments have indicated the existence of a perceptual dimension not unlike pure-tone or spectral pitch (Fay, 1970, 1992), and an additional dimension that is like periodicity pitch or roughness (Fay, 1972, 1994, 1995). In addition, we have found that goldfish perceive repeated amplitude-modulated tone bursts to be more like the carrier-frequency tone when modulated using slow-onset ramps and more rapid-offset damps than vice versa (Fay et al., 1996). Human listeners perceive these modulation asymmetries similarly (Patterson, 1994a,b). Finally, using stimulus generalization methods, we have demonstrated behavior consistent with the hypothesis that goldfish listeners can perceptually segregate at least two simultaneous complex sound sources based on spectral and temporal features (Fay, 1998a, 2000).

One acoustic dimension of great salience to goldfish appears to be frequency. For example, animals conditioned to a pure tone (a spectral line) show substantially monotonic decrements in generalization to novel tones as they are further removed in frequency from the conditioning tone (Fay, 1970, 1992). However, we have also found that stimuli with energy in essentially the same spectral regions are not necessarily equivalent, since generalization declines as the temporal fine structure and envelope structure of the test stimuli diverge from those of the conditioning stimulus (Fay, 1972, 1994, 1995). Thus, animals conditioned to a periodic pulse train (having an harmonic line spectrum) show decrements in generalization to novel pulse repetition rates (i.e., fundamental frequency or f0) as they deviate from the f0 of the conditioning complex (Fay, 1995).

Several new experiments reported here investigate the effects of spectral region, temporal envelope, waveform, and degree of waveform randomness in determining the perceptions of complex sounds by the goldfish. Experiments began with an investigation of the perceptions of RN in goldfish. RN is a noise waveform constructed so that its power spectrum is cosinusoidally shaped (i.e., “rippled”). These signals evoke a perception of pitch in human listeners (e.g., Yost et al., 1978; Yost, 1996) that corresponds to the frequency spacing of the peaks of the power spectrum. Furthermore, goldfish are able to discriminate between RNs having different peak spacings with an acuity that approaches that of human listeners (Fay et al., 1983). Our goal was to investigate the pitch-like perceptions and their strengths evoked by various types of RNs. In the early phases of these experiments, we obtained unexpected results that led to a series of related experiments primarily using harmonic series and other complex stimuli. RN does not evoke a salient pitch-like perception observable in these experiments. Further experiments resulted in evidence that harmonic complexes evoke a perceptual dimension that is continuous with repetition rate or f0, but that in some cases additional acoustic features also play some role in determining the perceptual responses.
Furthermore, we have found that goldfish seem to use prominent periodic envelopes in synthesizing perceptions of many complex harmonic sounds whether or not the fundamental frequency component, or other low components, are present.

2. Methods

2.1. Animals

Subjects were 168 common goldfish (Carassius auratus) about 7–9 cm in total length, maintained in aquaria for three weeks to several months. The care and use of animals was approved by the Institutional Animal Care and Use Committee of Loyola University Chicago.

2.2. Experimental design

All experiments were classical respiratory conditioning sessions followed on the next day by stimulus generalization test sessions. Eight animals were tested in each experiment, and each animal was used in only one experiment. The experiments determine whether and under


what conditions goldfish behave as if the test stimuli had perceptual features in common with the conditioning stimulus. Based on a series of previous stimulus generalization experiments, this approach has identified several auditory perceptual dimensions evoked by simple and complex sounds including those that resemble pure tone or spectral pitch (e.g., Fay, 1970, 1992), periodicity pitch (Schouten et al., 1962; Fay, 1995), roughness (Terhardt, 1970; Fay, 1998b), and timbre (e.g., Fay, 1994, 1995).

2.3. Conditioning and testing

These methods have been described in detail previously (e.g., Fay, 1992, 1995). Goldfish were gently restrained in a cloth bag about 2 cm below the water surface, in the center of the acoustic test tank. A thermistor placed near the mouth measured water flow during respiratory mouth movements. Respiration was defined as the length of the recorded waveform minus the length occurring with no respiratory activity. The respiratory response was defined as the ratio of the respiratory waveform’s length during the last 4 s of the conditioning or test stimulus to the sum of the latter and the waveform’s length 4 s preceding the stimulus (the suppression ratio, or SR). Respiratory suppression lasts several seconds as an unconditioned response following the unconditioned stimulus (a mild, 100-ms a.c. electric shock). Shock was delivered through steel screen electrodes placed at the animal’s head and tail. The shock occurred at the offset of the conditioning sound stimulus. During conditioning, stimulus levels were varied randomly within a 24 dB range from trial to trial about an average of 40 dB above the behavioral threshold (Fay, 1969) for the spectral component having the greatest power, but remained fixed at the 40 dB sensation levels during generalization tests. The acoustic test chamber was a Plexiglas cylinder 23 cm in diameter and 28 cm high.
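The suppression ratio defined above, together with the percentage-generalization normalization that this section goes on to define, reduces to simple arithmetic. The following is a minimal sketch of that arithmetic only; the function names and the choice to raise an error on degenerate inputs are illustrative assumptions, not part of the published procedure.

```python
from statistics import median


def suppression_ratio(during, before):
    """SR = during / (during + before).

    `during` is the respiratory waveform length in the last 4 s of the
    stimulus; `before` is the length in the 4 s preceding the stimulus.
    0.5 means no change in respiration; 0.0 means complete suppression.
    """
    total = during + before
    if total == 0:
        # Assumed handling: the source does not say what happens here.
        raise ValueError("no respiratory activity in either window")
    return during / total


def percent_generalization(test_srs, conditioning_srs):
    """((0.5 - medT) / (0.5 - medC)) * 100, using median SRs."""
    med_t = median(test_srs)
    med_c = median(conditioning_srs)
    if med_c == 0.5:
        # Assumed handling: no conditioned suppression to normalize by.
        raise ValueError("conditioning stimulus produced no suppression")
    return 100.0 * (0.5 - med_t) / (0.5 - med_c)
```

Under this definition a test stimulus that suppresses respiration as much as the conditioning stimulus scores 100%, one that leaves respiration unchanged scores 0%, and values outside that range arise when suppression exceeds that to the conditioning stimulus or when respiration increases.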
A swimming pool loudspeaker (University UW-30) at the bottom of the tank was buried in sand with the diaphragm upward about 2 cm below the sand surface. The chamber rested on a limestone slab inside an Industrial Acoustics Inc. single-walled audiometric booth. Goldfish were conditioned and tested in two sessions. A session included 40 conditioning trials (random intertrial intervals at an average of 3 min). Generalization test trials then occurred in another session the next day. Generalization tests consisted of 40 trials (average intertrial interval of 1.5 min), and included eight test stimuli presented four times each in random order without shock. During the generalization test, the conditioning stimulus was presented with shock every fifth trial to maintain the conditioned response. A median SR was calculated for each of the generalization stimuli presented during the test session. Generalization was normalized relative to the median SR to the conditioning stimulus measured during the generalization session, and expressed as a percentage. Percentage generalization was defined as $((0.5 - \mathrm{medT})/(0.5 - \mathrm{medC})) \times 100$, where medT is the median SR to the generalization test stimulus, and medC is the median SR to the conditioning stimulus. Percent generalization values above 100% occurred in the rare event that the respiratory suppression response to a novel test stimulus was greater than that to the conditioning stimulus. Percent generalization values below 0% (negative values) occurred in the rare cases that respiration during the test stimulus increased rather than decreased.

2.4. Stimuli

A wide variety of simple and complex stimuli were used in this series of studies, including pure tones, RN, infinitely iterated rippled noise (IIRN), harmonic complexes with and without the fundamental and other low frequency components and with constant and random component frequencies or phases, repeated frozen noise segments, and amplitude-modulated noise. The next paragraphs describe the common features of signal generation and analysis followed by descriptions of how each of these signals was defined and created. All signals were digitally synthesized as 6-s waveforms with 20 ms raised-cosine rise and fall times at a 5 kHz sample rate, and then low-pass filtered at 2 kHz before attenuation and power amplification. Signals were recorded using a B & K 8103 miniature hydrophone replacing the fish in the restrainer in the test tank. For calibration, hydrophone outputs were digitally recorded at 5 kHz and spectra computed (Matlab) on 4096 points (820 ms segments) during the steady state (in the approximate temporal center of the 6-s signal duration). Signal envelopes were defined as the absolute value of the Hilbert Transform (Matlab) of the waveform, and envelope magnitude is defined as the magnitude of the 100-Hz frequency component of the envelope’s spectrum in dB with respect to the envelope’s d.c. component magnitude.

2.5. Pure tones

In experiments 1A–C, individual sinusoidal functions were synthesized (Fig. 1A).

2.6. Rippled noise

For experiments 1A, 2A, B and 3A, B, RN was used as a conditioning or test stimulus. RN has a cosinusoidally shaped power spectrum with successive peaks separated by a constant frequency, similar to an harmonic complex. RN is generated when a wideband noise is delayed and the delayed repetition is added back to the undelayed version. The power spectrum peaks occur at integer multiples of 1/T Hz, where T is the delay in seconds.
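The delay-and-add construction just described can be sketched in a few lines. This is a minimal illustration assuming Gaussian noise, the 5 kHz sample rate given in Section 2.4, and a delay that is a whole number of samples; the function names are hypothetical, and the raised-cosine ramps and 2 kHz low-pass filter of the actual stimuli are omitted.

```python
import math
import random


def rippled_noise(duration_s, delay_s, fs=5000, seed=0):
    """Once-iterated rippled noise: y[n] = x[n] + x[n - d]."""
    rng = random.Random(seed)
    n = round(duration_s * fs)
    d = round(delay_s * fs)  # delay in samples (assumed to be an integer)
    x = [rng.gauss(0.0, 1.0) for _ in range(n + d)]
    # x[i + d] is the undelayed noise; x[i] is the same noise delayed by d.
    return [x[i + d] + x[i] for i in range(n)]


def rn_power_gain(freq_hz, delay_s):
    """Power gain of delay-and-add: |1 + exp(-j*2*pi*f*T)|^2 = 2 + 2*cos(2*pi*f*T)."""
    return 2.0 + 2.0 * math.cos(2.0 * math.pi * freq_hz * delay_s)


def lag_correlation(x, lag):
    """Normalized autocorrelation of x at the given lag (in samples)."""
    num = sum(a * b for a, b in zip(x, x[lag:]))
    den = sum(a * a for a in x)
    return num / den
```

With T = 10 ms the gain expression is maximal (4) at multiples of 100 Hz and zero midway between them, which is the cosinusoidal ripple. The waveform itself is not periodic: its normalized autocorrelation at the delay is only about 0.5, which is one way to see why RN is described here as stochastic with statistical regularities.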


For example, a delay of 10 ms produces peaks at 100 Hz (1/0.01 s) and at successive integer multiples of 100 Hz. Human listeners perceive RN to have a pitch equivalent to 1/T Hz (Yost et al., 1978), and these signals produce inter-spike interval distributions in goldfish auditory nerve fibers having probability density peaks at T sec and its integer multiples (Fay et al., 1983). Furthermore, goldfish can discriminate between RNs having different delays with an acuity that approaches that of human listeners (Fay et al., 1983). RN was synthesized using a variety of delay values between 5 and 20 ms (nominal pitches between 50 and 200 Hz). See Fig. 1B.

2.7. Infinitely iterated rippled noise

RN described above can be referred to as “once-iterated” RN. In synthesizing IIRN, the iterations can be increased in number to that approaching infinity by adding back the delayed noise through positive feedback from the output to the input of a summing amplifier (e.g., Shofner and Yost, 1995). Positive feedback has the effect of sharpening the spectral peaks so that the power spectrum is no longer cosinusoidally shaped, but has more the look of the teeth of a comb or picket fence. IIRN is sometimes referred to as “comb-filtered” noise

(Hartmann, 1998). For human listeners, IIRN has a pitch strength greater than RN (Yost, 1996). Both RN and IIRN have statistical waveform regularities and periodic but weak envelopes, but their waveforms are stochastic and not periodic. IIRN signals were synthesized with delays (T) between 5 and 20 ms (Fig. 1F).

Harmonic complexes (Fig. 1C). In experiments 1B, 1C, 3, 4, 5, and 6, harmonic complexes were synthesized by adding 20 equal-amplitude sinusoids; in some cases all at 0° starting phase, and in other cases with randomly selected starting phases. The fundamental frequency (f0) of each complex is defined as the inverse of the shortest period of time over which the waveform repeats itself exactly. Harmonic complexes were synthesized with f0s between 60 and 500 Hz. In some experiments, the f0 component and several successive lowest components were absent, creating a “missing fundamental” complex.

Missing fundamental complexes (Fig. 1D). These were harmonic complexes synthesized without the fundamental frequency component. Some missing-fundamental harmonic complexes were also synthesized with starting phase angles selected at random (experiments 4E and F) (Fig. 1E). The amplitude fluctuation of all 0° starting phase harmonic complexes is highly peaked, resembling a periodic train of pulses. Manipulation of starting phase

Fig. 1. Waveforms, spectra, and envelopes of selected signals used in these experiments. In the left column is a decibel value denoting the amplitude of the envelope (the amplitude of the 100 Hz component of the spectrum of the absolute value of the Hilbert Transform) in dB with respect to the DC component. (A) 100-Hz pure tone, (B) 100-Hz pitch rippled noise, (C) harmonic complex, zero phase, (D) missing fundamental harmonic complex, zero phase, (E) 100-Hz missing fundamental complex, random phase (see text), (F) 100-Hz pitch infinitely iterated rippled noise, (G) 10 ms period frozen noise (see text), (H) 10 ms period amplitude modulated (AM) noise, (I) harmonic complex with mistuning of each component (see text).


for the successive harmonics can reduce the magnitude of the envelope. In order to reduce the envelope as much as possible, 20 complexes were synthesized with random starting phase selection and hydrophone recordings were analyzed to identify the complex having the smallest envelope magnitude (root mean square value of the absolute values of the Hilbert Transform). This signal was then used in experiments.

2.8. Frozen noise (Fig. 1G)

In experiment 5A, the conditioning stimulus was synthesized by repeating an identical, 10-ms segment of wideband noise, without silent gaps, 100 times per sec. The noise segment was modified slightly such that the beginning and ending points were made equal in amplitude so that its periodic repetition would not have a transient at each segment beginning and end. Since the same noise segment was repeated exactly throughout the 6-s stimulus, this signal is called “frozen noise.” This produced a periodic signal with reduced envelope (see Fig. 1G). This signal is quite similar to the random phase harmonic complex described above in having a reduced envelope and an harmonic line spectrum. In this case, however, the 100-Hz f0 component was present.

2.9. Amplitude modulated noise (Fig. 1H)

In experiment 5B, the conditioning signal was wideband noise amplitude modulated with a Gaussian-shaped envelope of 3.5 ms duration at half its maximum amplitude, repeated 100 times per sec. This produced a stochastic signal having an envelope with a periodicity of 100 Hz, but a random noise carrier signal. Thus, the spectrum for this signal is like that of continuous noise (see Fig. 1H).
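The contrast between the last two stimuli (a periodic waveform with a weak envelope versus a periodic envelope on a non-repeating carrier) can be made concrete with a short sketch. It assumes Gaussian noise, the 5 kHz sample rate of Section 2.4, and a pulse centered in each 10-ms period; all names are hypothetical, and the endpoint-matching adjustment of the frozen segment described above is not reproduced.

```python
import math
import random

FS = 5000  # sample rate in Hz, as stated in Section 2.4


def frozen_noise(duration_s, period_s=0.010, fs=FS, seed=0):
    """Repeat one fixed noise segment end to end (periodic waveform)."""
    rng = random.Random(seed)
    seg_len = round(period_s * fs)
    segment = [rng.gauss(0.0, 1.0) for _ in range(seg_len)]
    n = round(duration_s * fs)
    return [segment[i % seg_len] for i in range(n)]


def gaussian_pulse_envelope(n, period_s=0.010, fwhm_s=0.0035, fs=FS):
    """Periodic Gaussian envelope, one pulse centered in each period."""
    # Convert full width at half maximum to the Gaussian standard deviation.
    sigma = fwhm_s / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    period = round(period_s * fs)
    center = period // 2  # assumed pulse position within the period
    env = []
    for i in range(n):
        t = ((i % period) - center) / fs
        env.append(math.exp(-0.5 * (t / sigma) ** 2))
    return env


def am_noise(duration_s, fs=FS, seed=1):
    """Non-repeating noise carrier multiplied by the periodic envelope."""
    rng = random.Random(seed)
    n = round(duration_s * fs)
    env = gaussian_pulse_envelope(n, fs=fs)
    return [e * rng.gauss(0.0, 1.0) for e in env]
```

In this sketch the frozen noise repeats exactly every 50 samples, whereas successive 10-ms periods of the AM noise share only their envelope, not their fine structure.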


2.10. Harmonic complexes missing 0–2 of the lowest components

In experiment 6, test stimuli were harmonic complexes with 0–2 of the lowest harmonics omitted (depending on f0) so that the lowest component of all stimuli was as near as possible to 200 Hz. This manipulation ensured that spectral envelope and the frequency of the lowest component were approximately constant for all stimuli and could not serve as a salient cue in generalization tests.

2.11. Mistuned complexes (Fig. 1I)

In experiment 7, the generalization test stimuli were nominally harmonic complexes except that the exact frequency of each successive component was randomly varied (rectangular probability density) by plus and minus a maximum of 30 Hz, giving rise to a noise-like (non-periodic) waveform and envelope, but with a spectral pattern very much like that of true harmonic complexes.

3. Results with discussion

3.1. Experiments 1A–C

In this experimental series, goldfish were (A) conditioned to a 166-Hz pure tone, and then tested for generalization to RN at various delays (or nominal pitches), (B) conditioned to a 100 Hz f0 harmonic complex and then tested using pure tones, or (C) conditioned to a 100 Hz pure tone and then tested using an harmonic complex of various f0. Generalization gradients are shown in Fig. 2. Fig. 2A shows that following conditioning to a 166 Hz pure tone, there was no generalization at all to RN stimuli, one example of which had a nominal pitch (1/T) of 166 Hz. Thus, goldfish perceive a tone and a RN to have very little, if anything, in common in spite of the fact that one of the RN delays presented produces a pitch for human listeners equivalent to that of a 166 Hz pure tone. This lack of perceptual equivalence could be due to any or all of the following stimulus/response characteristics: (1) Bandwidth: The RN spectrum is wide while that of the tone is narrow; (2) Randomness: The RN signal is stochastic while that of the tone is deterministic; (3) Complex pitch: The RN may have produced no perceptual

Fig. 2. Generalization gradients for experimental series 1 (ABC). Lines connect the means (n = 8), and horizontal symbols show means plus 1 standard error. In label to the left, C refers to the conditioning stimulus, and T refers to test stimuli. The unfilled arrow marks the value of the independent variable predicted to cause the most generalization. (A) Conditioning = 166 Hz pure tone, test = RN. (B) Conditioning = 100 Hz harmonic complex, test = pure tone frequency. (C) Conditioning = 100 Hz pure tone, test = harmonic complex of various f0.


response equivalent to the pitches of tones and RN perceived by human listeners; (4) Envelope: The RN envelope fluctuated while that of the tone was invariant. In order to begin an investigation of alternatives 2, 3 and 4 above (randomness, complex pitch existence, and envelope fluctuation), experiments 1B and 1C were conducted. In experiment 1B, goldfish were conditioned to a 100 Hz f0 harmonic complex and then tested to pure tones at a range of frequencies, including f0. Previous studies showed that goldfish behave as if periodic pulse trains have a perceptual feature (possibly, pitch) that is a monotonic and continuous function of f0 (Fay, 1995 and see below). Fig. 2B shows this generalization gradient. Here, as in the previous experiment, there was little or no generalization to pure tones after conditioning to a 100 Hz f0 pulse train. In Exp. 1C, goldfish were conditioned to a 100-Hz pure tone and then tested for generalization to an harmonic series (pulse train) with a variety of f0, including 100 Hz (Fig. 2C). Note that both the conditioning stimulus and one of the test stimuli included a major spectral component at 100 Hz (f0), and that human listeners judge the pitch of an harmonic series to be equivalent to the pitch of a pure tone at f0 in this range of f0s. Thus, the lack of generalization between a pure tone and complex tones is not necessarily explained by the absence of equivalence along a pitch-like perceptual dimension, but possibly by a simple difference in spectral bandwidth. Whatever the perceptions of these two types of signal may have in common to goldfish listeners, they clearly differ in timbre to human listeners – one is a smooth pure tone, and the other is a wideband pulse train. Perhaps a timbre difference determines their very different perceptions by goldfish.
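The bandwidth difference invoked above can be illustrated numerically: a 100-Hz pure tone and a 100-Hz f0 harmonic complex share the component at f0, but only the complex carries energy at the higher harmonics. The sketch below assumes sine-phase components with 0° starting phase, the 5 kHz sample rate of Section 2.4, and a hypothetical single-bin DFT helper; it is an illustration of the stimulus spectra, not an analysis performed in the paper.

```python
import cmath
import math

FS = 5000  # sample rate in Hz, as stated in Section 2.4


def pure_tone(freq_hz, duration_s, fs=FS):
    """Unit-amplitude sinusoid."""
    n = round(duration_s * fs)
    return [math.sin(2.0 * math.pi * freq_hz * i / fs) for i in range(n)]


def harmonic_complex(f0_hz, duration_s, n_components=20, fs=FS):
    """Sum of equal-amplitude sinusoids at f0, 2*f0, ..., all at zero phase.

    Only meaningful while the highest component stays below fs / 2.
    """
    n = round(duration_s * fs)
    harmonics = range(1, n_components + 1)
    return [
        sum(math.sin(2.0 * math.pi * h * f0_hz * i / fs) for h in harmonics)
        for i in range(n)
    ]


def component_amplitude(x, freq_hz, fs=FS):
    """Amplitude of the sinusoidal component at freq_hz (single-bin DFT).

    Exact only when x spans a whole number of cycles of freq_hz.
    """
    acc = sum(
        v * cmath.exp(-2j * math.pi * freq_hz * i / fs)
        for i, v in enumerate(x)
    )
    return 2.0 * abs(acc) / len(x)


tone = pure_tone(100.0, 0.1)
complex_100 = harmonic_complex(100.0, 0.1)
```

Evaluating `component_amplitude` at 100 Hz gives the same value for both signals, while at 200 Hz and above it is essentially zero for the tone and unchanged for the complex.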

3.2. Experiments 2

In this experimental series, the possible pitch-like perceptual responses of goldfish to RN stimuli were investigated more directly by conditioning animals to an RN signal having a nominal pitch of 163 Hz, and then testing for generalization to a series of RN signals having different pitches to human listeners, including 163 Hz. Fig. 3A shows this generalization gradient. Clearly, animals generalized robustly to all RN pitches approximately equivalently. Here again, there is no evidence that RN signals evoke a pitch-like perception in goldfish, and the behavior here may have been dominated by the essential similarities among all RN stimuli (e.g., stochastic waveforms, fluctuating envelopes, equal bandwidths). The strength of RN pitches for human listeners is enhanced by producing IRN (see Shofner and Yost, 1995 and Section 2 above). To test whether RN iteration could enhance a pitch-like perception in goldfish, experiment 2B was conducted in which animals were conditioned to an IIRN with a nominal 163 Hz pitch and then tested to other IIRN stimuli having different pitches. The generalization gradient of Fig. 3B shows essentially the same result as in experiment 2A using RN; goldfish respond to all IIRN signals as if they were equivalent. Here again, there is no evidence that IIRN produces a prominent pitch-like perception in goldfish, and the similarities among all IIRN signals, as well as RN signals, may have dominated the generalization behaviors observed.

3.3. Experiments 3

In order to eliminate some of these global similarities among all RN and IIRN signals, a new experimental series was conducted in which goldfish were conditioned to an IIRN signal and then tested for generalization to harmonic series having a variety of f0s, and vice versa. In experiments 3A–C, three groups of animals were conditioned to IIRN signals having nominal pitches of 100, 147, and 200 Hz, respectively, and then tested for generalization to harmonic complexes varying in f0. In this case, the IIRN and harmonic signals shared bandwidth and nominal pitch (for one of the test stimuli), but differed in that the IIRN stimuli had stochastic waveforms and weak envelopes while the harmonic complexes were deterministic with prominent periodic envelopes. Fig. 4A–C shows these three generalization gradients. There was little generalization among these stimuli, indicating that goldfish perceived them to be essentially dissimilar. The animals conditioned to the IRN with 100 and 200 Hz nominal pitch (Fig. 4A and C) generalized most to the harmonic series having these respective f0s, but this was not the case for experiment 3B (Fig. 4B) where animals were conditioned to the IRN with 147 Hz pitch. In general, generalization increased

Fig. 3. Generalization gradients for experimental series 2 (AB). In label to the left, C refers to the conditioning stimulus, and T refers to test stimuli. The unfilled arrow marks the value of the independent variable predicted to cause the most generalization. Lines connect means (n = 8), and horizontal symbols show means plus 1 standard error. (A) Conditioning = 166 Hz RN, test = RN. (B) Conditioning = 166 Hz IIRN, test = IIRN.


Fig. 4. Generalization gradients for experimental series 3 (A–F). Lines connect means (n = 8), and horizontal symbols show means plus 1 standard error. In label to the left, C refers to the conditioning stimulus, and T refers to test stimuli. The unfilled arrow marks the value of the independent variable predicted to cause the most generalization. (A) Conditioning = IIRN 100 Hz, test = harmonic series of various f0. (B) Conditioning = IIRN 147 Hz, test = harmonic series of various f0. (C) Conditioning = IIRN of 200 Hz, test = harmonic series of various f0. (D) Conditioning = harmonic signal of 100 Hz f0, test = IIRN of various delays. (E) Conditioning = harmonic signal of 147 Hz f0, test = IIRN of various delays. (F) Conditioning = harmonic signal of 200 Hz f0, test = IIRN of various delays.

somewhat variably with f0, indicating that in all cases, goldfish judged the harmonic complexes at the higher f0s to be more like IRN signals. These results suggest that the global features that distinguish between IRN and harmonic complexes (stochastic vs. deterministic waveforms and envelopes) dominate the perceptions of goldfish, and that possible pitch perceptions evoked by IRN or harmonic complexes are weak or non-existent by comparison. Three complementary experiments were also run in this series (experiments 3D, 3E, and 3F) in which goldfish were conditioned to harmonic complexes with f0s of 100, 147, and 200 Hz, respectively, and then tested to a variety of IRN delays. These generalization gradients are shown in Fig. 4D–F. Here again, there is relatively little generalization between these two signal types. In panels D and E, there is most generalization to the IRN signals with a nominal pitch equal to the conditioning harmonic series f0, but this is not the case for the 200 Hz IRN (panel F). The generalization gradients of this experimental series suggest that: (1) There is little perceptual similarity between IRN and harmonic complexes in spite of the fact that their spectral patterns and bandwidths are similar and some stimuli share pitches as judged by human listeners; (2) The perceptual differences between these signal types may derive from the fact that the IRN signals are essentially stochastic in waveform and envelope while the harmonic complexes are deterministic in these respects, with prominent periodic envelopes; (3) The overall weak generalization between IRN signals and harmonic complexes could also arise from large differences in the strengths of pitch-like perceptual dimensions evoked by IRN and harmonic complexes. Since the evidence so far indicates that IRN evokes little or no pitch, the next experimental series was begun to determine the extent to which harmonic complexes may or may not evoke a pitch-like perceptual dimension.

3.4. Experiment 4A

In this experiment, goldfish were conditioned to an harmonic complex with f0 = 100 Hz, and then tested for generalization to harmonic complexes with different f0s. It was expected that to the extent that this conditioning signal evoked a qualitative perceptual dimension similar to pitch, the generalization gradient would be peaked near the f0 used in conditioning. Fig. 5A shows this generalization gradient. Here, robust generalization occurs only at f0s at or near 100 Hz, with a rather steep and substantially monotonic decline in response as the test f0 deviates from the conditioning f0. These data are consistent with the hypothesis that these harmonic complexes evoke a qualitative perceptual dimension that may be analogous to pitch perception in human listeners. Therefore, one


Fig. 5. Generalization gradients for experimental series 4. Lines connect means (n = 8) and horizontal symbols show means plus 1 standard error. In label to the left, C refers to the conditioning stimulus, and T refers to test stimuli. The unfilled arrow marks the value of the independent variable predicted to cause the most generalization. (A) Conditioning = harmonic signal of 100 Hz f0, test = harmonic series of various f0. (B) Conditioning = missing fundamental harmonic complex, test = full harmonic complex of various f0. (C) Conditioning = 100 Hz f0 missing fundamental harmonic complex, test = missing fundamental harmonic complex of various f0. (D) Conditioning = full harmonic complex of 100 Hz f0, test = missing fundamental complex of various f0. (E) Conditioning = zero phase missing fundamental complex of 100 Hz f0, test = missing fundamental complex, random phase, various f0. (F) Conditioning = missing fundamental, random phase, 100 Hz f0, test = missing fundamental, zero phase, various f0.

possible explanation for the lack of generalization between IRN and harmonic complexes is that the former may not evoke a pitch-like perception as strong as the latter. 3.5. Experiments 4B–D These experiments were designed to investigate whether the “missing-fundamental” phenomenon of human hearing (Schouten, 1970) could be observed in goldWsh. These experiments are essentially like experiment 4A with the following exceptions: In experiment 4B, animals were conditioned to the 100-Hz f0 complex with the f0 component missing, but tested using full complexes with all fundamentals present (Fig. 5B). In experiment 4C, both the conditioning and test complexes missed their f0components (Fig. 5C). In experiment 4D, the conditioning complex had the f0 component present, but the test complexes all had the f0 component missing (Fig. 5D). Experiments 4C and D (Fig. 5C and D) resulted in generalization gradients that are essentially similar to than of experiment 4A with all f0s present. From these experiments, we can conclude that goldWsh do not “miss” the missing fundamental, and that the generalization behavior probably depends on the spectral or temporal envelope pattern. However, the gradient of experiment 4B is quite diVerent from the others in this series, showing generalization that monotonically increases with f0, and with no indication of robust gener-

alization at the f0 of the conditioning complex. Thus, there are some conXicting data regarding whether or not goldWsh “miss” a missing fundamental with respect to the hypothesis of a pitch-like dimension evoked by a prominent, periodic envelope (see Section 4). Experiments 4E and F were carried out to test the notion that a prominent periodic envelope evokes the putative pitch-like dimension indicated in some of these missing-fundamental experiments. In experiment 4E, the same conditioning complex of experiment 4C was repeated (missing fundamental, zero phase components), but the test stimuli consisted of missing-fundamental test complexes with random component phases chosen to reduce envelope prominence as much as possible (Fig. 5E – see Section 2 and Fig. 1). In complementary experiment 4F, the conditioning complex had random phase components, but the test stimuli were all zero-phase as in experiment 4C. Here again, we see ambiguous results. The gradient of experiment 4E (Fig. 5E) is somewhat peaked at a 100-Hz f0, as in experiment 4C (Fig. 5C), suggesting that randomizing the phases (and reducing the amplitude of the envelope) of the test stimuli had only little eVect (a somewhat broader generalization gradient) on the demonstration of a pitch-like perceptual dimension. However, the results of experiment 4F (Fig. 5F) are very much like that of 4B (Fig. 5B) indicating that generalization monotonically increases with f0 and with no indication of robust generalization to the 100Hz f0 used in conditioning (Fig. 5F – see Section 4).
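The effect of component phase on envelope prominence in experiments 4E and F can be illustrated with a minimal sketch. It assumes equal-amplitude cosine components, an 8-kHz sample rate, and harmonics 2 to 20 of a 100-Hz f0; none of these values is taken from the stimulus description in Section 2. The names `harmonic_complex` and `peak_factor` are hypothetical helpers introduced only for this illustration. The random-phase set is chosen from several draws to flatten the envelope, loosely mirroring the idea of choosing phases to reduce envelope prominence.

```python
import math
import random


def harmonic_complex(f0, harmonics, phases, fs, n_samples):
    """Sum of equal-amplitude cosine components at k*f0 with given starting phases."""
    return [
        sum(math.cos(2 * math.pi * k * f0 * n / fs + ph)
            for k, ph in zip(harmonics, phases))
        for n in range(n_samples)
    ]


def peak_factor(x):
    """Peak amplitude divided by RMS amplitude."""
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return max(abs(v) for v in x) / rms


FS = 8000                        # sample rate in Hz (assumed)
F0 = 100                         # fundamental frequency used in conditioning
HARMONICS = list(range(2, 21))   # f0 component omitted; upper limit is assumed
N = FS // F0                     # exactly one 10-ms period

# Zero-phase complex: all components peak together once per period.
zero_phase = harmonic_complex(F0, HARMONICS, [0.0] * len(HARMONICS), FS, N)

# Draw several random phase sets and keep the one with the flattest envelope.
rng = random.Random(0)
best_random = None
for _ in range(50):
    phases = [rng.uniform(0, 2 * math.pi) for _ in HARMONICS]
    candidate = harmonic_complex(F0, HARMONICS, phases, FS, N)
    if best_random is None or peak_factor(candidate) < peak_factor(best_random):
        best_random = candidate

print("zero-phase peak factor:", round(peak_factor(zero_phase), 2))
print("random-phase peak factor:", round(peak_factor(best_random), 2))
```

Both waveforms have the same line spectrum and the same 10-ms period; only the peak factor of the envelope differs, which is the variable these two experiments were meant to manipulate.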


Fig. 6. Generalization gradients for experimental series 5. Lines connect the means (n = 8). In the label to the left, C refers to the conditioning stimulus, and T refers to test stimuli. The unfilled arrow marks the value of the independent variable predicted to cause the most generalization. (A) Conditioning = frozen noise repetition at 100 Hz, test = missing fundamental harmonic complex, various f0. (B) Conditioning = AM noise with 100 Hz envelope repetition, test = missing fundamental harmonic complex of various f0.

3.6. Experiments 5

These two experiments were carried out to additionally help determine whether the perceptual behavior revealed in experiment 4A above depended on the prominent, periodic envelope of the harmonic series, or whether it is waveform repetition, per se, that evokes the apparent pitch-like dimension. In experiment 5A, the conditioning stimulus was a 10-ms segment of “frozen” noise repeated 100 times per second, and the animals were tested for generalization to the same set of harmonic complexes used in experiment 4A. If waveform repetition, per se, was the important factor, the generalization gradient in this case would show a peak at the f0 corresponding to the noise segment repetition rate. In experiment 5B, goldfish were conditioned to a continuous, wideband noise modulated in amplitude by a periodic, pulsatile envelope, and then tested for generalization to harmonic series having a variety of f0s. If the envelope per se were the important factor in evoking this perceptual dimension, the generalization gradient would also tend to peak at the f0 corresponding to the 100-Hz noise modulation rate. Fig. 6A and B show these generalization gradients. The gradient in Fig. 6A shows robust generalization overall, but with a clear peak at the 100-Hz repetition rate. This indicates that waveform repetition alone (in spite of a reduced temporal envelope prominence) may evoke some degree of the pitch-like perceptual dimension. The gradient in Fig. 6B is low overall, with no clear peak at the 100-Hz repetition rate. This indicates that envelope fluctuation, alone, is not a necessary stimulus feature in evoking the pitch-like perceptual dimension.
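The difference between the two conditioning stimuli of experiments 5A and 5B can be sketched as follows. This is only an illustration under assumed parameters: a 10-kHz sample rate, Gaussian noise, and a raised-cosine pulse envelope that stands in for the "pulsatile" envelope. The functions `frozen_noise`, `am_noise`, and `correlation_at_lag` are hypothetical names, not the stimulus-generation code used in the study.

```python
import math
import random

FS = 10000             # sample rate in Hz (assumed)
PERIOD = FS // 100     # 10-ms repetition period in samples (100 Hz)
N_PERIODS = 20         # stimulus length in periods (assumed)


def frozen_noise(rng):
    """One 10-ms Gaussian noise segment, repeated exactly."""
    segment = [rng.gauss(0.0, 1.0) for _ in range(PERIOD)]
    return segment * N_PERIODS


def am_noise(rng):
    """Continuous (non-repeating) Gaussian noise under a periodic pulse envelope."""
    out = []
    for n in range(PERIOD * N_PERIODS):
        env = (0.5 * (1.0 + math.cos(2 * math.pi * n / PERIOD))) ** 2
        out.append(env * rng.gauss(0.0, 1.0))
    return out


def correlation_at_lag(x, lag):
    """Normalized correlation between x and a copy of x delayed by `lag` samples."""
    a, b = x[:-lag], x[lag:]
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den


rng = random.Random(1)
frozen_r = correlation_at_lag(frozen_noise(rng), PERIOD)
am_r = correlation_at_lag(am_noise(rng), PERIOD)
print("frozen noise, one-period lag:", round(frozen_r, 3))
print("AM noise, one-period lag:", round(am_r, 3))
```

Frozen noise repeats its fine structure exactly, so the waveform correlates perfectly with itself one period later. AM noise shares only the envelope period; its fine structure is uncorrelated from one period to the next.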


However, note that in experiment 5B, goldfish failed to generalize robustly from a stochastic waveform (AM noise) to a deterministic one (harmonic complex). Thus, a general random or noisy timbre-like quality (e.g., a “noisiness”) may have played a role in determining the essential non-equivalence of the conditioning and test stimuli.

The generalization gradients of Fig. 5B and F are the only data so far inconsistent with the hypothesis that missing-fundamental, harmonic stimuli are essentially equivalent to full harmonic complexes in evoking a pitch-like perceptual dimension. In both experiments, generalization increased monotonically with f0 without a response peak at 100 Hz. One possible explanation for the gradient of Fig. 5B (conditioned with f0 missing, tested all with f0s present) is that generalization varied with the frequency of the lowest frequency component present in the conditioning and test stimuli that may have been perceptually resolved. This lowest component was 200 Hz (100-Hz f0 missing) in the conditioning stimulus and ranged between 60 and 200 Hz in the test stimuli (all f0s present). Thus, a gradient increasing toward a 200-Hz f0 could result if the animals responded in proportion to the proximity of the lowest frequency components of the test stimuli to that of the conditioning stimulus (200 Hz). This hypothesis was evaluated in experiment 6 as described below. However, this same hypothesis could not account as well for the monotonically increasing generalization of Fig. 5F, for which the conditioning stimulus had random phase components. Here, both conditioning and test stimuli had f0 absent, as was the case for the gradient of Fig. 5C, which shows a peak at 100 Hz. It is not clear why randomizing the phases of the conditioning stimulus transforms the generalization gradient from one peaking at 100 Hz (Fig. 5C) to one monotonically increasing as f0 approaches 200 Hz (Fig. 5F).
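The two competing accounts of the Fig. 5B gradient make different predictions, which a small worked example can make concrete. The sketch below assumes a simple linear "proximity" measure and an illustrative set of test f0s between 60 and 200 Hz; the actual test values and the form of the response function are not specified here, so both are assumptions. The function names are hypothetical.

```python
# Illustrative test f0 values in Hz (assumed; only the 60-200 Hz range is given).
TEST_F0S = [60, 80, 100, 120, 140, 160, 180, 200]


def periodicity_similarity(test_f0, cond_f0=100):
    """Higher when the test repetition rate is closer to the conditioning rate."""
    return -abs(test_f0 - cond_f0)


def lowest_component_similarity(test_f0, cond_lowest=200):
    """Higher when the lowest test component is closer to the conditioning one.

    In experiment 4B the test complexes include f0, so their lowest
    component is f0 itself; the conditioning complex starts at 200 Hz.
    """
    return -abs(test_f0 - cond_lowest)


periodicity = [periodicity_similarity(f) for f in TEST_F0S]
lowest = [lowest_component_similarity(f) for f in TEST_F0S]

# Periodicity account: inverted "V" with its peak at the conditioning f0.
peak_f0 = TEST_F0S[periodicity.index(max(periodicity))]
print("periodicity account peaks at", peak_f0, "Hz")

# Lowest-component account: similarity rises steadily toward 200 Hz.
is_monotonic = all(a < b for a, b in zip(lowest, lowest[1:]))
print("lowest-component account is monotonically increasing:", is_monotonic)
```

Under these assumptions, only the lowest-component account produces a gradient that rises toward 200 Hz with no peak at 100 Hz, which is the pattern described for Fig. 5B.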
Experiment 6A used a conditioning stimulus with a missing 100-Hz f0 (all at 0 phase) and test stimuli that had 0–2 of the lowest frequency components omitted so that the lowest frequency component of all stimuli was as near as possible to 200 Hz. This served as a test of the notion that these generalization gradients depended on the frequency of the lowest component that was possibly resolved by the goldfish auditory system. The gradient of Fig. 7A shows the pattern typical for most of the missing-fundamental experiments: an inverted “V” peaking at f0 = 100 Hz. This result demonstrates that gradients of this type are not caused by the frequency of the lowest component (see experiment 4B), but are more likely due to envelope periodicity, which is generally robust with respect to missing spectral components. Note that in this case, the spectral envelopes of all conditioning and test stimuli are essentially equivalent and could not be controlling the generalization pattern.

The importance of the periodic envelope was investigated directly in experiment 6B, which used stimuli identical to experiment 6A with the exception that each frequency component of the generalization test stimuli was slightly shifted randomly (mistuned), creating a noise-like waveform but with a line spectrum essentially similar to the missing-fundamental conditioning stimulus. These results are shown in Fig. 7B. Here, there was very little overall generalization and only a very weak tendency to peak at f0 = 100 Hz. Experiments 6A and B demonstrate that a prominent periodic envelope is important in evoking a complex pitch-like perceptual dimension.

Experiment 6C (Fig. 7C) used a missing-fundamental, zero-phase conditioning complex as in experiments 6A and B, but generalization test stimuli having all harmonics present but mistuned as in experiment 6B. It was hypothesized that if the frequency of the lowest component controlled the pattern of generalization as in experiments 4B and 4F (Fig. 5B and F), a gradient similar to these would result, especially since the harmonic components were mistuned and the periodic envelope cue greatly reduced. The resulting gradient in Fig. 7C shows, however, very little generalization at all, and no hint of the monotonic, upwardly sloping generalization patterns seen in Fig. 5B and F. Therefore, it is concluded that after conditioning to a complex having robust pitch cues, the absence of pitch cues in the test stimuli leads to essential non-equivalence of the conditioning and test stimuli (essentially no generalization at all), and thus there was no opportunity for possible spectral edge cues to shape the generalization gradient as these cues may have operated in experiments 4B and F.

Fig. 7. Generalization gradients for experimental series 6. Lines connect the means (n = 8) and horizontal symbols show means plus 1 standard error. In the label to the left, C refers to the conditioning stimulus, and T refers to test stimuli. The unfilled arrow marks the value of the independent variable predicted to cause the most generalization. (A) Conditioning = missing fundamental harmonic signal at 100 Hz f0, test = harmonic complex with lowest component near 200 Hz. (B) Conditioning = missing fundamental harmonic signal at 100 Hz f0, test = harmonic complex with lowest component near 200 Hz, all components mistuned. (C) Conditioning = missing fundamental harmonic signal at 100 Hz f0, test = all harmonics present, but mistuned.

4. Discussion

4.1. Interim summary of results

This is a long and complex series of experiments, so before discussing these results, we present a results summary, beginning with the findings that are the most clear and least ambiguous. Points for later discussion are raised here.

1. There is essentially no perceptual equivalence (generalization) among pure tones, RNs, and harmonic complexes for goldfish, in spite of all three signal types having the same fundamental frequency and pitch value to human listeners (Fig. 2). This non-equivalence could arise from differences in bandwidth, spectral envelope, temporal envelope prominence, “noisiness” or randomness, or the lack of a pitch dimension evoked by at least one of the signal types compared in the generalization experiments.

2. RNs of all pitch values (delays) are perceived as essentially equivalent in generalization experiments (Fig. 3). This suggests that a pitch dimension evoked by RNs is either absent or weak in controlling generalization behavior compared with other features of the stimulus (e.g., its “noisiness,” envelope, and bandwidth).

3. There is only a very weak and inconsistent tendency for IRN noises to evoke a pitch-like dimension between 100 and 200 Hz when compared with harmonic complexes having f0s that match the IRN pitch (Fig. 4). This non-equivalence occurs in spite of the comparable bandwidths of these two signals, so the lack of behavior consistent with IRN pitch must depend on a difference in “noisiness” between IRN and harmonic tones, or simply a weak or absent IRN pitch evoked in generalization experiments.

4. Harmonic complexes with a 100-Hz f0 appear to evoke a pitch-like dimension in goldfish because the generalization gradients peak at the f0 of the conditioning complex (Fig. 5A). But what possible alternative explanations are there for these generalization gradients?

5. In most but not all experiments, goldfish respond equivalently to harmonic complexes with and without the f0 component (Fig. 5C–E; 6A). In other words, goldfish do not generally miss the fundamental when it is absent from the stimuli. However, questions remain here. Why does generalization increase monotonically with f0 in two cases studied (i.e., why do goldfish appear to miss the fundamental in experiments 4B and F; Fig. 5B and F)? Does the absence of the f0 component alter the pitch salience of a complex? Are any of the harmonic complex components resolved by peripheral filtering?

6. In one of two experiments (compare Fig. 5E and F), reducing the prominence of the envelope by randomizing component phases caused a dramatic change in the generalization gradient of f0 for harmonic complexes, suggesting that pitch value had been lost. But why did this happen in one experiment but not in another complementary one (Fig. 5E)? What does this suggest about the relative importance of the temporal and spectral envelopes in evoking a pitch-like dimension in goldfish?

7. Frozen noise repetition seems to evoke a pitch-like dimension (Fig. 6A) something like that evoked by both zero-phase and random-phase harmonic complexes (Fig. 5C and E). Is this behavior determined only by deterministic waveform periodicity, or is there also a role played by the finite but weak envelope in this case?

8. Conditioning to amplitude-modulated (AM) noise does not result in generalization to harmonic complexes with the same or other repetition rates (Fig. 6B). Is this lack of generalization due to the relatively weak and somewhat variable envelope of AM noise (see Fig. 1), or to the stochastic, nonperiodic nature of the AM noise waveforms?

9. Conditioning to missing-fundamental harmonic complexes and testing using harmonic complexes with from 0 to 2 of the lowest components missing (thus keeping the frequency of the lowest component near 200 Hz) results in the evocation of a pitch-like dimension in generalization tests (Fig. 7A). Therefore, the frequency of the lowest component or the overall spectral envelope does not determine the inverted “V”-shaped generalization gradients evident in many of these experiments.

10. When waveform and envelope periodicity is eliminated for these test stimuli by mistuning each component, little or no generalization occurs (Fig. 7B). However, when the conditions of experiment 4B (Fig. 5B) are repeated with random mistuning of all components, the experiment 4B results are not replicated (experiment 6C). Rather, little or no generalization occurs at all. Thus, the frequency of the lowest component or overall spectral shape does not account for the results of experiment 4B in the absence of prominent periodicity among test stimuli.
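The stimulus manipulations summarized in points 9 and 10 can be sketched in a few lines. This is a hedged illustration only: the number of components per complex and the size of the random mistuning are not stated in this section, so `n_components` and `max_shift_hz` are assumed values, and `components_near_target` and `mistune` are hypothetical helper names.

```python
import random


def components_near_target(f0, n_components=10, target=200.0, max_omit=2):
    """Harmonics of f0 with 0..max_omit of the lowest ones dropped so that
    the lowest remaining component is as close as possible to `target` Hz."""
    omit = min(range(max_omit + 1), key=lambda m: abs((m + 1) * f0 - target))
    return [k * f0 for k in range(omit + 1, omit + 1 + n_components)]


def mistune(components, max_shift_hz, rng):
    """Shift every component by an independent random amount.

    The shift size is an assumption; the text only says "slightly".
    """
    return [f + rng.uniform(-max_shift_hz, max_shift_hz) for f in components]


# Lowest component for a few test f0s (experiment 6A style).
for f0 in (60, 100, 200):
    print(f0, "Hz f0 -> lowest component", components_near_target(f0)[0], "Hz")

# Mistuned version of the 100-Hz complex (experiment 6B style).
rng = random.Random(0)
harmonic = components_near_target(100)
shifted = mistune(harmonic, max_shift_hz=10.0, rng=rng)
```

With this rule the 60-Hz complex starts at 180 Hz (two components omitted), the 100-Hz complex at 200 Hz (one omitted), and the 200-Hz complex at 200 Hz (none omitted), so the lower spectral edge stays roughly constant while the repetition rate varies.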


In general, the results of these experiments are consistent with the hypothesis that periodic waveforms and envelopes evoke a perceptual dimension in goldfish that is like that of complex or musical pitch observed in human listeners. Furthermore, most experiments are consistent with the hypothesis that this pitch-like dimension is evoked using harmonic stimuli that are missing the fundamental frequency (f0) and others of the lowest harmonic components. In this respect, the auditory perceptions of goldfish closely resemble perceptions evoked by similar stimuli in human (e.g., Schouten, 1970) and other vertebrate (e.g., Heffner and Whitfield, 1976) listeners. This sort of auditory perception appears to be a general feature of vertebrate sound perception. However, it is not clear whether the neural coding mechanisms and computations underlying these behaviors are the same among vertebrates, or are simply functionally equivalent reflections of possibly diverse mechanisms adapted to solve general problems of sound source determination. Consider, for example, the perception of the advertisement call of male toadfish (Brantly and Bass, 1994), an harmonic sound with a fundamental frequency of about 100 Hz. If the amplitudes of individual frequency components were an important part of the message, then useful communication would critically depend on the characteristics and quality of the communication channel; the message might be misinterpreted if the channel filtered out one or more of the frequency components. This would occur, for example, when shallow water precluded the transmission of certain components below the depth-related “cut-off” frequency. But the perception of harmonic complexes has been shown to be robust with respect to missing fundamentals (and other missing components) in fishes, ensuring that the characteristics of the communication channel have little impact on the interpretation of messages that flow through it. In this way, useful communication can take place in spite of acoustical variations of the communication channel; the message becomes independent of the medium.

In addition, these multiple experiments have raised other issues concerning the acoustic features of complex sounds that control behavior in these generalization experiments. These issues include (A) the role of spectral envelope and bandwidth, particularly the value of the lowest frequency component present, (B) the role of a randomness or “noisiness” physical dimension in timbre perception, and (C) the roles of envelope versus waveform periodicities, and their relative strengths.

4.2. The role of spectral envelope

Spectral profile is a powerful determinant of auditory response in goldfish (e.g., Fay, 1972, 1992, 2000). Traditionally, this has been predicted and treated as following from the assumption of time-domain, rather than frequency-domain, encoding and neural computation (e.g., von Frisch, 1936; Fay, 1978; Crawford, 1997) because fishes lack a cochlea-like frequency analyzer at the periphery, and peripheral frequency selectivity is relatively crude by vertebrate standards (Fay and Ream, 1986; Fay, 1997). However, it is now well established that the primary auditory afferents of all vertebrate species are frequency-selective to some degree regardless of whether the ear performs a cochlea-like, macromechanical frequency analysis (Fay and Popper, 2000). Thus, the question remains for all vertebrates, including fishes, regarding how and the extent to which peripheral frequency analysis, in addition to time-domain processing, is used in auditory perception. In human hearing, for example, the mechanisms underlying complex pitch perception are thought to be different depending on whether individual frequency components (e.g., the lower harmonics of a harmonic complex) are resolved by the auditory system, or not. In the present experiments, it was initially assumed that no harmonics of the 100-Hz f0 stimuli used here were resolved individually, or that the lower edges of the spectral profiles of the stimuli used could not be important determinants of the behaviors observed. However, at least one present result suggests otherwise. Experiment 4B (Fig. 5B) shows a generalization gradient monotonically increasing with the f0 of the test stimuli toward 200 Hz. The conditioning stimulus had the 100-Hz f0 missing so its lowest component was 200 Hz. Here, the generalization gradient is most parsimoniously interpreted as reflecting the proximity of the lowest frequency component present in both conditioning and test stimuli; the closer the test stimulus f0 was to 200 Hz, the greater the generalization, regardless of the envelope periodicity. This hypothesis is that at least the lower edges of the stimulus spectra could determine the most effective qualities of the stimuli.
If this is true, is it possible that this sort of hypothesis could account for all the inverted “V”-shaped generalization gradients of this experimental series? For example, one could argue that the inverted “V”-shaped gradient of Fig. 5A results from the absolute difference between the frequencies of the lowest components present in the conditioning stimulus (low component = 100 Hz) and the generalization test stimuli (60–200 Hz). Since this gradient is only slightly wider than that determined with single-tone conditioning and test stimuli (Fay, 1972), one could suggest that for all complex stimuli used, the lowest component (f0) was perceptually resolved and used to control the generalization behavior. That this is not necessarily the case is clearly shown in Fig. 7A, where results similar to Fig. 5A were obtained using stimuli that all had lowest frequency components that were approximately equal (near 200 Hz). This result leaves us to conclude that it may remain possible for goldfish to evaluate complex harmonic stimuli (f0s between 60 and 200 Hz) using the frequency of the lowest component(s), but that waveform or envelope periodicity is also primarily used for perceptual evaluation, especially when the spectral cues are eliminated or substantially reduced. Thus, harmonic complexes may be encoded and processed with respect to multiple acoustic features, probably both in the time domain and in the frequency domain. This same conclusion would probably be reached by most researchers in studies on human pitch perception (De Cheveigné, 2005). In this case, we tentatively conclude that vertebrates share this multi-dimensional dependence on acoustic features in synthesizing complex auditory perceptions.

In this context, the monotonically increasing gradient of Fig. 5F is enigmatic. In this case, all stimuli lacked the f0 component but the conditioning complex had all components added at random phases. Assuming that the phase randomization during conditioning slightly reduced pitch salience based on reduced envelope prominence (see Fig. 5E), the only attempt at explanation that can be offered at this time is that some cue or cues, other than envelope periodicity, effectively dominated in controlling generalization behavior. However, we have no simple or obvious suggestion for what these other cues may have been. Thus, this result remains to be understood.

Also enigmatic is the result of experiment 6C (Fig. 7C), wherein the envelope and waveform periodicity were eliminated from the test stimuli by mistuning each component, and where there was essentially no generalization following conditioning to a stimulus having a prominent periodicity. This experiment was designed as an attempt to reproduce the gradient type in Fig. 5B by eliminating the f0 component in the conditioning stimulus but not in the test stimuli. In the absence of periodicity in the test stimuli, it was hypothesized that possible spectral cues (i.e., the frequency of the lowest component) would control the response. The results in Fig. 7C do not support this hypothesis. They suggest, instead, that the absence of any periodicity in the test stimuli rendered their perception essentially unlike that of the periodic conditioning stimulus, leading to essentially zero generalization. It is possible to suggest that an effect of spectral envelope may exist, but could not be revealed under conditions of essentially zero generalization overall.

5. The role of noisiness

The results of experiments 1 and 3 demonstrate that there is very little, if any, equivalence between rippled noise (RN and IRN) and harmonic complexes for goldfish. This occurs in spite of the fact that both kinds of stimuli have the same pitch to human listeners, the same bandwidth, and the same overall spectral shape with components only at f0 and its harmonics (see Fig. 1A and C). The acoustic dimensions that distinguish these two types of sounds are primarily the essentially random waveform of the RN versus the deterministic waveform of the harmonic complexes, and the weak envelope of RN stimuli versus deterministic and robust envelopes of the harmonic complexes. These experiments cannot help decide which of these two variables is most likely responsible for the generalization results of Figs. 2 and 4. But the fact that these two stimuli appear to be so utterly unlike one another appears to be more likely caused by a qualitative rather than a quantitative difference; i.e., the “noisiness” alternative seems most likely. Noisiness as a perceptual attribute is in contrast to harmonicity, a perceptual attribute that is well accepted in the human hearing literature (e.g., Hartmann, 1988). Harmonicity is a term meant to capture all the qualities of an harmonic sound that make it distinctive and of a category. In the present context, however, RN could be said to be “harmonic” in the sense that it is comprised of harmonically spaced spectral peaks, and thus the term does not differentiate well between RN signals and harmonic complexes. Noisiness, on the other hand, is a term that clearly distinguishes between these two stimuli; RN is fundamentally random noise while harmonic complexes are perfectly deterministic signals. Thus, we hypothesize that the noisiness dimension may have been an important perceptual attribute in determining the non-equivalence of RN and deterministic signals. Noisiness may have played a role in other experiments as well. In experiments 5A and B, harmonic complexes were compared with frozen noise repetition, and with AM noise, respectively. In experiment 5A, harmonic complexes were found to be essentially equivalent to frozen noise repetition, but not to AM noise (experiment 5B). Frozen noise is essentially a deterministic signal; although it makes use of an arbitrary noise waveform, it is repeated exactly. AM noise is continuous random noise whose amplitude fluctuates periodically.
Thus, AM noise has the quality of noisiness, while frozen noise is perfectly periodic and deterministic. We hypothesize that the results of these experiments reflect the “noisy” quality of AM noise, not shared by harmonic complexes (there is no generalization from AM noise to harmonic signals), and that the shared, deterministic nature of frozen noise and harmonic complexes accounts for their equivalence.

5.1. The roles of envelope and waveform periodicity

The evidence from these experiments supports the conclusion that periodicity is the stimulus dimension that determines the characteristic inverted “V”-shaped generalization gradients in most of these experiments; i.e., that the perceptual dimension that we identify as “pitch-like” is evoked by the periodic nature of the stimuli. The repetition rate or period of the conditioning signal is determined and is then used as a criterion for evaluating the repetition rate or period of the test stimuli. Experiments 4E and 4F, with harmonic components added at random phase angles, were designed to test the idea that the repetition period could control the response in this way only if it were robustly encoded; i.e., only if the envelopes were impulsive. The experiment was not successful because it was impossible to create a periodic envelope with a very small magnitude. Thus, only a relatively narrow range of weak envelope magnitudes could be created, and we could not effectively eliminate an envelope cue for periodicity by randomizing the starting phases of the components. When animals were conditioned to a zero-phase, impulsive envelope signal and tested using random phase signals (experiment 4E), a relatively broad generalization gradient peaking at 100 Hz resulted. When animals were conditioned to random phase signals and tested using zero-phase signals (experiment 4F), a monotonically increasing function of repetition rate resulted. We have interpreted these results as arising from an envelope only weakly encoded. When the envelope is robustly encoded during conditioning (experiment 4E), a process is activated that finds the periodicity in the test signals, even though the envelopes of test signals are only weakly represented. However, when the conditioning signal lacks a robust envelope (experiment 4F), a periodicity analyzer is not evoked, and other features of the stimuli control the response.
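The idea of determining a repetition rate from the waveform and using it as a criterion can be illustrated with a simple time-domain computation. The sketch below uses autocorrelation, which is a swapped-in illustration and not a mechanism identified by these experiments; the sample rate, the harmonic range (2 to 10), and the names `missing_fundamental`, `autocorrelation`, and `estimate_rate` are all assumptions made for the example.

```python
import math

FS = 4000                   # sample rate in Hz (assumed)
HARMONICS = range(2, 11)    # 200-1000 Hz for f0 = 100 Hz; f0 component absent


def missing_fundamental(f0, n_samples):
    """Zero-phase harmonic complex with the f0 component omitted."""
    return [sum(math.cos(2 * math.pi * k * f0 * n / FS) for k in HARMONICS)
            for n in range(n_samples)]


def autocorrelation(x, lag):
    """Unnormalized autocorrelation of x at the given lag (in samples)."""
    return sum(x[n] * x[n + lag] for n in range(len(x) - lag))


def estimate_rate(x, min_hz=60, max_hz=200):
    """Repetition rate (Hz) at the lag with the largest autocorrelation."""
    lags = range(FS // max_hz, FS // min_hz + 1)
    best = max(lags, key=lambda lag: autocorrelation(x, lag))
    return FS / best


conditioning = missing_fundamental(100, 400)   # 100 ms of signal
print(estimate_rate(conditioning))             # prints 100.0
```

Although the waveform contains no energy at 100 Hz, the largest autocorrelation within the searched range falls at the 10-ms lag, so the estimated repetition rate matches the missing fundamental. A time-domain analysis of this general kind is therefore unaffected by removal of the lowest components.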

6. Summary

Classical respiratory conditioning and stimulus generalization methods were used to evaluate the nature of pitch perception in goldfish. Goldfish are interesting because peripheral frequency selectivity is coarse and cannot have the same explanatory roles for goldfish that it does for humans and other mammals. The preponderance of evidence suggests that goldfish have complex pitch perception that is based on the evaluation of periodicities or repetition in the peripheral neural representation of complex stimuli (Fay, 1994, 1995). Periodicity is robust with respect to the presence or absence of individual frequency components, and it is shown that the same is true for behavior consistent with pitch perception. Thus, the pitch of a given signal does not change even when the original signal has been greatly degraded. However, it is not clear that a prominent envelope with a high peak factor is required for goldfish to encode periodicity. Complex signals synthesized with random starting phases (or with periodic, frozen noise segments) had reduced envelope amplitudes but still had pitch values to goldfish. Envelope periodicity without fine structure periodicity (AM noise) is not sufficient to evoke pitch perception. Elimination of periodicity by mistuning multiple components eliminates pitch perception while only slightly modifying the spectrum, suggesting that pitch perception does not arise from processing the spectrum alone. For human listeners, pure tones, RN, and harmonic complexes are very different signals that can be made similar through a common pitch value. For goldfish listeners, these three types of signals are perceived as fundamentally dissimilar even when they have the same pitch as judged by human listeners. Nevertheless, pure tones and harmonic complexes, at least, can be said to have pitch for goldfish. Pitch perception and the pitch of the missing fundamental appear to be primitive or common features of vertebrate auditory systems, having been demonstrated in goldfish, human listeners (e.g., Schouten, 1970), cats (Heffner and Whitfield, 1976), rhesus macaques (Tomlinson and Schwarz, 1988), and European starlings (Cynx and Shapiro, 1986). Although the neural mechanisms or causes of pitch perception may be different in these animal groups, the functional properties and consequences of pitch perception appear to be similar.

Acknowledgments

This research was supported by a research grant from NIH/NIDCD (1 R01 DC005970), and by the Parmly Hearing Institute of Loyola University Chicago. Thanks to Stan Sheft for a Matlab program that computes envelope and for other suggestions. Thanks to Bill Shofner for making and sharing rippled noise. Thanks to Jim Collier for help with graphics.

References

Brantly, R.K., Bass, A.H., 1994. Alternative male spawning tactics and acoustic signals in the plainfin midshipman fish, Porichthys notatus Girard (Teleostei, Batrachoididae). Ethology 96, 132–213.
Crawford, J.D., 1997. Feature detection by auditory neurons in the brain of a sound producing fish. J. Comp. Physiol. A 180, 439–450.
Cynx, J., Shapiro, M., 1986. Perception of missing fundamental by a species of songbird (Sturnus vulgaris). J. Comp. Psychol. 100, 356–360.
De Cheveigné, A., 2005. Pitch perception models. In: Plack, C., Oxenham, A., Fay, R., Popper, A. (Eds.), Pitch: Neural Modeling and Perception. Springer Verlag, New York.
Fay, R.R., 1969. Behavioral audiogram for the goldfish. J. Aud. Res. 9, 112–121.
Fay, R.R., 1970. Auditory frequency generalization in the goldfish (Carassius auratus). J. Exp. Anal. Behav. 14, 353–360.
Fay, R.R., 1972. Perception of amplitude-modulated auditory signals by the goldfish. J. Acoust. Soc. Am. 52, 660–666.
Fay, R.R., 1978. Coding of information in single auditory-nerve fibers of the goldfish. J. Acoust. Soc. Am. 63, 136–146.
Fay, R.R., 1992. Analytic listening by the goldfish. Hear. Res. 59, 101–107.
Fay, R.R., 1994. Perception of temporal acoustic patterns by the goldfish (Carassius auratus). Hear. Res. 76, 158–172.
Fay, R.R., 1995. Perception of spectrally and temporally complex sounds by the goldfish (Carassius auratus). Hear. Res. 89, 146–154.
Fay, R.R., 1997. Frequency selectivity of saccular afferents of the goldfish revealed by revcor analysis. In: Lewis, E.R., Long, G.R., Lyon, R.F., Narins, P.M., Steele, C.R., Hecht-Poinar, E. (Eds.), Diversity in Auditory Mechanics. World Scientific Publishers, Singapore, pp. 69–75.
Fay, R.R., 1998a. Auditory stream segregation in goldfish (Carassius auratus). Hear. Res. 120, 69–76.
Fay, R.R., 1998b. Perception of two-tone complexes by goldfish (Carassius auratus). Hear. Res. 120, 17–24.
Fay, R.R., 2000. Frequency contrasts underlying auditory stream segregation in goldfish. J. Assoc. Res. Otolaryngol. 1, 120–128.
Fay, R.R., Popper, A.N., 2000. Evolution of hearing in vertebrates: the inner ears and processing. Hear. Res. 149, 1–10.
Fay, R.R., Ream, T.J., 1986. Acoustic response and tuning in saccular nerve fibers of the goldfish (Carassius auratus). J. Acoust. Soc. Am. 79, 1883–1895.
Fay, R.R., Yost, W.A., Coombs, S., 1983. Psychophysics and neurophysiology of repetition noise processing in a vertebrate auditory system. Hear. Res. 12, 31–55.
Fay, R.R., Chronopoulos, M., Patterson, R.D., 1996. The sound of a sinusoid: Perception and neural representations in the goldfish (Carassius auratus). Aud. Neurosci. 2, 377–392.
Guttman, N., 1963. Laws of behavior and facts of perception. In: Koch, S. (Ed.), Psychology: A Study of a Science, vol. 5. McGraw-Hill, New York, pp. 114–178.
Hartmann, W.M., 1988. Pitch perception and the segregation and integration of auditory entities. In: Edelman, W.E., Gall, W.M., Cowan, W.M. (Eds.), Auditory Function. Wiley, New York, pp. 623–645.
Hartmann, W.M., 1998. Signals, Sound, and Sensation. Springer, New York, p. 368.
Heffner, H., Whitfield, I.C., 1976. Perception of the missing fundamental by cats. J. Acoust. Soc. Am. 59, 915–919.
Lu, Z., Fay, R.R., 1993. Acoustic response properties of single units in the torus semicircularis of the goldfish, Carassius auratus. J. Comp. Physiol. 173, 33–48.
McCormick, C.A., 1992. Evolution of central auditory pathways in anamniotes. In: Webster, D., Fay, R., Popper, A. (Eds.), The Evolutionary Biology of Hearing. Springer Verlag, New York, pp. 323–350.
Patterson, R., 1994a. The sound of a sinusoid: spectral models. J. Acoust. Soc. Am. 96, 1409–1418.
Patterson, R.D., 1994b. The sound of a sinusoid: Time-interval models. J. Acoust. Soc. Am. 96, 1419–1428.
Schouten, J.F., 1970. The residue revisited. In: Plomp, R., Smoorenburg, G.F. (Eds.), Frequency Analysis and Periodicity Detection in Hearing. Sijthoff, Leiden, pp. 41–54.
Schouten, J.F., Ritsma, R.J., Cardozo, B.L., 1962. Pitch of the residue. J. Acoust. Soc. Am. 34, 1418–1424.
Shofner, W., Yost, W., 1995. Discrimination of rippled spectrum noise from flat-spectrum wideband noise by chinchillas. Aud. Neurosci. 1, 127–138.
Terhardt, E., 1970. Frequency analysis and periodicity detection in the sensations of roughness and periodicity pitch. In: Plomp, R., Smoorenburg, G.F. (Eds.), Frequency Analysis and Periodicity Detection in Hearing. Sijthoff, Leiden, pp. 278–287.
Tomlinson, R.W.W., Schwarz, D.W.F., 1988. Perception and the missing fundamental in nonhuman primates. J. Acoust. Soc. Am. 84, 560–565.
von Frisch, K., 1936. Über den Gehörsinn der Fische. Biol. Rev. 11, 210–246.
Yost, W.A., 1996. Pitch of iterated rippled noise. J. Acoust. Soc. Am. 100, 511–518.
Yost, W.A., Hill, R., Perez-Falcon, T., 1978. Pitch and pitch discrimination of broadband signals with rippled power spectra. J. Acoust. Soc. Am. 63, 1166–1173.