ELSEVIER
Behavioural Processes 35 (1996) 83-91
Categorization in birdsong: from behavioural to neuronal responses
Martine Hausberger *, Hugo Cousillas
Université de Rennes I, Laboratoire d’Ethologie, C.N.R.S. U.R.A. 373, Campus de Beaulieu, Avenue du Général Leclerc, 35042 Rennes Cedex, France
Accepted 10 January 1995
Abstract
This paper reviews some aspects of the perceptual processes involved in the categorization of natural sounds, especially in birdsong. Different models have been proposed to account for the progression from the simple filtering observed at the peripheral level to the recognition processes revealed through behavioural responses. Some studies have shown that neurons in some of the motor nuclei (high vocal center) in the brain are specialized towards precise species-specific characteristics, even the bird’s own song. These results indicate a high level of integration, but little is known about intermediate levels. Studies of the perception of natural songs by starlings show that many neurons in field L are selective towards particular features of the songs. Neighbouring neurons tend to show complementary or similar selectivities, determining areas of response. Such studies emphasize the importance of combining ethology and neurophysiology, and of the use of natural sounds to test neuronal selectivity.
Keywords: Starling; Song; Audition; Neurophysiology; Categorization; Birdsong
* Corresponding author.
0376-6357/96/$15.00 © 1996 Elsevier Science B.V. All rights reserved
SSDI 0376-6357(95)00053-4
1. Perceptual processes involved in categorization
Although categorical perception sensu stricto has been established in a limited number of species (Ehret, 1987), categorization of the surrounding environment is a common feature in animals. Without categorization, any object would be perceived as being unique (Roitblat and von Fersen, 1992). This is especially true for social animals that also use subcategories, as is shown by examples of individual, kin or species recognition in particular (Snowdon, 1987). However, the way given objects are categorized may differ between species (see review in Thompson, 1995). Our understanding of how
animals classify their environment is limited by our knowledge of their perception of the world (Cheney and Seyfarth, 1990). Understanding the perceptual bases of categorization is not simple: as Herrnstein (1985) emphasizes, there is no simple characteristic that is both necessary and sufficient to be able to say that an object belongs to a category, at least as it appears from behavioural responses. Therefore, identifying the underlying perceptual processes is in itself a challenging task and implies examining the neuronal properties involved.
For a long time, there was little or no interest in the perception of complex auditory patterns, partly because of the lack of suitable technology, but also because of changes in the theories of audition. The ear was considered to correspond to a bank of band-pass filters with specific bandwidths and integration times. The simplest possible stimuli were used so that the ‘hardwired’ characteristics of the system were revealed (Espinoza Varas and Watson, 1989). It has now, however, become obvious that the encoding of complex sounds involves different internal levels of processing. Particularly interesting are the results showing that both chinchillas and humans perceive the basic categories of human speech at the peripheral level (Kuhl, 1987). However, these stimuli do not elicit the same types of reactions in these two species: the basic auditory peripheral capacities may be general (Aslin et al., 1984) but much of the encoding also involves experience and attention. For Espinoza Varas and Watson (1989), “hearing takes place, to a good extent in the brain rather than in the ear”. For these authors, central mechanisms are more involved than peripheral ones in the processing of complex patterns by humans.
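The classical view of the ear as a bank of band-pass filters can be illustrated with a minimal sketch. It only shows how a signal is reduced to one energy value per frequency band; the band edges, the sampling rate and the function name `band_energies` are assumptions chosen for illustration, and integration times are not modelled.

```python
import cmath
import math


def band_energies(signal, sample_rate, bands):
    """Return the spectral energy of `signal` inside each (low_hz, high_hz) band.

    Illustrative only: a naive discrete Fourier transform is used, which is
    slow but adequate for short signals.
    """
    n = len(signal)
    energies = [0.0] * len(bands)
    for k in range(n // 2 + 1):  # positive-frequency bins only
        freq = k * sample_rate / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        power = abs(coeff) ** 2
        for i, (low, high) in enumerate(bands):
            if low <= freq < high:
                energies[i] += power
    return energies


# Hypothetical example: a 1 kHz pure tone analysed by three arbitrary bands.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
bands = [(0, 500), (500, 1500), (1500, 4000)]
energies = band_energies(tone, sample_rate, bands)
```

In this toy case almost all of the energy falls in the second band, which contains 1 kHz. A description of this kind says where the energy is, but nothing about how features are combined at later stages, which is the limitation discussed above.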
2. The birdsong model
Interestingly, the research on the perception of bird song has followed a similar progression, both from a historical and a theoretical viewpoint. Until recently, the use of simple sounds to characterize the neurons’ receptive fields was largely predominant. These studies have shown that auditory neurons are characterized at the peripheral level by simple properties and a tonotopic organization (Manley et al., 1991; Saunders et al., 1973). High frequencies are encoded in the proximal part of the basilar papilla and low frequencies in its distal part. Most birds can hear sounds below 16 Hz (Teurich et al., 1984; Kreithen and Quine, 1979) and up to 5-6 kHz (chicken and starling; Dooling et al., 1986) or even 10 kHz (parakeet; Dooling and Saunders, 1975). However, these data do not account satisfactorily for the complex classifications that birds make of their songs: individual, population or species recognition (e.g. Hausberger, 1993).
Like humans for language, most oscines learn their songs and react to learned variations in this complex vocalisation (Balaban, 1988). Different brain structures have been shown to be involved in the development of motor skills (e.g. Konishi, 1985). Much less is known on the perceptual side, despite intriguing questions on recognition processes. Selectivity in song learning and song development in isolated birds are supposed to be based on innate models (‘templates’): auditory models to which the young bird would compare its own production (Konishi, 1965; Marler and Peters, 1989). Adult male songbirds often interact using song matching: the bird replies by using the same song type it has just heard (e.g. Armstrong, 1963; Falls et al., 1982). It is a widespread phenomenon and suggests that the birds compare what they hear to some auditory memory corresponding to their own repertoire.
In song sparrows, the strength of the behavioural response is correlated with the similarity to the bird’s own song (‘a self concept’,
McArthur, 1986). If the theme broadcast is absent from the bird’s repertoire, then it will reply with the theme closest in structure within its own repertoire (Krebs et al., 1981; Schroeder and Wiley, 1983). This form of categorization suggests that there is an auditory reference to which the stimulus is compared before production occurs. Such a reference is the result of different influences, particularly learning, and means that the perception of meaningful complex auditory stimuli has to involve brain structures, as suggested by Espinoza Varas and Watson (1989) for humans.
One of the predominant models for pattern analysis is hierarchical taxonomy: the stimuli are decomposed and integrated by a network of filters at different levels. Remez (1987), using the pandemonium model of pattern analysis (Selfridge, 1959), distinguishes three levels: the lowest level, the image demon, filters the extrinsic stimulation; the computational demons filter the information transmitted by the image demon; and the cognitive demons form the highest level. To some extent, the connectionist approach taken by Margoliash and Bankes (1993) for bird song comes close to this computational model. These authors suggest that there are different levels of integration: simple responses (to frequencies, for example) from the periphery up to the nucleus ovoidalis, more complex responses and a topography of responses in field L (the equivalent of the auditory cortex of mammals), and very selective responses in the HVc and RA, two areas that are also involved in song production (Williams and Nottebohm, 1985).
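The song-matching behaviour described above, in which a bird replies with the theme of its own repertoire that is closest in structure to the stimulus, can be sketched as a nearest-template comparison. This is only a toy illustration of the idea of an auditory reference: the representation of a theme as a short frequency contour, the distance measure and all names and values below are assumptions, not a model proposed in the studies cited.

```python
def contour_distance(a, b):
    """Mean absolute frequency difference (Hz) between two equal-length contours."""
    if len(a) != len(b):
        raise ValueError("contours must have the same number of points")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def closest_theme(stimulus, repertoire):
    """Return the name of the repertoire theme nearest to the stimulus."""
    return min(repertoire,
               key=lambda name: contour_distance(stimulus, repertoire[name]))


# Hypothetical repertoire: each theme is a frequency contour sampled at 4 points (Hz).
repertoire = {
    "rising": [2000, 2500, 3000, 3500],
    "falling": [4000, 3500, 3000, 2500],
    "flat": [3000, 3000, 3000, 3000],
}

# An unfamiliar variant that is not itself in the repertoire.
playback = [2200, 2600, 3200, 3600]
reply = closest_theme(playback, repertoire)
```

Here the unfamiliar playback is assigned to the "rising" theme although it matches no stored contour exactly, which is the sense in which matching implies categorization rather than simple identity detection.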
3. Different levels of perception in songbirds
In fact, the encoding of sounds and the integration processes are probably more complex. In birds, primary auditory neurons have axons projecting to the angularis and magnocellularis nuclei (Boord, 1969; Parks and Rubel, 1975). A single primary auditory neuron projects to both nuclei, so that a divergent network appears. Although a simple tonotopic organisation is present, a certain number of neurons in those nuclei present complex responses to sound stimuli (Sachs and Sinnott, 1978; Sachs et al., 1978; Manley et al., 1985). Neurons activated by a given frequency at low intensity can be inhibited by that same frequency at higher intensities. Responses to complex sounds are one of the characteristics of neurons at different levels of the ascending auditory pathways. In the nucleus ovoidalis, almost half the neurons do not respond to pure tones but are activated or inhibited if some frequencies in the stimulus are present at given intensities (Biederman-Thorson, 1970). In field L, where all ascending auditory pathways converge, Biederman-Thorson also shows that most neurons are activated only by complex sounds. Leppelsack and Vogt (1976) even found neurons that responded only to natural sounds. Such selective neurons coexist with ‘generalists’ that encode frequencies in a tonotopic way (Bigalke-Kunz et al., 1987; Müller and Leppelsack, 1985).
Field L projects to the higher vocal center (HVc), which itself projects to the nucleus robustus archistriatalis (RA) in particular (Fig. 1). Both nuclei have been shown to be involved in the production of sound (Konishi, 1985). Therefore, some authors have suggested that such nuclei could be the central point between perception and production.
All studies show that these areas contain auditory neurons that are activated only by species-specific sound structures, in fact mainly the bird’s own song (Margoliash, 1983, 1986; Mooney and Doupe, 1991; Williams and Nottebohm, 1985; Vicario and Yohay, 1993). Such responses are dependent on experience, since in birds raised with abnormal sounds, these neurons are activated by these abnormal structures and not
Fig. 1. Schematic representation of the auditory, central and vocal pathways. AN: nucleus angularis; DLM: nucleus mesencephalicus lateralis pars dorsalis; DM: dorsomedial part of nucleus intercollicularis; HVc: higher vocal center; L: field L; LL: nucl. lemnisci lateralis; MAN: nucleus magnocellularis anterior neostriatalis; MC: nucleus magnocellularis; OS: superior olive; OV: nucleus ovoidalis; RA: nucleus robustus archistriatalis; X: area X; XIIts: tracheosyringeal part of the nucleus hypoglossus.
by species-typical structures (see review in Leppelsack, 1986). These levels typically correspond to those evoked by Espinoza Varas and Watson (1989) for the processing of complex sounds: responses are experience- and attention-dependent. Vicario and Yohay (1993) show that the neuronal selectivity toward the bird’s own song differs according to whether the animal is awake or anaesthetised.
Despite all these results, we are still far from understanding how the information is encoded and integrated along this pathway. There is a gap in our knowledge between the peripheral level, where neurons respond to simple parameters, and this highly complex level, where neurons respond to the bird’s own song. Little is known about the intermediate levels and therefore also about less specialized forms of recognition, such as species-specific or song type categorizations.
4. The starling model
In this context, we have started to investigate the perception of species-specific songs in the European starling, especially in field L, at the junction of the ascending auditory pathway and the production centers. Males of this species have a repertoire of whistled songs, some of which are shared universally between different populations all over the world (Adret-Hausberger, 1989). Although these species-specific themes show the same basic characteristics and variation ranges all over the world, they also show local variations in their structure, leading to complex dialects (e.g. Adret-Hausberger, 1983). Song matching using those whistles occurs in all social situations. Playback experiments have shown that the birds respond with the same theme, even if an unfamiliar dialectal variant is broadcast (Adret-Hausberger, 1982). This shows that despite the variations, the birds classify
this song as belonging to the same theme as their own variant of it: both variants belong to the same category. They even generalize to new exemplars. This has led us to become interested in the perceptual processes involved in such a complex process. Since this involves the perception of species-specific categories more than ‘self recognition’, field L seemed to be an interesting area to investigate, given the first results obtained by Leppelsack and Vogt (1976).
We have broadcast natural and synthesized variants of the species-specific themes, as well as pure tones, to awake starlings, using both a multicellular and a single cell approach. The multicellular approach was based on the mapping technique developed by Häusler (1989). These recordings have shown that different parts of the sound elicit responses in distinct regions within field L. With artificial sounds, it is easy to recognize which precise parameter (frequency, amplitude or duration of the stimulus) may be responsible for the neuronal responses observed. With natural stimuli, this is almost impossible because different parameters can vary simultaneously.
To solve this problem, we used the backward correlation technique. Instead of analysing the neuronal activity as a function of the stimulus, we analyse the stimulus as a function of the response. It is a powerful technique that has not been extensively used because it needs very long computation times. We developed in our laboratory a very fast backward correlation technique. In our method, we average all stimuli occurring during the 128 ms (2 ms time bin) preceding each excitation or inhibition. This computation gives us the characteristics of the most probable stimulus that produced a neuronal response. The result of this backward correlation is expressed as a sonogram, i.e. frequency as a function of time (for more details, see Richard et al., in press).
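The averaging step just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the fast implementation developed in the laboratory: the stimulus is assumed to be available as a sonogram sampled in 2 ms frames, events (excitations or inhibitions) are given as frame indices, and the function name `backward_correlation` and the toy data are hypothetical.

```python
def backward_correlation(spectrogram, event_frames, window_frames=64):
    """Average the stimulus sonogram over the window preceding each event.

    spectrogram   : list of frames, each a list of energies per frequency bin
                    (assumed here: one frame per 2 ms bin).
    event_frames  : frame indices at which an excitation (or inhibition) occurred.
    window_frames : 64 frames of 2 ms correspond to the 128 ms window in the text.

    Returns a window_frames x n_freq average (oldest frame first), or None
    if no event has a complete preceding window.
    """
    n_freq = len(spectrogram[0])
    total = [[0.0] * n_freq for _ in range(window_frames)]
    used = 0
    for event in event_frames:
        start = event - window_frames
        if start < 0 or event > len(spectrogram):
            continue  # skip events without a full preceding window
        for lag in range(window_frames):
            frame = spectrogram[start + lag]
            row = total[lag]
            for f in range(n_freq):
                row[f] += frame[f]
        used += 1
    if used == 0:
        return None
    return [[value / used for value in row] for row in total]


# Toy data (invented): 300 frames, 3 frequency bins, silence by default.
n_frames, n_freq = 300, 3
spectrogram = [[0.0] * n_freq for _ in range(n_frames)]
events = [100, 180, 260]
for event in events:
    spectrogram[event - 5][1] = 1.0  # feature that always precedes a response by 5 frames
spectrogram[50][2] = 1.0             # unrelated sound that precedes only the first event
sta = backward_correlation(spectrogram, events)
```

In this toy case the feature that consistently precedes the response keeps its full value in the average (row 59, i.e. 5 frames or 10 ms before the event), whereas the sound that preceded only one of the three events is reduced to a third of its value. This is how the average brings out the most probable stimulus preceding a response.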
The first results we obtained with the single cell approach in field L show that most cells do not respond to pure tones. These neurons are very selective. They are excited or inhibited by particular features such as frequency modulations, inflexion points or the simultaneous presence of two frequencies in the sound, and some show on or off responses (Fig. 2). We then used the backward correlation in the multicellular approach and made backward correlation maps of field L. For this approach, we recorded the neuronal activity at different depths
Fig. 2. Backward correlation on unit recordings. Examples of neuronal selectivity revealed by the backward correlation applied to single cell recordings, expressed as sonograms (frequency as a function of time). A: inflexion point; B: inflexion; C: frequency modulation; D: two frequencies. In each picture, the vertical axis represents frequency, ranging from 0 to 6 kHz, and the horizontal axis the time (-128 to 0 ms) before excitation (left side of each picture) and before inhibition (right side).
Fig. 3. Backward correlation map. Each sonogram is the result of the backward correlation computation for one recording site and is positioned at the place where activity was recorded in field L. The recording plane is 1 mm left of the midsagittal plane and 2.5 mm from the ear bars. This picture shows the tonotopic organization and the spatial distribution of selectivity for excitation. A: tonotopic organization and pure tone selectivity; B: tonotopic organization and frequency modulation selectivity; C: tonotopic organization and on responses; D: on responses.
(100 µm step) in a sagittal plane across field L. The backward correlation sonograms were then mapped, each sonogram being positioned using the coordinates of its recording site. On these maps, we verified the tonotopic organization of field L that was described by other authors, but we could also characterise the spatial distribution of the selectivities recorded in single cells (Fig. 3). Neighbouring groups of neurons tended to show similar or complementary types of selectivities.
These results show that the backward correlation technique is a powerful tool for the study of species-specific song recognition in the central auditory and vocal nuclei.
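The mapping step described in this section, in which each backward correlation sonogram is placed at its recording site and the sites are then compared, can be sketched as follows. The sketch only extracts a ‘best frequency’ per site and orders the sites by depth; the function names, the frequency bins, the depths and the energy values are all invented for illustration, and the direction of the gradient in the toy data says nothing about the real organization of field L.

```python
def best_frequency(sonogram, freqs_hz):
    """Frequency whose energy, summed over the whole time window, is largest."""
    totals = [sum(row[f] for row in sonogram) for f in range(len(freqs_hz))]
    return freqs_hz[totals.index(max(totals))]


def tonotopic_profile(site_sonograms, freqs_hz):
    """Return (depth_um, best_frequency_hz) pairs sorted by recording depth."""
    return sorted((depth, best_frequency(sonogram, freqs_hz))
                  for depth, sonogram in site_sonograms.items())


# Hypothetical data: three recording depths (in µm, 100 µm steps), each with a
# tiny backward correlation sonogram of 2 time frames x 3 frequency bins.
freqs_hz = [1000, 2000, 4000]
site_sonograms = {
    1100: [[0.9, 0.1, 0.0], [0.8, 0.2, 0.0]],
    1200: [[0.2, 0.7, 0.1], [0.1, 0.9, 0.0]],
    1300: [[0.0, 0.2, 0.8], [0.0, 0.1, 0.9]],
}
profile = tonotopic_profile(site_sonograms, freqs_hz)
```

A profile in which best frequency changes systematically with depth is what a tonotopic organization looks like in such a map. Characterising the other selectivities (frequency modulation, on responses) would require inspecting the shape of each sonogram and not only its dominant frequency.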
5. Conclusion
The results we have obtained show that understanding the perception of complex sounds, such as those involved in the social life of animals, requires detailed analyses at different levels. We will only understand how neurons in the HVc can be selective to the bird’s own song when we know how the information is encoded in field L. Our results also emphasize the importance of using species-specific sounds in neurophysiological experiments if we are to find out how animals perceive meaningful complex sounds. As Park and Dooling (1985) emphasize, “although studies with artificial sounds have led to insights into the mechanisms of hearing for both humans and animals, perceptual studies with biologically relevant stimuli are more likely to reveal an important role for other factors such as attention, motivation, learning and memory”. The responses of HVc neurons typically involve such aspects, apparently a general feature of complex sound perception (Espinoza Varas and Watson, 1989).
Neurophysiologists like Schäfer et al. (1992) defend another view: receptive fields of neurons can be revealed at least as well by using artificial sounds, and natural signals are not necessary. In fact, both of these seemingly divergent views are true. Ethologists are more interested in understanding the causal aspects of behaviour, such as perception, and are therefore interested in what the neuron will respond selectively to in a natural signal. Our backward correlation method is an example: we can show that one neuron selectively responds to one particular combination of parameters among the set of stimuli. The resulting sonogram cannot, however, be equated with the ‘receptive field’. As Brothers and Ring (1992) also mention, neurons can be selective toward particular features of the signal (sound or image) but also respond (probably less strongly) to other parameters (either natural or artificial).
A much larger set of stimuli would be necessary if we were to define the receptive fields, and even then it is doubtful that we would be able to cover the whole range of possible parameters. Being able to show that neurons selectively extract particular features from natural stimuli, and that the degree of this selectivity depends on the level investigated, is certainly an important step towards understanding recognition and categorization processes in animals and humans. Birdsong is a model in which the gap between ethology and neurophysiology is narrowing (Kennedy, 1992).
Acknowledgements
We are indebted to E. Leppelsack and H.J. Leppelsack for collecting data on starlings in their neurophysiological set-up and for fruitful discussions, and to J.P. Richard for his important help in analysing the neuronal responses. We are thankful to A. Cloarec for improving the English.
References
Adret-Hausberger, M., 1982. Social influences on the whistled songs of starlings. Behav. Ecol. Sociobiol., 11: 241-246.
Adret-Hausberger, M., 1983. Variations dialectales des sifflements de l’étourneau sansonnet sédentaire en Bretagne. Z. Tierpsychol., 62: 55-71.
Adret-Hausberger, M., 1989. The species-repertoire of whistled songs in the European starling: species specific characteristics and variability. Bioacoustics, 2: 137-162.
Armstrong, E.A., 1963. The study of bird song. Oxford University Press.
Aslin, R.N., Pisoni, D.B. and Jusczyk, P.W., 1984. Auditory development and speech perception in infancy. In: M.M. Haith and J.J. Campo (Editors), Infancy and the Biology of Development, New York: Wiley.
Balaban, E., 1988. Bird song syntax: learned intraspecific variation is meaningful. Proc. Natl. Acad. Sci. USA, 85: 3657-3660.
Biederman-Thorson, M., 1970. Auditory responses of units in the ovoid nucleus and cerebrum (field L) of the ring dove. Brain Res., 24: 247-256.
Bigalke-Kunz, B., Rübsamen, R. and Dürrscheidt, G.J., 1987. Tonotopic organization and functional characterization of the auditory thalamus in a songbird, the European starling. J. Comp. Physiol., 161: 255-265.
Boord, R.L., 1969. The anatomy of the avian auditory system. Ann. NY Acad. Sci., 167: 186-198.
Brothers, L. and Ring, B., 1992. A neuroethological framework for the representation of minds. J. Cognitive Neurosci., 4: 107-118.
Cheney, D.L. and Seyfarth, R.M., 1990. How monkeys see the world. Inside the mind of another species. Chicago, Univ. Chicago Press.
Dooling, R.J., Okanoya, K., Downing, J. and Hulse, S., 1986. Hearing in the starling (Sturnus vulgaris): absolute thresholds and critical ratios. Bull. Psychol. Soc., 24: 462-464.
Dooling, R.J. and Saunders, J., 1975. Hearing in the parakeet (Melopsittacus undulatus): absolute thresholds, critical ratios, frequency difference limens and vocalizations. J. Comp. Physiol. Psychol., 88: 1-20.
Ehret, G., 1987. Categorical perception of sound signals: facts and hypotheses from animal studies. In: S. Harnad (Editor), Categorical perception. Cambridge, England: Cambridge University Press, pp. 301-331.
Espinoza Varas, B. and Watson, C.S., 1989. Perception of complex auditory patterns by humans. In: R.J. Dooling and S.H. Hulse (Editors), The comparative psychology of audition, perceiving complex sounds, Lawrence Erlbaum Associates, Publishers, Hillsdale, New Jersey, pp. 67-94.
Falls, J.B., Krebs, J.R. and McGregor, P.K., 1982. Song matching in the great tit (Parus major): the effect of similarity and familiarity. Anim. Behav., 30: 997-1009.
Hausberger, M., 1993. How studies on vocal communication in birds contribute to a comparative approach of cognition. Etologia, 3: 171-185.
Häusler, U., 1989. Die strukturelle und funktionelle Organisation der Hörbahn im caudalen Vorderhirn des Staren. Dissertation, Institut für Zoologie, Technische Universität München.
Herrnstein, R.J., 1985. Riddles of natural categorization. Philos. Trans. R. Soc. London Sci. B, 308: 129-144.
Kennedy, J.S., 1992. The new anthropomorphism. Cambridge Univ. Press.
Konishi, M., 1965. The role of auditory feedback in the control of vocalization in the white-crowned sparrow. Z. Tierpsychol., 22: 770-783.
Konishi, M., 1985. Birdsong: From behavior to neuron. Ann. Rev. Neurosci., 8: 125-170.
Krebs, J.R., Ashcroft, R. and van Orsdol, K., 1981. Song matching in the great tit (Parus major). Anim. Behav., 29: 918-923.
Kreithen, M.L. and Quine, D.B., 1979. Infrasound detection by the homing pigeon. A behavioral audiogram. J. Comp. Physiol., 129: 1-4.
Kuhl, P.K., 1987. The special-mechanisms debate in speech research: categorization tests on animals and infants. In: S. Harnad (Editor), Categorical perception. Cambridge, England: Cambridge University Press, pp. 355-386.
Leppelsack, H.J., 1986. Critical periods in bird song learning. Acta Otolaryngol., Suppl. 429: 57-60.
Leppelsack, H.J. and Vogt, M., 1976. Responses of auditory neurons in the forebrain of a songbird to stimulation with species-specific sounds. J. Comp. Physiol., 107: 263-274.
Manley, G.A., Gleich, O., Leppelsack, H.-J. and Oeckinghaus, H., 1985. Activity patterns of cochlear ganglion neurones in the starling. J. Comp. Physiol., 157: 161-181.
Manley, G.A., Kaiser, A., Brix, J. and Gleich, O., 1991. Activity patterns of primary auditory-nerve fibres in chickens: Development of fundamental properties. Hearing Res., 57: 1-15.
Margoliash, D., 1983. Acoustic parameters underlying the responses of song specific neurons in white-crowned sparrow. J. Neurosci., 3: 1039-1057.
Margoliash, D., 1986. Preference for autogenous song by auditory neurons in a song system nucleus of the white-crowned sparrow. J. Neurosci., 14: 1643-1661.
Margoliash, D. and Bankes, S.C., 1993. Computations in the ascending auditory pathway in songbirds related to song learning. Amer. Zool., 33 (1): 94-103.
Marler, P. and Peters, S., 1989. Species differences in auditory responsiveness in early vocal learning. In: R.J. Dooling and S.H. Hulse (Editors), The comparative psychology of audition, perceiving complex sounds, Lawrence Erlbaum Associates, Publishers, Hillsdale, New Jersey, pp. 243-273.
McArthur, P.D., 1986. Similarity of playback songs to self song as a determinant of response strength in song sparrow Melospiza melodia. Anim. Behav., 34: 199-207.
Mooney, R. and Doupe, A.J., 1991. Neurobiology of birdsong: circuits, synapses and development. Discuss. Neurosci., 7: 100-111.
Müller, C.M. and Leppelsack, H.J., 1985. Feature extraction and tonotopic organization in the avian auditory forebrain. Exp. Brain Res., 59: 587-599.
Park, T.J. and Dooling, R.J., 1985. Perception of species-specific contact calls by budgerigars (Melopsittacus undulatus). Anim. Learn. Behav., 14: 359-364.
Parks, T.N. and Rubel, W.E., 1975. Organisation and development of the brain stem auditory nuclei of the chicken: Organization of projections from N. magnocellularis to N. laminaris. J. Comp. Neurol., 164: 435-448.
Remez, R.E., 1987. Neural models of speech perception: a case history. In: S. Harnad (Editor), Categorical perception: The groundwork of cognition. Cambridge Univ. Press, Cambridge, pp. 199-225.
Roitblat, H.L. and von Fersen, L., 1992. Comparative cognition. Representations and processes in hearing and memory. Ann. Rev. Psychol., 43: 671-710.
Sachs, B.M., Woolf, N.K. and Sinnott, J.M., 1978. Response properties of avian auditory-nerve fibers and medullary neurons. Symposium on neuroanatomy and neurophysiology of the auditory system. 7 VI): 710-713.
Sachs, M.B. and Sinnott, J.M., 1978. Responses to tones of single cells in Nucleus Magnocellularis and Nucleus Angularis of the redwing blackbird (Agelaius phoeniceus). J. Comp. Physiol., 126: 347-361.
Saunders, J., Coles, R.B. and Gates, G.R., 1973. The development of auditory evoked response in the cochlea and cochlear nuclei of the chick. Brain Res., 63: 59-74.
Schäfer, M., Rübsamen, R., Dürrscheidt, G.J. and Knipschild, M., 1992. Setting complex tasks to single units in the avian auditory forebrain. II: Do we really need natural stimuli to describe neuronal response characteristics? Hearing Res., 57: 231-244.
Schroeder, D.J. and Wiley, R.H., 1983. Communication with repertories of song themes in tufted titmice. Anim. Behav., 31: 1128-1138.
Selfridge, O.G., 1959. Pandemonium: A paradigm for learning. In: Mechanisation of thought processes. London, H.M. Stationery Office, pp. 511-531.
Snowdon, C.T., 1987. A naturalist view of perception. In: S. Harnad (Editor), Categorical perception: the groundwork of cognition, Cambridge Univ. Press, Cambridge, pp. 332-354.
Teurich, M., Langner, G. and Scheich, H., 1984. Infrasound responses in the midbrain of the guinea fowl. Neurosci. Lett., 49: 81-86.
Thompson, R.K.R., 1995. Natural and relational concepts in animals. In: H. Roitblat and J.A. Meyer (Editors), Comparative approaches to cognitive science. Cambridge, MA: Bradford, MIT Press.
Vicario, D.S. and Yohay, K.H., 1993. Song-selective auditory input to a forebrain vocal control nucleus in the zebra finch. J. Neurobiol., 24 (4): 488-505.
Williams, H. and Nottebohm, F., 1985. Auditory responses in avian vocal motor neurons: a motor theory for song perception in birds. Science, 229: 279-282.