Language & Communication, Vol. 14, No. 1, pp. 31-36, 1994. Copyright © 1994 Elsevier Science Ltd. Printed in Great Britain. All rights reserved. 0271-5309/94 $6.00 + 0.00
Pergamon
HOW MONKEYS FEEL ABOUT HOW THEY SEE THE WORLD
MARC D. HAUSER

To promote a revolutionary perspective, it is often necessary to paint the world black and white and take a sturdy stance on one side of the theoretical fence. In animal behavior, Sarah Blaffer Hrdy’s (1981) book The Woman That Never Evolved is a good example. Fighting against a male’s eye view of the primate world, Blaffer Hrdy put forth the position that females were the bedrock of primate social groups. This bold move was necessary at the time because no one was paying much attention to female social behavior. Clearly, and as Blaffer Hrdy herself would acknowledge, males and females play important roles, but their roles are different, sometimes complementary and sometimes conflicting.

Dorothy Cheney and Robert Seyfarth’s (1990) book How Monkeys See the World provides a revolutionary perspective on animal communication. The dichotomy painted is between signals that merely reflect the animal’s emotional or internal state and signals that represent or refer to biologically meaningful objects and events in the external world. The latter is taken as evidence for the greater involvement of complex mental states in communication. Cheney and Seyfarth are clearly aware of the dichotomy they paint and, in fact, go further than most in providing experimental data for quantifying the relative contribution of emotional and mental states to signal structure. However, they push the mentalistic side of the equation because, I believe, they are at the head of a mini-revolution in animal cognition that is attempting to face, head on, the potentially complex mental lives of our closest relatives. Wishy-washy views get us nowhere. Cheney and Seyfarth’s sound experimental work is clearly getting us somewhere and it is because they are pushing a new and challenging view that cognitive scientists must think carefully before they classify all non-human organisms as having the mental power of an Aplysia.
In responding to Donald Owings’ thoughtful comments on How Monkeys See the World, I will develop two points. First, I will suggest that in contrast to Owings’ position, most of the issues he raises will require a methodological overhaul. Although there have been important advances in the experimental protocols used by researchers studying animal communication, many of the available tools are insensitive to what may be subtle behavioral and perceptual variation. Second, I suggest that the distinction Owings wishes to make between the emotions and higher level cognition (i.e., ‘conative’ versus ‘cognitive’) may be problematic and, further, that evidence of referentiality need not imply high-level cognition on the part of signaller or perceiver.

Correspondence relating to this paper should be addressed to Marc D. Hauser, Department of Biological Anthropology and Psychology, Program in Neuroscience, Harvard University, Cambridge, MA 02138, U.S.A.

1. Methods in animal communication

A common approach to understanding the function of animal vocalizations under natural conditions is to record the call of interest, quantify its acoustic morphology, and then select
call exemplars for use in playback experiments. The issues addressed, like the phylogenetic scope of species tested (insects, frogs, birds and mammals), are broad, including tests of individual and species recognition, categorical perception, environmental acoustics, mate choice, and the representational capacity of the signal. Although recent disagreements have emerged over the precise design of playbacks (reviewed in McGregor, 1992), most practitioners would agree that the techniques are powerful and have led to interesting insights into the perceptual world of non-human animals. Within the past 10 years, however, and beginning with Seyfarth, Cheney and Marler’s important study of vervet monkey alarm calls, playback experiments have been used to gain insights into the minds of other species. It is in this domain that I believe our tools are insufficient for the questions we are asking. To make this point concrete, consider the distinction that has been made between signals that convey information about the caller’s emotional state and signals that refer to objects and events in the external environment (see Marler et al., 1992; Owings, 1994). In the original vervet alarm call study, playbacks revealed that each of the primary alarm call types consistently elicited behaviorally appropriate responses among adults. These responses were taken as evidence that the calls provide at least some referential information about the predator encountered. To test for the relative contribution of emotional state to signal structure and function, playbacks of alarm calls varying in amplitude and duration were conducted. It is certainly reasonable to assume that animals who are more aroused would call at a higher amplitude, or call for a longer period of time. But because no independent measure of emotional state (e.g., changes in heart rate, measures of cortisol levels) was collected, it is not possible to verify the validity of the presumed acoustic correlates of arousal. 
Thus, to determine the relative contribution of emotional and referential components of the calling context, it is necessary to develop methodological tools that quantify the caller’s emotional state and the precise features of the referent. Methodological advances of direct relevance to the emotion-reference problem have recently emerged from studies of food-associated calls in domestic chickens (Evans and Marler, in press) and rhesus monkeys (Hauser and Marler, 1993a, b). I concentrate here on the rhesus call system. On the island of Cayo Santiago, Puerto Rico, rhesus monkeys who discover food often give one of five acoustically distinct calls. Three of these calls (‘warbles’, ‘harmonic arches’ and ‘chirps’) are restricted to the feeding context and are typically given to high-quality food items. The other two calls (‘coos’ and ‘grunts’) are given in food and non-food contexts (e.g., group movement, grooming); in the context of food, provisioned monkey chow is the most common stimulus eliciting call production. Natural observations reveal the following patterns of call production. Adult females call significantly more often in the context of food than do adult males, and females with large matrilines call more often than females with small matrilines. The different call types are produced throughout the day. As stated above, the call type produced appears to depend primarily upon the quality of the food encountered. Call rate is highest in the early morning, when animals are maximally hungry, and declines exponentially thereafter; on Cayo Santiago, feeding stops at approximately 1600 hrs and resumes at approximately 0700. Observational data therefore suggest that call structure covaries with food quality (i.e., the presumed referent) whereas call rate covaries with hunger level (i.e., the presumed emotional or motivational state). To assess more accurately the relative contribution of emotive and referential components
of the calling system, or as Owings would describe it, the conative and cognitive components, field experiments were conducted. The referent was controlled and quantified by presenting individuals with either chow or coconut in amounts of 15 pieces; relative to chow, coconut represents a higher-quality food item. The emotive component was assessed by testing animals in the early morning and in the late afternoon. The morning test session preceded the dispensation of chow and corresponded to the period of maximum hunger for monkeys on Cayo Santiago. The afternoon session started when most animals on the island had consumed approximately 85% of their daily food intake and, thus, corresponded to a period of relative satiation. Results showed that when animals called (i.e., approximately 50% of all trials), only warbles, harmonic arches and chirps were produced in response to coconut. Coos and grunts were given primarily to chow and occasionally to coconut. Moreover, and as observed under natural conditions, call rate was highest in the morning test session. Hauser and Marler (1993b) concluded that for rhesus monkeys on Cayo Santiago, call structure appears to provide information about food, and perhaps food quality more specifically. In contrast, call rate encodes information about the caller’s hunger level. The rhesus monkey data take the emotion-reference problem one small step further by providing an independent measure of arousal and a quantitative description of the referent. However, the precise meaning of the different calls remains unclear. This is where our methods fall short. Consider some of the available techniques for refining call meaning. One could, for example, describe in greater detail the mapping between variation in call structure and variation in the caller’s emotional state and in the eliciting stimulus. 
This would be informative because, at present, studies of functionally referential signals provide relatively coarse-grained analyses of the sources of acoustic variation (e.g., Marler et al., 1992). The problem with this approach is that it fails to address the perceptual salience of the acoustic variation. Therefore, detailed acoustic and behavioral analyses must be accompanied by perceptual experiments. Under field conditions, playback experiments are the answer, but we require an assay that is sufficiently sensitive to pick up on the salience of subtle acoustic variation. Single-stimulus presentation playbacks have often been used in studies of referential signalling, and the most common assay is the duration of time individuals spend looking toward the playback speaker (e.g., Gouzoules et al., 1984). A second approach, pioneered under field conditions by Cheney and Seyfarth, involves first habituating an animal to one stimulus and then presenting a second, acoustically distinct stimulus. If individuals perceive a difference between the first and second stimulus, dishabituation should follow the habituation series. The problem with both single-stimulus and habituation-dishabituation playbacks is that although they reveal whether the acoustic variation is perceptually discriminable, they tell us relatively less about what the call means to the animal, what it connotes, or what decisions it allows them to make after hearing the call or call sequence.

To further develop this commentary, consider once again the rhesus monkey food-associated call system. Assume for purposes of discussion that in a few years we will be able to account for a relatively high proportion of the sources of acoustic variation in this system and show that certain acoustic features map onto the type of food encountered, whereas other features are closely associated with food quantity, divisibility, and so on.
We would now be in an excellent position to present synthesized stimuli that vary precisely in the features that appear to provide critical information about food. Regardless of whether
we use the single-stimulus or habituation-dishabituation technique, the crucial problem is establishing an assay that will tell us about the perceptual salience (i.e., the discriminability) of the acoustic variation, in addition to its meaning. In preliminary playback experiments that I have conducted with rhesus food-associated calls, individuals typically respond by first looking toward the speaker and then, on some trials, moving toward the speaker. The variation in this assay lies in the duration of looking and in the speed with which individuals approach. Now if, for example, ‘harmonic arches’ are restricted to small amounts of coconut, how can the orienting or approaching response tell us that monkeys hear the harmonic arch as ‘small bits of coconut have been discovered’? I don’t believe these techniques can tell us the answer to this question. The vervet alarm call system may lend itself to a more direct approach to this question because in contrast to rhesus monkey food-related calls or screams (Gouzoules et al., 1984), each of the alarm calls elicits behaviorally distinct responses. Nonetheless, even here we are prevented from refining the meaning of the call beyond the designation of a particular type of predator or a particular type of escape response. Vervets may encode information about the potential threat of a predator (e.g., whether it appears to be hungry or satiated), its class, the number of individuals sighted, and their location in the environment. I cannot, however, think of a technique that will currently allow us to ask whether such apparent information is transparent to the monkeys.

In summary, although I disagree with Owings’ conclusion that a shift in theoretical focus can be accomplished in the absence of a methodological overhaul, this disagreement should not be viewed as a pessimistic comment on the future of research in non-human animal communication.
In fact, just the opposite conclusion is warranted: much research lies ahead for those interested in quantifying the relative contribution of emotional and referential components of signal production and perception. Because of the availability of sophisticated computer software, it is relatively easy to manipulate features of the eliciting stimulus (e.g., video playbacks: Evans and Marler, 1991) and to synthesize vocalizations to assess the importance of particular acoustic features (Beeman, 1992). Moreover, there are many available techniques for both quantifying and manipulating the animals’ state of arousal and these should be implemented in conjunction with manipulations of the putative referent. But, as in Quine’s treatment of the problem of referential opacity, it seems likely that studies of animal communication will be restricted to relatively coarse-grained analyses of call meaning.

2. Conative and cognitive

Cognitive processes are often contrasted with ‘biological’ or ‘emotional’ processes, as Owings does in his essay. I think this might be a misleading distinction. In humans, emotions can evoke mental processes, mental processes can evoke emotions, and mental processes can be about emotions, either experienced or expected. More to the point, there is a growing body of literature on the ‘theory of mind’ in developing children (reviewed in Astington et al., 1988) and in non-human animals (reviewed in Whiten, 1991) that focuses explicitly on whether, and how, individuals come to recognize and respond appropriately to emotional states in others. Thus, for example, what cognitive machinery does an individual require to understand why a child who has just fallen off a swing gets up as if nothing has happened and walks away, clearly restraining himself from crying?
As Harris and Gross (1988) have discussed, such situations require the ability to understand (i) that emotional expression is under voluntary control, (ii) that although the child looks or appears to be fine, he is
actually emotionally upset, and (iii) that if the child is a good actor, others should believe that the child is fine. Thus, theories of mind are intimately wrapped up in theories of emotions and, consequently, making a clean distinction between conative and cognitive may be misleading. In Owings’ essay, ‘conative’ and ‘cognitive’ are used to assess the structure and function of vervet alarm calls. Signals conveying referential information are seen as requiring greater cognitive sophistication. Perhaps, but signals that are functionally referential (sensu Marler et al., 1992) do not require high-level cognition. Such signals must simply pick out a set of objects or events in the environment and reliably predict their occurrence. This is why all of the studies of referential signalling consistently present data on the specificity of the contexts eliciting call production. Thus, Dittus (1984) reports that in toque macaques, 97% of all ‘food calls’ are given to rich food sources. Based on the specificity of the system, he argues that individuals out of sight from the caller are likely to recognize the referent because of the relatively tight correlation between context and call structure. Similarly, in Cheney and Seyfarth’s research, ‘leopard alarms’ are referential, in part, because such calls are primarily heard when leopards are encountered. They also clearly convey information about the caller’s emotional state, and Cheney and Seyfarth have repeatedly emphasized this point in their writings. In terms of specificity, however, animals can of course make mistakes (i.e., false alarm calls in the signal detection theory sense) and can, occasionally, use alarm calls in apparently deceptive ways. But neither the referential specificity of a signal, nor the ability of animals to use such signals dishonestly, necessarily implies complex underlying cognition. 
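The notion of specificity invoked above can be stated numerically. What follows is only a minimal sketch in Python, not an analysis drawn from any of the studies cited. It assumes that observation periods have been cross-classified by whether the putative referent was present and whether the call was given. The names (`CallTally`, `context_specificity`, `hit_rate`, `false_alarm_rate`) and all of the counts are hypothetical, chosen purely for illustration. Context specificity is taken here to be the proportion of calls given with the referent present, which is the kind of figure Dittus reports; the hit rate and false alarm rate are the usual signal detection quantities.

```python
from dataclasses import dataclass


@dataclass
class CallTally:
    """Hypothetical counts of observation periods, cross-classified by
    referent presence and call production (illustrative only)."""
    call_referent: int        # call given, referent present (hit)
    call_no_referent: int     # call given, referent absent (false alarm)
    silent_referent: int      # no call, referent present (miss)
    silent_no_referent: int   # no call, referent absent (correct rejection)


def _ratio(numerator: int, denominator: int) -> float:
    """Divide two counts, refusing an empty denominator."""
    if denominator == 0:
        raise ValueError("no observations in this category")
    return numerator / denominator


def context_specificity(t: CallTally) -> float:
    """Proportion of all calls that were given with the referent present."""
    return _ratio(t.call_referent, t.call_referent + t.call_no_referent)


def hit_rate(t: CallTally) -> float:
    """Proportion of referent-present periods in which the call was given."""
    return _ratio(t.call_referent, t.call_referent + t.silent_referent)


def false_alarm_rate(t: CallTally) -> float:
    """Proportion of referent-absent periods in which the call was given."""
    return _ratio(t.call_no_referent,
                  t.call_no_referent + t.silent_no_referent)


# Invented counts, used only to show how the three measures differ.
example = CallTally(call_referent=45, call_no_referent=5,
                    silent_referent=15, silent_no_referent=435)
print(f"context specificity: {context_specificity(example):.3f}")
print(f"hit rate:            {hit_rate(example):.3f}")
print(f"false alarm rate:    {false_alarm_rate(example):.3f}")
```

A high specificity together with a low false alarm rate describes the reliability of the mapping between call and context, and nothing more; neither number bears on the cognitive processes that produce the call.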
Munn’s (1986) work on mixed-species flocks of birds in Peru shows that avian alarm calls are sometimes used deceptively (i.e., calls are given in the absence of a predator) in order to gain a competitive advantage during contests over insect prey. Munn does not invoke a cognitive argument and doesn’t need one. Straightforward cost-benefit thinking is sufficient to explain the phenomenon at hand: the costs of not responding to an alarm call are far greater than the benefits obtained from outcompeting a flock member over an insect.

In conclusion, referential signals may represent important adaptations for social animals, especially those living in habitats where signaller and perceiver are often out of sight from each other. This is because referential signals provide information about biologically salient events that listeners can evaluate in the absence of directly observing the signaller or the call-eliciting stimulus. However, evidence of referentiality is not, in and of itself, a passport into the potentially complex minds of signaller and perceiver. This is not to deny the possibility of complex minds in non-human animals (sensu Griffin, 1992). But the passport into the mind will come from designing and implementing experiments that assess how animals use referential signals to manipulate their socioecological world. Cheney and Seyfarth have given us our first visa into the minds of a foreign relative. And Owings has added the important point that we consider how emotions both guide and influence the cognitive processes of such relatives.

Acknowledgements: For the opportunity to learn about the minds of another species, and their emotions, I would like to thank Dorothy Cheney and Robert Seyfarth. They have certainly shaped my own mind in important ways. For detailed comments on the manuscript, I would like to thank Don Griffin.
REFERENCES

ASTINGTON, J. W., HARRIS, P. L. and OLSON, D. R. 1988 Developing Theories of Mind. Cambridge University Press, Cambridge.

BEEMAN, K. 1992 SIGNAL User’s Guide. Engineering Design, Belmont, Massachusetts.

BLAFFER HRDY, S. 1981 The Woman That Never Evolved. Harvard University Press, Cambridge, Massachusetts.

CHENEY, D. L. and SEYFARTH, R. M. 1990 How Monkeys See the World. Chicago University Press, Chicago.

DITTUS, W. 1984 Toque macaque food calls: semantic communication concerning food distribution in the environment. Animal Behaviour 32, 470-477.

EVANS, C. S. and MARLER, P. 1991 On the use of video images as social stimuli in birds: audience effects on alarm calling. Animal Behaviour 41, 17-26.

EVANS, C. S. and MARLER, P. in press Food-calling by male chickens (Gallus gallus) and its relationship to food availability, courtship and motivational state. Animal Behaviour.

GOUZOULES, S., GOUZOULES, H. and MARLER, P. 1984 Rhesus monkey (Macaca mulatta) screams: representational signalling in the recruitment of agonistic aid. Animal Behaviour 32, 182-193.

GRIFFIN, D. 1992 Animal Mind. Harvard University Press, Cambridge, Massachusetts.

HARRIS, P. L. and GROSS, D. 1988 Children’s understanding of real and apparent emotion. In Astington, J. W., Harris, P. L. and Olson, D. R. (Eds), Developing Theories of Mind, pp. 295-314. Cambridge University Press, Cambridge.

HAUSER, M. D. and MARLER, P. 1993a Food-associated calls in rhesus macaques (Macaca mulatta). I. Socioecological factors influencing call production. Behavioral Ecology 4, 194-205.

HAUSER, M. D. and MARLER, P. 1993b Food-associated calls in rhesus macaques (Macaca mulatta). II. Costs and benefits of call production and suppression. Behavioral Ecology 4, 206-212.

MARLER, P., EVANS, C. S. and HAUSER, M. D. 1992 Animal signals: reference, motivation or both? In Papoušek, H., Jürgens, U. and Papoušek, M. (Eds), Nonverbal Vocal Communication: Comparative and Developmental Approaches, pp. 66-86. Cambridge University Press, Cambridge.

MCGREGOR, P. 1992 Playback and Studies of Animal Communication: Problems and Prospects. Plenum Press, New York.

MUNN, C. 1986 Birds that cry ‘wolf’. Nature 319, 143-145.

OWINGS, D. 1994 How monkeys feel about the world: a review of How Monkeys See the World. Language and Communication 14, 15-30.

WHITEN, A. 1991 Natural Theories of Mind. Blackwell, Oxford.