When it's good to signal badness: using objective measures of discriminability to test the value of being distinctive

When it's good to signal badness: using objective measures of discriminability to test the value of being distinctive

Animal Behaviour 129 (2017) 113e125 Contents lists available at ScienceDirect Animal Behaviour journal homepage: www.elsevier.com/locate/anbehav Wh...

1MB Sizes 0 Downloads 2 Views

Animal Behaviour 129 (2017) 113e125

Contents lists available at ScienceDirect

Animal Behaviour journal homepage: www.elsevier.com/locate/anbehav

When it's good to signal badness: using objective measures of discriminability to test the value of being distinctive Timothy J. Polnaszek*, Tricia L. Rubi 1, David W. Stephens Department of Ecology, Evolution and Behavior, University of Minnesota, St Paul, MN, U.S.A.

a r t i c l e i n f o Article history: Received 19 September 2016 Initial acceptance 1 November 2016 Final acceptance 19 April 2017 MS. number: A16-00826R Keywords: aposematism decision making discrimination distinctiveness predation signalling

The hypothesis that prey organisms can reduce the risk of predation by overtly signalling their unprofitability, or aposematism, has a long history in behavioural and evolutionary biology. To fully understand this longstanding idea, we need to measure and manipulate traits of aposematic prey, such as their distinctiveness from other prey, from the perspective of the potential predator. Specifically, we need measurements that are not anthropomorphic and that are based on the principles of discrimination developed by psychophysicists. This paper utilizes an experimentally tractable measure of discriminability based on signal detection theory as originally studied by psychophysicists. In addition, we develop and experimentally test a model to characterize the predator avoidance advantages derived from being distinct from other prey. By experimentally varying discriminability (and thus distinctiveness) we find that increased discriminability does confer a predator avoidance advantage, but the extent of this effect depends on the unprofitability of prey and the relative frequency of unprofitable prey. © 2017 The Association for the Study of Animal Behaviour. Published by Elsevier Ltd. All rights reserved.

As any student of introductory biology knows, many unpalatable, dangerous or unprofitable prey animals produce conspicuous signals that appear to warn potential predators to stay away. The hypothesis of aposematism has an unimpeachable pedigree in evolutionary biology. Historians trace its origins to an exchange between Alfred Russel Wallace and Charles Darwin (see Ruxton, Sherratt, & Speed, 2004, for a historical review). In the time since Wallace and Darwin, the problem of aposematic signalling has continued to attract the attention of behavioural and evolutionary biologists (e.g. Arenas, Walter, & Stevens, 2015; Barnett, Bateson, & Rowe, 2014; Barnett, Scott-Samuel, & Cuthill, 2016; Guilford, 1988; Leimar, Enquist, & Sillen-Tullberg, 1986; Mappes, Marples, & Endler, 2005; Speed & Ruxton, 2005). These modern students of aposematism have focused, for example, on the problem of how predators learn to avoid unprofitable prey (e.g. Gittleman, Harvey, & Greenwood, 1980; Roper & Wistow, 1986), or on the question of how prey gregariousness influences the evolution of aposematism (e.g. Gamberale & Tullberg, 1998; Rowland, Ruxton, & Skelhorn, 2013).

* Correspondence and present address: T. J. Polnaszek, Department of Ecology and Evolutionary Biology, University of Arizona, 1041 E. Lowell Street, BioSciences West #310, Tucson, AZ 85761, U.S.A. E-mail address: [email protected] (T. J. Polnaszek). 1 T. L. Rubi is now at the Department of Psychology, University of Michigan, 530 Church Street, Ann Arbor, MI 48109, U.S.A.

A primary goal of aposematism research is identifying the benefits prey obtain by conspicuously signalling to predators, benefits such as improved predator learning or memory (reviewed in Ruxton et al., 2004). Such benefits are necessary to counterbalance the potential costs of alerting predators to one's presence. These benefits can generally be divided into two categories: those benefits derived from conspicuously contrasting with the back€ m, ground (e.g. Gittleman & Harvey, 1980; Ham, Ihalainen, Lindstro & Mappes, 2006; Roper & Redston, 1987) or those derived from contrasting with other prey items, termed ‘distinctiveness’ (e.g. Merilaita & Ruxton, 2007). We focus on the value of distinctiveness, which has traditionally received less experimental attention as an explanation for aposematism than the value of conspicuously contrasting with background (but see Merilaita & Ruxton, 2007; Sherratt, 2002; Sherratt & Beatty, 2003; Sherratt & Franks, 2005). The discriminability between prey types represents an important problem for predators attempting to simultaneously find palatable prey and avoid aposematic prey, and these predators are presumably the major selective force in aposematic systems. When biologists recognize an instance of aposematic signalling, they nearly always do so because the prey animal in question is conspicuously distinct from other prey from a human perspective. This implicit anthropocentrism is problematic because in order to truly understand aposematic signalling systems, we need to understand distinctiveness from the perspective of the intended receivers. Empirical studies of aposematism, however, typically do

http://dx.doi.org/10.1016/j.anbehav.2017.05.009 0003-3472/© 2017 The Association for the Study of Animal Behaviour. Published by Elsevier Ltd. All rights reserved.

114

T. J. Polnaszek et al. / Animal Behaviour 129 (2017) 113e125

not measure distinctiveness or discriminability based on the perception of predators. Moreover, several studies that have measured distinctiveness of aposematic prey and their mimics, from the predator's perspective, indicate that the predator's perception of prey does not match perfectly with our own (Dittrich, Gilbert, Green, Mcgregor, & Grewcock, 1993; Green et al., 1999). Even if we were to accept that human sensory abilities map crudely to the abilities of many predators, we still have the problem that ‘distinctiveness’ is simplest to describe categorically (i.e. prey types are distinctive or not), and this results in an imprecise description of predator recognition. For example, to ask how distinctive a prey item must be to obtain the advantages of aposematism, we need to manipulate the degree of discriminability, and to do this we need to measure it quantitatively. This paper presents a laboratory simulation of aposematic signalling that uses a quantitative, nonanthropocentric measure of discriminability derived from the theory of signal detection (Green & Swets, 1966). There exists a wide range of literature in psychophysics and experimental psychology that models and tests nonhuman animal discrimination using signal detection theory (reviewed in Alsop, 1998; Blough, 2001). We extend these approaches by structuring the consequences of receiver actions (correct accepts, false alarms, etc.) after the ecological problem of foraging in a system with unprofitable prey. Our goal is not to measure or model discrimination capability, per se, but to identify the protective effects of predator discrimination (i.e. the protection experienced by prey) given a specific arrangement of costs and benefits. First, we develop an experimentally tractable model of aposematic signalling, and briefly review the basic ideas of signal detection theory. A BASIC APOSEMATISM MODEL Consider a predator encountering a prey item. The prey item may be good, yielding a value (or net benefit to the predator) of vgood. Or the prey may be bad, yielding value vbad. Upon encounter, the predator can choose to attack or ignore the prey item. If it ignores the prey item, we assume it pursues an alternative activity (e.g. looking for more prey elsewhere) that yields value valt. We assume that vgood > valt > vbad. Now, to reduce the number of variables, we rescale the three v s such that vgood ¼ 1, valt ¼ 0 and vbad ¼ b such that b represents the relative unprofitability (or ‘badness’) of the bad prey type (i.e. the units are now based on the difference between vgood and valt). In addition, we let p be the proportion of prey in the environment that are bad. Finally, we assume that the prey signals its type via an externally detectable pattern. A signal indicating the good type is denoted by Sþ, while a signal indicating the bad type is denoted by S. A ‘signal follower’ attacks the prey if it observes Sþ, but it adopts the alternative activity if it observes S. The expected gains due to signal following are therefore Yfollow ¼ 1  p per encounter; the signal follower obtains one unit when it observes pattern Sþ and attacks a good prey item (which occurs on a portion 1  p of encounters), and it obtains 0 units when it encounters pattern S because it exploits the alternative resource gaining 0 units when it observes S. Assuming the predator can discriminate perfectly between the Sþ and S states of the signal, the payoff for following the signal is Yfollow ¼ (1  p). While this assumption of perfect discrimination (i.e. prey types are perfectly distinct) may be reasonable for some forms of aposematic signalling, ultimately it represents a problem. To fully explore the hypothesis that aposematic signalling is valuable to unprofitable signallers because it distinguishes them from other prey, we need to systematically vary discriminability. Research on gradations of discriminability (such as studies on imperfect mimics and aposematic prey) often use qualitative categories of discriminability

(e.g. good versus poor replicas: Caley & Schluter, 2003; Schmidt, 1958; or a graded continuum of an arbitrary trait: Duncan & Sheppard, 1965). Another study goes a step further, manipulating discriminability on a finer scale (McGuire, Van Gossum, Beirinckx, & Sherratt, 2006), but does not quantify discriminability based on measures of perception. To address the problem of manipulating discriminability, and thus vary distinctiveness from the profitable prey, we use ideas from the psychophysical framework of signal detection theory. Although several recent papers cover signal detection, we briefly review the key points below. Signal Detection Theory Signal detection theory came to prominence in psychophysics in the 1960s and 1970s (Egan, 1975; Green & Swets, 1966; Swets, 1996). More recently, several behavioural ecologists have recognized the significance of signal detection theory for behavioural ecology (e.g. Getty, Kamil, & Real, 1987; Lynn, Cnaani, & Papaj, 2005; McGuire et al., 2006; Stephens, 2007; Wiley, 1994, 2013). We illustrate the basic ideas with our system of profitable and unprofitable prey in mind. Say that a predator encounters a profitable prey item and observes a signal of the good type (Sþ). Noise causes stochastic variation in the perceived signal magnitude, and as a result, its sensory apparatus draws a sample from a normal distribution of possible signal magnitudes. If, instead, the prey item is unprofitable and the predator observes a signal of the bad type (S), the predator draws a sample from a different normal distribution (see Fig. 1 for a graphical illustration). The predator must decide which of the distributions its observation came from and act accordingly. These two distributions, specifically the distance separating them and their variances, tell us whether the predator faces an easy discrimination problem (little overlap) or a hard discrimination problem (high overlap). Although it is somewhat ‘nonstandard’ (for signal detection theory), we assume that the distribution for the bad prey is shifted to the right of the distribution for the good prey. The theory assumes that the animal ‘discriminates’ by setting a threshold x) such that the animal accepts the encountered item if it observes a signal intensity less than x) and rejects it otherwise. Traditionally, students of signal detection use the variable d0 to describe the difficulty of the discrimination problem. The parameter d0 measures the separation between the two signal distributions in standard deviation units (assuming distributions with equal variance), which provides a dimensionless measure of how difficult the discrimination problem is for the animal. If d0 is small (say, under 0.1), the animal faces a very difficult discrimination problem, and if d0 is large (say, 3.0), the animal faces a very easy discrimination problem. We can estimate d0 from the frequency of attack when we present Sþ and compare this to the frequency of attack when we present S. Mathematically, we calculate d0 for an experienced subject as

d0 ¼ Z½1  PðAttackjSÞ  Z½1  PðAttackjSþÞ

(1)

where Z is the inverse function of the standard normal distribution (Gescheider, 1997; but note our S distribution is shifted right). This equation quantifies the difficulty of the discrimination problem (and the separation of the two stimulus distributions). We remark that alternative estimates of discriminability are possible, in particular the use of ROC curves, and these alternatives can provide a more complete description of the discrimination process (Macmillan & Creelman, 2005). Our goal, however, is to generalize about the effects of ‘discriminability’, and its interactions with payoffs and prey prevalence, on predator decisions and so we use the simpler d0 technique because of its simplicity and broad applicability. Note our use of d0 assumes that the signal and noise

T. J. Polnaszek et al. / Animal Behaviour 129 (2017) 113e125

d’

115

Aposematic prey (S-)

High-quality prey (S+)

P(correct rejection)

P(incorrect rejection)

x*

Stimulus value, x

Figure 1. Signal detection theory, as applied to our system. The predator draws a sample from the left-hand normal curve when it observes a high-quality prey (Sþ), and draws a sample from the right-hand curve when it observes a low-quality prey (S). The perceived value relative to the discrimination threshold (x)) determines the predator's response. The amount of overlap between the curves indicates the likelihood of errors, and thus the difficulty of the discrimination problem (quantified by d0 , equivalent to the distance between the means of the distributions; see Macmillan & Creelman, 2005).

distributions are normal and have equal variance (see Supplementary Material 1 for more on these assumptions, and for example curves generated by the current experiment). Signal detection theory allows us to calculate the optimal attack threshold from our three parameters: (1) the benefit associated with the foraging outcomes (vgood, valt and vbad), recall that vbad ¼ b and represents the relative unprofitability of the bad prey); (2) the relative frequency of the bad prey type (p); and (3) the dimensionless measure of the difficulty of discriminating between Sþ and S (d0 ). This value is optimal in the sense that it maximizes expected gains per encounter. For our parameters, the optimal threshold is

x* ¼

  d0 1 1p þ 0 ln pb 2 d

(2)

(see Supplementary Material 2 for derivation). From equation (2), one can calculate the predicted frequencies of ‘false alarms’ (defined as mistakenly attacking the bad prey) and ‘hits’ (defined as correctly attacking the good prey). We can calculate these values because (by assumption) the two distributions are standard normal distributions with the good prey distribution centred at zero and bad prey distribution centred at d0 . The probability of a correct attack is therefore F(x*) and the probability of a false alarm is F(x*  d0 ), where F represents the cumulative distribution function. We are especially interested in the frequency of false alarms (P(False Alarm)) because it measures the risk experienced by our hypothetical aposematic prey item. The frequency of false alarms is also important in the context of mimicry, especially Batesian mimicry, because false alarms indicate predator attacks on the ‘model’ species. Unprofitable model species are not afforded protection from predators if these rates are high.

and we should see a lower P(False Alarm) when the bad type is common (high p) because the predator should be biased to ignore most prey. However, when the discrimination task is easy (high d0 ), P(False Alarm) is low for all levels of p because rare variants are easier to identify. Predator discrimination between good and bad types (d0 ) P(False Alarm) generally decreases as d0 increases. However, P(False Alarm) may increase with d0 at low d0 , especially when bad prey are common and the unprofitability value b is large (Holen & Johnstone, 2006). This result is driven by a switch in predator behaviour from ignoring all prey to attempting to discriminate between prey types and making some errors in the process (see Oaten, Pearce, & Smyth, 1975). Relative unprofitability (badness) of the bad type (b) Increasing the badness parameter (b) decreases P(False Alarm). High unprofitability creates more extreme penalties for incorrect €m, attack decisions, reducing predation (Alcock, 1970; Lindstro Alatalo, & Mappes, 1997). This reduction in predation on the unprofitable prey is largest at low d0 values because as the predator approaches perfect discrimination the false alarm rate approaches zero. Relative frequency of the bad type (p) Increasing the relative frequency of the bad type (p) decreases (P(False Alarm)). The base rate of possible states (here prey types) is often a key parameter in models of signalling and signal following (e.g. McLinn & Stephens, 2006; Polnaszek & Stephens, 2014), and in € m, Alatalo, studies of aposematism and mimicry (e.g. Lindstro Lyytinen, & Mappes, 2001; Speed & Turner, 1999). Overview of the Experiments

Predictions of Our Model Fig. 2 illustrates the following qualitative predictions. Interaction between the relative frequency of the bad type (p) and predator discriminability (d0 ) The model predicts a fairly strong p by d0 interaction. Generally, we should see a higher P(False Alarm) when the bad type is rare (low p) because the predator should be biased to attack most prey,

The studies reported here explore this model experimentally. We use captive blue jays, Cyanocitta cristata, as experimental ‘predators’ in operant-style discrimination tasks. We proceed in two stages. Experiment 1 validates a new technique (noise addition) that allows us to manipulate and measure the discriminability of stimulus pairs. Experiment 2 uses the d0 values measured in experiment 1 to test our signal detection-based model of aposematic signalling. Experiment 2 factorially manipulates the three

116

T. J. Polnaszek et al. / Animal Behaviour 129 (2017) 113e125

(a)

P(False Alarm)

(b)

b=1

1 0.9

0.9

0.8

0.8

0.7

0.7

0.6

0.6

0.5

0.5

0.4

0.4

0.3

0.3

0.2

0.2

0.1

0.1

0 0

0.5

1

1.5

b=2

1

2

2.5

3

p=0.1 p=0.5 p=0.9

0 0

0.5

1

1.5

2

2.5

3

Discriminability (d’) Figure 2. The relationship between the attack rate on aposematic prey (P(False Alarm) and discriminability (d0 ) for a wide range of parameter values. (a) Predictions for a relatively good bad prey item (b ¼ 1). (b) Predictions for a worse bad prey item (b ¼ 2). The three curves in each panel show the predicted relationships for different frequencies of the bad prey type (p ¼ 0.1, the highest curve; p ¼ 0.5, the intermediate curve; p ¼ 0.9, the lowest curve). Notice that when discrimination is impossible (d' ¼ 0), the model typically predicts attack probabilities of 0 or 1 as one might expect. The intermediate attack probabilities shown for one case in (a) are unstable or ‘knife-edged’ in the sense that any deviation from this specific parameter set will drive the prediction to 0 or 1 attack probabilities. Importantly, attack rate does not always decrease with increasing distinctiveness (see Holen & Johnstone, 2006).

key variables in our model: the relative frequency of the bad type (p), the relative unprofitability (badness) of the bad type (b), and the ease with which predators can discriminate between the good and bad types (d0 ). GENERAL METHODS All housing and experimental procedures were approved by the University of Minnesota Institutional Animal Care and Use Committee (IACUC protocol number 1109A04421). Subjects stayed in captive care after this study, and are currently maintained on IACUC approval number 1408-31752A. Collection of subjects was completed under U.S. Fish and Wildlife Service permit number MB73302-1. Apparatus and Housing During training and the experiment, subjects were housed for 23 h per day in individual Skinner boxes (Fig. 3) and maintained on a 12:12 h light:dark cycle. Water was provided ad libitum. Each box contained a rear perch with a microswitch attached to report the bird's presence or absence to the central computer controlling the experiment. A long perch at the front allowed access to the pecking keys and the food cup. Two pecking keys were arranged side by side at the front of each box: a standard pigeon key (simple push button obtained from Med Associates, Inc., Burlington, VT, U.S.A.) and a ‘large key’, a rectangle of transparent Plexiglas wired to detect pecking responses. The stimulus image was presented on a small monitor directly behind the large transparent key, so that pecking the image itself registered a response. A computer program written in the MED-PC language (Med Associates, Inc.) recorded behaviour and controlled the stimuli and food dispenser. Daily Sessions Experimental sessions began with the start of the light cycle at 0700 hours and ran until 1500 hours. At 1500 hours, the birds were

weighed and moved to an adjacent room for 1 h for cleaning and data collection, then returned to the experimental boxes. In general, the experiments operated in a ‘closed economy’, that is, subjects received all food from the experimental apparatus for the duration of the experiment. Occasionally, it was necessary to alter this economy to ensure that the birds maintained a healthy weight. To avoid overfeeding, the experimental day ended early if a bird had received over 12 g of food before 1500 hours. Similarly, if a bird's weight dropped an unhealthy amount, additional food was provided in the experimental box at the end of the day. EXPERIMENT 1: MANIPULATING PREY DISCRIMINABILITY This experiment validates a technique for manipulating the discriminability parameter d0 . The experiment presents subjects with pairs of images and attempts to manipulate the difficulty of discriminating between the two images in a pair. The basic idea is relatively simple. To create a difficult-to-discriminate pair, we take a single arbitrary image A and alter it slightly to create image A0 . If we make sufficiently small alterations, then we will have a created a pair of images A and A0 that are difficult to discriminate. To create an easy-to-discriminate pair, we make larger alterations to image A to create image A0 . Although one might alter images in several ways, the theory of signal detection itself suggests a simple noise addition mechanism. Computationally, our image A is an ordered collection of the colour attributes (i.e. RGB colour values) assigned to each pixel of the image. To create A0 we simply add a random number to each colour attribute for each pixel. This random number represents added noise, and in our case it comes from a normal distribution with mean zero (meaning that our technique will increase or decrease a pixel's colour value equally often). We vary the degree of change by controlling the variance of the noise distribution, even though the distribution always has mean zero. Noise added from a high variance distribution will make large but random changes to the image and should create a pair of images with a high discriminability (high d0 value), meaning that the subject can discriminate between

T. J. Polnaszek et al. / Animal Behaviour 129 (2017) 113e125

(a)

(b)

Food cup

Image-projecting monitor / pecking response mechanism

117

Pigeon key Pigeon key

Image stimulus

Maganize light Front perch

Image-projecting monitor / pecking response system Rear perch Food cup Water dish

Figure 3. Skinner box used in experiments 1 and 2. (a) Overhead view of the apparatus. (b) Front panel, with its pecking response keys, from the perspective of the subject.

them easily. Noise added from a low variance distribution will make small changes to the images and should produce a pair of images with a low d0 . Fig. 4 shows an example of an image pair with low variance in the noise added and one with high variance (see Supplementary Material 3 for all image pairs used). Experiment 1 followed a straightforward plan. We systematically varied noise added in creating images pairs, then measured the resulting discriminability (d0 ) in a simple discrimination task. The goal was to characterize the relationship between added noise and d0 , and to therefore identify pairs of stimuli with a range of discriminability values for use in experiment 2.

created pairs of images such that each pair consisted of an original and a manipulated (‘noise-added’) image. We used MatLab (Mathworks, Natick, MA, U.S.A.) to add Gaussian noise to each pixel, with differing levels of variance. High values of variance result in more pixelated, ‘fuzzier’ images. For each of the three original images in the three difficulty categories, we added three levels of variance to give a range of discriminability at each difficulty level (Table 1). This resulted in 27 image pairs for our subjects to discriminate (i.e. 27 ‘treatments’). For each subject in each treatment, we randomly assigned one image type (original or noiseadded) to be the Sþ, and the other to be the S.

Methods

Treatments and treatment order The goal here was to test a large number of image pairings across a wide range of variance in the noise added to establish our ability to manipulate the discriminability of images. As described above, we created 27 pairs of images and tested each subject on all of the image pairs. The image pairs were tested in a random order, with the restriction that each subject was tested on a set of easy, medium and difficult image pairs (randomly ordered) before moving on to a new set. Once a subject completed 600 trials, the subject moved on to the next treatment on the following day.

Subjects The subjects were nine blue jays of unknown sex and mixed experimental histories (band numbers 7, 13, 20, 95, 207, 256, 320, 342 and 345). All subjects had previous experience in similar operant-style foraging experiments and were experienced using pecking keys to register decisions. Generating stimuli We picked nine images of organisms pseudorandomly from the internet, with the restriction that images were matched for aspect ratio and pixel size. We randomly assigned three images to each of three difficulty categories: easy, intermediate and difficult. We then

Organization of trials Within a treatment, trials were organized into blocks of 24. Each block consisted of four forced trials and 20 free trials. In the forced

118

T. J. Polnaszek et al. / Animal Behaviour 129 (2017) 113e125

Figure 4. High variance in the noise added generated image pairs that were easy to discriminate (top row), and low variance generated image pairs that were difficult to discriminate (bottom row). Experiment 1 allowed us to quantify the ease of discrimination of these pairs, from our subjects' perspective. For example, the measured discriminability parameter (d0 ) for these example image pairs is d0 ¼ 2.0 for the easy image pair and d0 ¼ 0.06 for the difficult pair.

Table 1 We randomly assigned three images to each difficulty level (row), manipulated each image by adding Gaussian noise to each pixel (columns) and paired the manipulated images with the originals to create sets of images with different degrees of similarity (and presumably different discriminabilities from our subjects' perspective; experiment 1 tested this idea) Image

Variance Level 1

Level 2

Level 3

Easy Medium Hard

0.8 0.08 0.008

0.5 0.05 0.005

0.2 0.02 0.002

trials, only one pecking key was active, forcing the subject to choose a particular action for the stimulus shown on that trial. Forced trials consisted of an attack S, reject S, attack Sþ and reject Sþ, ensuring that the subjects remained familiar with each contingency throughout the experiment. In each block of free trials there were 10 S and 10 Sþ stimulus presentations. The order of stimulus presentation was randomly determined for each block. Within-trial events After an intertrial interval (ITI) of 60 s, and when the bird was at the rear of the box, the apparatus simultaneously illuminated the large key (the stimulus) and the small key. Responses were registered via key pecking: a peck to the large key indicated an attack of the presented prey stimulus, and a peck to the small key indicated a rejection of the presented prey stimulus. After any response was registered, both keys immediately extinguished. Table 2 outlines

Table 2 Reward earned by subjects for each action  stimulus combination in experiment 1

‘Attack’ key ‘Reject’ key

Sþ stimulus

S stimulus

2 food pellets 0 food pellets

0 food pellets 2 food pellets

the payoff structure for the experiment. Correct attacks and rejects were each rewarded with two pellets. Incorrect attacks and rejects yielded zero pellets. The symmetrical payoff structure, along with the equal occurrence of the Sþ and S signals, should help reduce subject biases and aid our estimation of d0 . That is, under these experimental conditions subjects should equally value a correct accept of the Sþ and a correct reject of the S. Reducing bias is also important because we aim to compare d0 values and ultimately the effects of discriminability on predator behaviour. Such comparisons are more likely valid when bias is minimized, or at least equivalent across stimulus sets (Macmillan & Creelman, 2005). A magazine light positioned directly over the food cup flashed with delivery of each pellet. After the food delivery, the ITI clock restarted. If no response was registered within 15 min of stimulus presentation, the trial was aborted and the ITI clock restarted. Data analysis Trials 400e600 were used in the analysis to measure peak performance on each discrimination task. For each trial, the central computer documented stimulus type (Sþ or S) and subject action (attack or reject), allowing us to construct a contingency table for each treatment. Importantly, recording the probability of hits

T. J. Polnaszek et al. / Animal Behaviour 129 (2017) 113e125

(attacking the Sþ) and false alarms (attacking the S) allowed us to assign an empirically measured d0 value, from the subject's perspective, to each image pair. Associations with the ‘fuzziness’ quality of an image may carry over from one treatment to the next (e.g. a subject predisposed to designate a ‘clear’ image as the Sþ based on previous treatment), which may reduce performance and cause us to underestimate d0 values. However, evaluating subject performance only after 400 trials with the current image set should help counteract any such effects. RESULTS In broad overview, experiment 1 demonstrated that our noise addition technique can manipulate the discriminability of a pair of images in an orderly way. Small amounts of added noise do lead to small measured d0 values and hence to difficult discrimination problems, while large amounts of added noise lead to large d0 values and easy discrimination problems. Fig. 5 plots the logarithm of added noise versus measured d0 . As the figure shows, regression analysis detected a significant nonzero slope (b ¼ 0.795, t25 ¼ 8.473, P < 0.00001). Overview Although we are cautious about making the claim that this relationship is necessarily linear, the key point is that noise addition allows us to manipulate the ease with which well-trained subjects can make a visual discrimination along a continuous scale. This is important for experiment 2, because to test the hypothesis underlying aposematism, that easy discrimination facilitates predator avoidance, we need to be able to create both easy and difficult discrimination problems in a nonanthropocentric way. EXPERIMENT 2: TESTING THE MODEL Experiment 2 used the results of experiment 1 to test the model of aposematic signalling presented in the Introduction. Specifically,

2.5

Discriminability (d')

2

119

experiment 2 assessed the value of increased discriminability for an unprofitable prey item by manipulating d0 and observing the effect of this manipulation on the attack rate experienced by unprofitable prey (i.e. the proportion of false alarms, or P(False Alarm)). Importantly, the experiment utilized virtual prey items with discriminability values based on predator perception (as measured in experiment 1). Fig. 1 shows the predicted relationship between discriminability (d0 ) and P(False Alarm) according to our model. As a reminder, the model predicts that increased discriminability generally, but not always, decreases predation on the unprofitable prey. Predation may increase with increased discriminability when unprofitable prey are prevalent. The model also predicts that the protective value of discriminability depends upon the relative frequency of the unprofitable item (p). Unprofitable prey are also predicted to benefit more from an increase in discriminability when they are uncommon, suggesting distinctiveness is most important for rare aposematic prey. In broad overview, this experiment measured the benefit of being distinctive from profitable prey, as affected by (1) the level of discriminability (d0 ), (2) the environmental uncertainty or prevalence of the bad prey type (p) and (3) the unprofitability of the bad prey type (b). The experiment followed a 3  3  2 factorial design with three levels of d0 , three levels of p and two levels of b. Methods Subjects Nine blue jays of unknown gender and mixed experimental histories (band numbers 7, 13, 20, 95, 207, 256, 320, 342, 345) served as subjects. Experiment 2 utilized the same Skinner boxes and equipment as experiment 1. All of these birds also completed experiment 1, and thus had experience with this specific experimental set-up. Choosing pairs of images We chose six of the 27 image pairs tested in experiment 1, grouped into three discriminability categories (easy, intermediate, difficult), with two images in each category. To achieve a wide range of discriminability in this second experiment, we used image pairs near the maximum of our measured d0 values (d0 near 2), nearly indiscriminable image pairs (d0 near 0) and image pairs with measured d0 between these two endpoints (d0 near 1). Specifically, the easy images we used had measured d0 values of 2.00 and 1.91 (average 1.96), the intermediate images measured 1.18 and 1.09 (average 1.14) and the difficult images measured 0.06 and 0.00 (average 0.03).

1.5

1

0.5

0

–0.5 –2.5

–2 –1.5 –1 Log(variance added)

–0.5

0

Figure 5. Mean discriminability values (d0 ) for all 27 image pairs, with the logarithm of the variance added to the ‘fuzzy’ image on the X axis (values closer to 0 have higher variance added, resulting in an easier discrimination problem).

Treatments and treatment order The experiment followed a 3  3  2 factorial design (18 parameter combinations) in which we tested three levels of d0 (0.03 ¼ very difficult discrimination; 1.14 ¼ intermediate difficulty; 1.96 ¼ easy discrimination), three levels of p (0.1 ¼ bad type rare; 0.5 ¼ good and bad equally common; 0.9 ¼ bad type common) and two levels of b (1 ¼ moderately bad; 2 ¼ very bad). Table 3 outlines the 3  3  2 experimental design. We also tested two image pairs in each discriminability category (easy, intermediate, difficult) to better allow an estimation of the effects of that category, rather than effects of a particular image (see Supplementary Material 3 for images used). Since we tested two images per discriminability category, this created a total of 36 separate treatments. One image pair from each discriminability category was randomly assigned to the first or second half of the experiment. Each subject was then tested on all parameter combinations listed in Table 3, in a random order, using its assigned ‘first half’ image for each discriminability category. In the second

120

T. J. Polnaszek et al. / Animal Behaviour 129 (2017) 113e125

Table 3 Treatment conditions for experiment 2, where each subject experienced two image pairs at each discriminability category Discriminability category

Badness (b)

Proportion of bad prey (p)

Easy (d0 ¼1.96)

1

0.10 0.50 0.90 0.10 0.50 0.90 0.10 0.50 0.90 0.10 0.50 0.90 0.10 0.50 0.90 0.10 0.50 0.90

2

Intermediate (d0 ¼1.14)

1

2

Hard (d0 ¼0.03)

1

2

Response measures (e.g. P(False Alarm)) were averaged within subjects across the two image pairs within a discriminability category.

half of the experiment, treatment order was rerandomized and this process was repeated using the remaining images. In other words, subjects were tested twice on each combination of parameters in Table 3. As in experiment 1, each treatment lasted for 600 trials, after which the subject moved on to the next treatment on the following day. In the analysis, response measures were averaged across the two redundant treatments within a discriminability category. There was one exception: one subject had to be removed halfway through the experiment for health reasons. For this subject, responses were instead taken from the single set of responses generated in the first half of the experiment (one image pair in each discriminability category). Data were analysed using a within-subjects design, and we also confirmed that the data met the assumptions of the repeated measure ANOVA analysis. Organization of trials Trials were organized into blocks of 24. Each block consisted of four forced trials and 20 free trials. Each set of four forced trials contained a forced attack S, reject S, attack Sþ and reject Sþ, ensuring that the subjects remained familiar with each contingency throughout the experiment. For the free trials, the number of S presentations per block changed depending on the treatment. For example, in the p ¼ 0.10 treatments, there were always two S stimuli and 18 Sþ stimuli in each block. Within this restriction on the number of S trials, however, the stimulus types were randomly ordered. Within-trial events After an ITI of 60 s, and when the bird was at the rear of the box, the apparatus simultaneously illuminated the large key (the stimulus) and the small key. Responses were registered via key pecking: a peck to the large key indicated an attack of the presented prey stimulus, and a peck to the small key indicated a rejection of the presented prey stimulus. After any response was registered, both keys immediately extinguished. Table 4 outlines the payoff structure for the experiment. Correct attacks (hits) were rewarded with six pellets. Rejecting the image, correct or incorrect, yielded a constant reward of four pellets. Incorrect attacks, or false alarms, received either no pellets or two pellets, depending on the ‘badness’ of the prey item. A magazine light positioned directly over the food cup flashed with delivery of each pellet. After the food

Table 4 Reward earned by subjects for each action  stimulus combination in experiment 2

‘Attack’ key ‘Reject’ key

Sþ stimulus

S stimulus

6 food pellets 4 food pellets

0 or 2 food pellets (depends on value of b) 4 food pellets

The ‘badness’ variable (b) determined the reward for false alarms (accepting the S stimulus). When b ¼ 2, the subject received no food pellets, and when b ¼ 1, they received two food pellets.

delivery, the ITI clock restarted. If no response was registered within 15 min of stimulus presentation, the trial was aborted and the ITI clock restarted. Results Proportion of false alarms (P(False Alarm)) The proportion of false alarms, P(False Alarm), is conceptually important in the study of aposematism because it represents the proportional risk experienced by the distinctive and defended, unprofitable prey. Fig. 6 shows the observed P(False Alarm) values as a function of d0 (recall that small d0 values represent very difficult discrimination problems) over all parameter combinations. A repeated measures ANOVA of these data showed (1) a significant interaction between the proportion of bad prey (p) and discriminability (d0 ) (F4,32 ¼ 2.86, P ¼ 0.0390), (2) a significant main effect of p (F2,16 ¼ 18.52, P ¼ 0.0001), (3) a significant main effect of d0 (F2,16 ¼ 3.84, P ¼ 0.0435) and (4) a significant main effect of the ‘badness’ of the bad prey (b) (F2,8 ¼ 10.09, P ¼ 0.01). No other interaction effects were significant: b)p interaction (F2,16 ¼ 1.43, P ¼ 0.267), b)d0 interaction (F2,16 ¼ 3.15, P ¼ 0.07) and b)p) d0 (F4,32 ¼ 0.18, P ¼ 0.945). We discuss each of these results briefly below. Interaction: Discriminability (d0 ) by frequency of bad prey (p) An interaction plot of discriminability (d0 ) and the relative frequency of bad prey (p) suggests that this interaction, although significant, was rather subtle (Fig. 7). While P(False Alarm) tended to decrease with increasing d0 , this relationship flattened out as we increased p. This is theoretically important because it means that the increase in protection offered by distinctiveness from profitable prey was greater when the aposematic prey was rare. There is even a suggestion that the relationship takes the predicted form of an inverted U at high p values. In this case, becoming more discriminable from the good prey type (i.e. moving from a difficult discrimination problem to an intermediate one) can result in a higher attack rate on the bad prey. Effect of discriminability (d0 ) In all cases except one (b ¼ 1, p ¼ 0.9), P(False Alarm) decreased with increasing d0 . This is clear support for a central hypothesis of aposematism, that an easily discriminable signal provides protection from predation. The exceptional case here (see Fig. 7) provides an intriguing counter-example, however, because it suggests that increasing distinctiveness may actually increase the risk of predation, for both unprofitable and profitable prey, in some situations (see Holen & Johnstone, 2006; Oaten et al., 1975). Effect of the relative frequency of bad prey (p) Overall, P(False Alarm) decreased as we increased the relative € m et al., 2001; frequency of bad prey (p) (see also Lindstro €m et al., 1997; Pilecki & O'Donald, 1971). An increase in p Lindstro reduces the mean value of the good/bad prey system as a whole. Thus, the increase in p reduces the value of attempting to

T. J. Polnaszek et al. / Animal Behaviour 129 (2017) 113e125

b=2

121

b=1

d' = 0.03 d' = 1.14

P(False Alarm)

0.75

d' = 1.96

0.5

0.25

0 0.1

0.5

0.1 0.9 Proportion bad prey (p)

0.5

0.9

Figure 6. P(False Alarm) for all treatments (median and interquartile range). The X axis shows the proportion of bad prey in the environment (p). Discriminability categories are coded with the shading of the box plot, where darker shading indicates an easier discrimination problem (see legend). The left panel shows responses for a higher badness value (b ¼ 2), and the right panel shows responses for intermediate badness (b ¼ 1).

p = 0.1

p = 0.9

p = 0.5

P(False Alarm)

0.75

0.5

0.25

0 0.03

1.14

1.96

0.03 1.14 1.96 Discriminability (d')

0.03

1.14

1.96

Figure 7. Interaction between the proportion of bad prey (p) and discriminability (d0 ) (median and interquartile range). Each panel shows responses for one of the levels of p. The data here are pooled between the two badness treatments (b ¼ 1 and b ¼ 2). P(False Alarm) is generally a decreasing function of d0 , except when p ¼ 0.9.

discriminate at all, and so increasing p decreases acceptance probabilities for all prey types. Effect of the degree of bad prey unprofitability (b) Overall, P(False Alarm) decreased as we increased the relative unprofitability of the defended prey (b) (see also Alcock, 1970; € m et al., 1997). As with the effect of p, this was largely Lindstro because increasing b decreased the mean value of the good/bad prey system as a whole and so it biased subjects to reject prey more often. Failure of the zero/one predictions: further analysis While many aspects of our results were in qualitative agreement with our predictions, there was one very striking exception. Our model predicted bimodal behaviour at very low discriminability values. This prediction arises because when the predator cannot

discriminate between good and bad types (when d0 is very small), it must either always attack or always reject, which should give attack probabilities of zero or one. We did not observe this. Rather, we observed intermediate attack probabilities for these low d0 cases. An analysis of individual variation suggests that this arises in our data via an averaging process. That is, individuals in the low discriminability treatment did, in fact, tend to show either very high or very low attack rates. Fig. 8 shows a histogram of observed P(False Alarm) values for each of the nine conditions when the relative unprofitability of the bad prey was high (b ¼ 2). The figure shows a transition from a bimodal distribution when d0 was low to a unimodal distribution where d0 was high, in broad qualitative agreement with our model's predictions. Quantitatively, of course, this pattern disagrees with our model in the sense that our model offers no explanation about why individuals should vary in this way. One possibility is that previous experience with a given image

122

T. J. Polnaszek et al. / Animal Behaviour 129 (2017) 113e125

p = 0.1, d’ = 0.03

p = 0.1, d’ = 1.14

p = 0.1, d’ = 1.96

p = 0.5, d’ = 0.03

p = 0.5, d’ = 1.14

p = 0.5, d’ = 1.96

p = 0.9, d’ = 0.03

p = 0.9, d’ = 1.14

p = 0.9, d’ = 1.96

Density

10 5 0 10 5 0 10 5 0

0

0.25

0.5

0.75

1

0

0.25 0.5 0.75 P(False Alarm)

1

0

0.25

0.5

0.75

1

Figure 8. Distribution of the proportion of false alarms when b ¼ 2. Columns represent different discriminability values (d0 ), decreasing in difficulty from left to right. Rows show different proportions of bad prey (p), with bad prey increasing in frequency from the top down.

(from experiment 1 or prior treatments in experiment 2) affected whether a subject chose an ‘accept all’ or ‘reject all’ strategy. Overview Our key empirical finding was that discriminability (d0 ) and the relative frequency of bad prey (p) interact to determine the effect of distinctively signalling unprofitability. P(False Alarm) measures the predation risk experienced by our unprofitable prey item, and we see that while the risk of predation typically decreased with increasing discriminability, the slope of this relationship decreased as the relative frequency of unprofitable prey increased. This means that the marginal advantage (obtained by decreasing predation risk) of increasing discriminability was greater when the unprofitable prey item was rare, and smaller when the unprofitable item was common. That is, distinctiveness is most important to unprofitable prey when they are rare. GENERAL DISCUSSION We found that the probability of attack decreased as distinctiveness increased. This is clear support for a key hypothesis in the explanation of aposematism, that signalling to predators confers an attack avoidance advantage on unprofitable prey when it distinguishes unprofitable prey from profitable prey (Fisher, 1930; Sherratt & Beatty, 2003). Importantly, we also show that the advantage conferred by distinctiveness depends on the relative frequency of the unprofitable prey, which generally agrees with similar results from a study using human ‘predators’ and virtual prey (McGuire et al., 2006). Interestingly, our results also suggest that increasing distinctiveness does not always benefit the unprofitable prey, just as perfect mimicry is not always best for mimetic species (Holen & Johnstone, 2006). Instead, under certain conditions, aposematic prey should tolerate (or even prefer) high similarity to other prey rather than imperfectly distinguishing themselves from these mimics. If we consider the predation avoidance advantage that accrues from increasing distinctiveness by one unit (measured in d0 ), our results show that a rare unprofitable prey type achieves a greater predation avoidance advantage than a common unprofitable prey type. This is primarily driven by the fact that distinguishing yourself from good prey is less important when bad prey predominate. When most prey are bad, a welladjusted predator should avoid both the good and the bad. One way to think about these effects is to realize that there are two distinct routes to predator avoidance. First, an unprofitable prey type can

reduce the average quality of an entire class of prey, such that it pays predators to lump both prey types together and avoid all prey in the broader class. Recent theory emphasizes the importance of the costs of errors and predator ‘lumping’ behaviour for driving patterns of aposematism and mimicry (Kikuchi & Sherratt, 2015). Second, an unprofitable prey type can adopt a strategy that facilities its recognition. When d0 is small, the first route (average quality reduction) is the only possible strategy, and can be achieved by increasing the relative frequency of bad prey or by making bad prey less profitable (or both). When d0 is large, in contrast, the second route (facilitating recognition by increasing d0 still further) may be the better strategy. In addition, our study demonstrates the value of signal detection theory in the analysis of aposematic signals. The tools of signal detection theory, including d0 , are supported by many years of research in psychology and sensory biology (e.g. Commons, Nevin, & Davison, 1991; Lynn & Barrett, 2014; Swets, 1996; Wiley, 2013) and have been used in models of aposematism, more specifically in the mimicry of aposematic prey (e.g. Getty, 1985; Sherratt, 2002). Theoretical literature on aposematism and mimicry clearly demonstrates the utility of a signal detection approach for studying these phenomena. Here we have applied these tools experimentally to test predictions about the value of distinctiveness. To our knowledge, this is the first experimental treatment of aposematism to utilize a quantitative measure of discriminability (d0 ) first derived from observations of predator perception. These methods could be applied to test further the benefits of conspicuousness from the background by measuring the discriminability of background images versus background þ prey. The variable d0 provides a quantitative, nonanthropocentric measurement of discriminability that, with some creativity, could probably be measured and manipulated in many other laboratory and field situations (for a primer on signal detection, see Macmillan & Creelman, 2005; for more on its integration with behavioural ecology, see Lynn, 2005; Wiley, 2013). The approach outlined here also represents an important complement to visual modelling approaches (Kraemer & Adams, 2014; Macedonia et al., 2009; Vorobyev, Brandt, Peitsch, Laughlin, & Menzel, 2001). Whereas visual modelling can provide critical nonanthropocentric information about the perception of predators (or prey), and thus estimate theoretical discrimination ability, the level of observed behavioural discrimination may ultimately depend on other factors (such as prey profitability and prevalence). Ultimately, we should not be satisfied with claims about the value of conspicuousness and distinctiveness that hinge on assessments made by human observers. The

T. J. Polnaszek et al. / Animal Behaviour 129 (2017) 113e125

d0 technique utilized here is derived from direct measurements of predator behaviour, so it readily adapts to differences in predator sensory abilities and makes direct predictions about predator behaviour. Conspicuousness, Distinctiveness and Predator Perception Our experiments focused on the benefits of aposematic signals derived from being distinctive from profitable prey. We acknowledge that these represent only some of the hypothesized benefits of aposematism. Aposematic prey can be distinctive from profitable prey and/or differ conspicuously from the background. Both distinctiveness and conspicuousness likely influence predator behaviour because predators face multiple perceptual challenges when faced with models, mimics and other prey. For example, a foraging predator may be tasked with (1) discriminating profitable prey from the background, (2) discriminating unprofitable prey from the background and (3) discriminating profitable prey from unprofitable prey. Many studies of aposematism (especially those exploring the learning improvement advantage of conspicuousness) focus on the second process (e.g. Gamberale-Stille, 2001; Prudic, Skemp, & Papaj, 2007; Roper, 1990), while we have focused on the third. Each process is ultimately important, and they may interact in determining predator behaviour. For example, processes 1 and 2 of detecting prey against the background, and the comparative simplicity of each, will alter predators' encounter rates with each prey type. In turn, this would change predators' perception of the relative abundance of each prey type (the variable p in our model and experiments). Future comparisons of these perceptual processes, and any additive effects between them, could benefit our understanding of predator cognition. Learning versus Maintaining the Aposematic Signalling System, and Why They Both Matter Why stand out to predators? Increasing detection by predators could result in heavy costs in terms of predation. Thus, many experimental and theoretical treatments of aposematism have focused on how aposematic signals may balance these costs of detection by influencing predator avoidance learning. For example, Gittleman et al.'s (1980) wonderfully titled ‘experiments in bad taste’ showed that enhanced contrast with the background improved the ability of newly hatched chicks to learn a food avoidance task. The tacit assumption in such studies is that predators will eventually learn to avoid unprofitable prey, but that conspicuousness makes this process faster and more efficient. According to this view, the advantages of aposematic signals flow from this expedited predator learning (evidence for such advantages reviewed in Ruxton et al., 2004). In contrast, our approach considers the stable behaviour of an experienced predator, an approach less represented in aposematism literature (but see Barnett, Skelhorn, Bateson, & Rowe, 2012; Halpin, Skelhorn, & Rowe, 2012; Skelhorn & Rowe, 2007) but often present in models of predator behaviour and optimal foraging. The premise of this approach is that all discrimination and decision processes are imperfect, and as a result, even experienced predators make mistakes. According to this view, distinctiveness is advantageous because it reduces the frequency of these mistakes. Both advantages of aposematic signals (improved learning during acquisition and post-learning error avoidance) are probably important, as emphasized by a recent review of aposematism literature that discusses both of these ‘phases’ in turn (Skelhorn, Halpin, & Rowe, 2016). The relative importance of these phenomena, both ecologically and evolutionarily, is an open question worthy of further exploration.

123

Mimicry and Signal Detection Palatable prey sometimes mimic the signals of aposematic species to garner protection from predation, termed Batesian mimicry. Theoretical literature demonstrates that the signal detection approach utilized here is applicable for studying mimicry (Abbott & Sherratt, 2013; Getty, 1985; Holen & Johnstone, 2004, 2006; Oaten et al., 1975). Yet there is a relative lack of empirical applications of signal detection to mimicry and aposematism (one notable exception being McGuire et al., 2006). Our experiments use the signal detection approach to empirically demonstrate the value of distinctiveness for the unprofitable prey, but related experiments could be designed to ask how similar mimetic species must be to avoid predation (i.e. focusing on attack rates on the palatable prey, or p(hit)). As an example, one of the explanations for the maintenance of imperfect mimicry is that predators cannot easily distinguish model and mimic species (Dittrich et al., 1993; Edmunds, 2000). Quantifying discriminability using predator behaviour would provide a clear measurement of the similarity of mimics, and also enable a test of the protection provided by such similarity. Summary With this research, we have developed and experimentally tested a model of aposematism that combines a model of receiver behaviour with signal detection theory. Our approach used a welljustified and nonanthropocentric measure of discriminability, which we used to explore the error reduction advantages of being distinctive from profitable prey (a process distinct from improved avoidance learning). We found that discriminability interacts with the relative abundance of unprofitable prey. This occurred because a common unprofitable prey item can reduce predation by reducing the average prey value, while a rare but unprofitable item needs to exploit the predator's abilities to discriminate profitable prey from unprofitable. It follows that rare unprofitable prey benefit more from an increase in discriminability than common unprofitable prey. Our experiment broadly supports these theoretical predictions even though we did not predict predator behaviour accurately at low discriminability. Acknowledgments Many thanks to the undergraduate research staff that assisted with animal care and data collection. We also thank the referees whose comments improved this manuscript. Funding for the research project was provided by National Science Foundation grant IOS-077221. Supplementary Material Supplementary material associated with this article is available, in the online version at http://dx.doi.org/10.1016/j.anbehav.2017.05. 009. References Abbott, K. R., & Sherratt, T. N. (2013). Optimal sampling and signal detection: Unifying models of attention and speedeaccuracy trade-offs. Behavioral Ecology, 24(3), 605e616. http://dx.doi.org/10.1093/beheco/art001. Alcock, J. (1970). Punishment levels and the response of white-throated sparrows (Zonotrichia albicollis) to three kinds of artificial models and mimics. Animal Behaviour, 18, 733e739. http://dx.doi.org/10.1016/0003-3472(70)90019-9. Alsop, B. (1998). Receiver operating characteristics from nonhuman animals: Some implications and directions for research with humans. Psychonomic Bulletin & Review, 5(2), 239e252. http://dx.doi.org/10.3758/BF03212946.

124

T. J. Polnaszek et al. / Animal Behaviour 129 (2017) 113e125

Arenas, L. M., Walter, D., & Stevens, M. (2015). Signal honesty and predation risk among a closely related group of aposematic species. Scientific Reports, 5, 11021. http://dx.doi.org/10.1038/srep11021. Barnett, C. A., Bateson, M., & Rowe, C. (2014). Better the devil you know: Avian predators find variation in prey toxicity aversive. Biology Letters, 10(11), 20140533. http://dx.doi.org/10.1098/rsbl.2014.0533. Barnett, J. B., Scott-Samuel, N. E., & Cuthill, I. C. (2016). Aposematism: Balancing salience and camouflage. Biology Letters, 12(8), 20160335. http://dx.doi.org/ 10.1098/rsbl.2016.0335. Barnett, C. A., Skelhorn, J., Bateson, M., & Rowe, C. (2012). Educated predators make strategic decisions to eat defended prey according to their toxin content. Behavioral Ecology, 23(2), 418e424. http://dx.doi.org/10.1093/beheco/arr206. Blough, D. S. (2001). Some contributions of signal detection theory to the analysis of stimulus control in animals. Behavioural Processes, 54(1e3), 127e136. http:// dx.doi.org/10.1016/S0376-6357(01)00154-1. Caley, M., & Schluter, D. (2003). Predators favour mimicry in a tropical reef fish. Proceedings of the Royal Society B: Biological Sciences, 270(1516), 667e672. http://dx.doi.org/10.1098/rspb.2002.2263. Commons, M. L., Nevin, J. A., & Davison, M. C. (1991). Signal detection: Mechanisms, models, and applications. Hillsdale, NJ: L. Erlbaum. Dittrich, W., Gilbert, F., Green, P., Mcgregor, P., & Grewcock, D. (1993). Imperfect mimicry: A pigeon's perspective. Proceedings of the Royal Society B: Biological Sciences, 251(1332), 195e200. http://dx.doi.org/10.1098/rspb.1993.0029. Duncan, C. J., & Sheppard, P. M. (1965). Sensory discrimination and its role in the evolution of Batesian mimicry. Behaviour, 24(3/4), 269e282. http://dx.doi.org/ 10.2307/4533110. Edmunds, M. (2000). Why are there good and poor mimics? Biological Journal of the Linnean Society, 70(3), 459e466. http://dx.doi.org/10.1111/j.10958312.2000.tb01234.x. Egan, J. (1975). Signal detection theory and ROC analysis. London, U.K.: Academic Press. Fisher, R. A. (1930). The genetical theory of natural selection. Oxford, U.K.: Oxford University Press. Gamberale, G., & Tullberg, B. S. (1998). Aposematism and gregariousness: The combined effect of group size and coloration on signal repellence. Proceedings of the Royal Society B: Biological Sciences, 265(1399), 889e894. http://dx.doi.org/ 10.1098/rspb.1998.0374. Gamberale-Stille, G. (2001). Benefit by contrast: An experiment with live aposematic prey. Behavioral Ecology, 12(6), 768e772. http://dx.doi.org/10.1093/ beheco/12.6.768. Gescheider, G. A. (1997). Psychophysics: The fundamentals. Mahwah, NJ: L. Erlbaum. Getty, T. (1985). Discriminability and the sigmoid functional response: How optimal foragers could stabilize modelemimic complexes. American Naturalist, 125(2), 239e256. Getty, T., Kamil, A. C., & Real, P. G. (1987). Signal detection theory and foraging for cryptic or mimetic prey. In A. C. Kamil, J. R. Krebs, & H. R. Pulliam (Eds.), Foraging behavior (pp. 525e548). New York, NY: Plenum. http://dx.doi.org/10.1007/9781-4613-1839-2_18. Retrieved from:. Gittleman, J. L., & Harvey, P. H. (1980). Why are distasteful prey not cryptic? Nature, 286(5769), 149e150. http://dx.doi.org/10.1038/286149a0. Gittleman, J. L., Harvey, P. H., & Greenwood, P. J. (1980). The evolution of conspicuous coloration: Some experiments in bad taste. Animal Behaviour, 28, 897e899. http://dx.doi.org/10.1016/S0003-3472(80)80150-3. Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. New York, NY: Wiley. Green, P. R., Gentle, L., Peake, T. M., Scudamore, R. E., McGregor, P. K., Gilbert, F., et al. (1999). Conditioning pigeons to discriminate naturally lit insect specimens. Behavioural Processes, 46(1), 97e102. http://dx.doi.org/10.1016/S0376-6357(99) 00022-4. Guilford, T. (1988). The evolution of conspicuous coloration. American Naturalist, 131(Suppl.), S7eS21. Halpin, C. G., Skelhorn, J., & Rowe, C. (2012). The relationship between sympatric defended species depends upon predators' discriminatory behaviour. PLoS One, 7(9), e44895. http://dx.doi.org/10.1371/journal.pone.0044895. €m, L., & Mappes, J. (2006). Does colour matter? The Ham, A. D., Ihalainen, E., Lindstro importance of colour in avoidance learning, memorability and generalisation. Behavioral Ecology and Sociobiology, 60(4), 482e491. http://dx.doi.org/10.1007/ s00265-006-0190-4. Holen, Ø. H., & Johnstone, R. A. (2004). The evolution of mimicry under constraints. American Naturalist, 164(5), 598e613. http://dx.doi.org/10.1086/424972. Holen, Ø. H., & Johnstone, R. A. (2006). Context-dependent discrimination and the evolution of mimicry. American Naturalist, 167(3), 377e389. http://dx.doi.org/ 10.1086/499567. Kikuchi, D. W., & Sherratt, T. N. (2015). Costs of learning and the evolution of mimetic signals. American Naturalist, 186(3), 321e332. http://dx.doi.org/ 10.1086/682371. Kraemer, A. C., & Adams, D. C. (2014). Predator perception of Batesian mimicry and conspicuousness in a salamander. Evolution, 68(4), 1197e1206. http:// dx.doi.org/10.1111/evo.12325. Leimar, O., Enquist, M., & Sillen-Tullberg, B. (1986). Evolutionary stability of aposematic coloration and prey unprofitability: A theoretical analysis. American Naturalist, 128(4), 469e490. €m, L., Alatalo, R. V., Lyytinen, A., & Mappes, J. (2001). Strong antiapostatic Lindstro selection against novel rare aposematic prey. Proceedings of the National

Academy of Sciences of the United States of America, 98(16), 9181e9184. http:// dx.doi.org/10.1073/pnas.161071598. €m, L., Alatalo, R. V., & Mappes, J. (1997). Imperfect Batesian mimicry: The Lindstro effects of the frequency and the distastefulness of the model. Proceedings of the Royal Society B: Biological Sciences, 264(1379), 149e153. http://dx.doi.org/ 10.1098/rspb.1997.0022. Lynn, S. K. (2005). Learning to avoid aposematic prey. Animal Behaviour, 70, 1221e1226. http://dx.doi.org/10.1016/j.anbehav.2005.03.010. Lynn, S. K., & Barrett, L. F. (2014). ‘Utilizing’ signal detection theory. Psychological Science, 25(9), 1663e1673. http://dx.doi.org/10.1177/0956797614541991. Lynn, S. K., Cnaani, J., & Papaj, D. R. (2005). Peak shift discrimination learning as a mechanism of signal evolution. Evolution, 59(6), 1300e1305. http://dx.doi.org/ 10.1111/j.0014-3820.2005.tb01780.x. Macedonia, J. M., Lappin, A. K., Loew, E. R., Mcguire, J. A., Hamilton, P. S., Plasman, M., et al. (2009). Conspicuousness of Dickerson's collared lizard (Crotaphytus dickersonae) through the eyes of conspecifics and predators. Biological Journal of the Linnean Society, 97(4), 749e765. http://dx.doi.org/10.1111/ j.1095-8312.2009.01217.x. Macmillan, N. A., & Creelman, C. D. (2005). Detection theory: A user's guide. Hillsdale, NJ: L. Erlbaum. Mappes, J., Marples, N., & Endler, J. A. (2005). The complex business of survival by aposematism. Trends in Ecology & Evolution, 20(11), 598e603. http://dx.doi.org/ 10.1016/j.tree.2005.07.011. McGuire, L., Van Gossum, H., Beirinckx, K., & Sherratt, T. N. (2006). An empirical test of signal detection theory as it applies to Batesian mimicry. Behavioural Processes, 73(3), 299e307. http://dx.doi.org/10.1016/j.beproc.2006.07.004. McLinn, C. M., & Stephens, D. W. (2006). What makes information valuable: Signal reliability and environmental uncertainty. Animal Behaviour, 71, 1119e1129. http://dx.doi.org/10.1016/j.anbehav.2005.09.006. Merilaita, S., & Ruxton, G. D. (2007). Aposematic signals and the relationship between conspicuousness and distinctiveness. Journal of Theoretical Biology, 245(2), 268e277. http://dx.doi.org/10.1016/j.jtbi.2006.10.022. Oaten, A., Pearce, C. E. M., & Smyth, M. E. B. (1975). Batesian mimicry and signal detection theory. Bulletin of Mathematical Biology, 37(4), 367e387. http:// dx.doi.org/10.1007/BF02459520. Pilecki, C., & O'Donald, P. (1971). The effects of predation on artificial mimetic polymorphisms with perfect and imperfect mimics at varying frequencies. Evolution, 25(2), 365e370. http://dx.doi.org/10.2307/2406928. Polnaszek, T. J., & Stephens, D. W. (2014). Receiver tolerance for imperfect signal reliability: Results from experimental signalling games. Animal Behaviour, 94, 1e8. http://dx.doi.org/10.1016/j.anbehav.2014.05.011. Prudic, K. L., Skemp, A. K., & Papaj, D. R. (2007). Aposematic coloration, luminance contrast, and the benefits of conspicuousness. Behavioral Ecology, 18(1), 41e46. http://dx.doi.org/10.1093/beheco/arl046. Roper, T. J. (1990). Responses of domestic chicks to artificially coloured insect prey: Effects of previous experience and background colour. Animal Behaviour, 39, 466e473. http://dx.doi.org/10.1016/S0003-3472(05)80410-5. Roper, T. J., & Redston, S. (1987). Conspicuousness of distasteful prey affects the strength and durability of one-trial avoidance learning. Animal Behaviour, 35, 739e747. http://dx.doi.org/10.1016/S0003-3472(87)80110-0. Roper, T. J., & Wistow, R. (1986). Aposematic colouration and avoidance learning in chicks. Quarterly Journal of Experimental Psychology B, 38(2), 141e149. http:// dx.doi.org/10.1080/14640748608402225. Rowland, H. M., Ruxton, G. D., & Skelhorn, J. (2013). Bitter taste enhances predatory biases against aggregations of prey with warning coloration. Behavioral Ecology, 24(4), 942e948. http://dx.doi.org/10.1093/beheco/art013. Ruxton, G. D., Sherratt, T. N., & Speed, M. P. (2004). Avoiding attack: The evolutionary ecology of crypsis, warning signals, and mimicry. Oxford, U.K.: Oxford University Press. Schmidt, R. (1958). Behavioural evidence on the evolution of Batesian mimicry. Animal Behaviour, 6(3e4), 129e138. http://dx.doi.org/10.1016/0003-3472(58) 90042-3. Sherratt, T. N. (2002). The evolution of imperfect mimicry. Behavioral Ecology, 13(6), 821e826. http://dx.doi.org/10.1093/beheco/13.6.821. Sherratt, T. N., & Beatty, C. D. (2003). The evolution of warning signals as reliable indicators of prey defense. American Naturalist, 162(4), 377e389. http:// dx.doi.org/10.1086/378047. Sherratt, T. N., & Franks, D. W. (2005). Do unprofitable prey evolve traits that profitable prey find difficult to exploit? Proceedings of the Royal Society B: Biological Sciences, 272(1579), 2441e2447. http://dx.doi.org/10.1098/rspb.2005.3229. Skelhorn, J., Halpin, C. G., & Rowe, C. (2016). Learning about aposematic prey. Behavioral Ecology, 27(4), 955e964. http://dx.doi.org/10.1093/beheco/arw009. Skelhorn, J., & Rowe, C. (2007). Predators' toxin burdens influence their strategic decisions to eat toxic prey. Current Biology, 17(17), 1479e1483. http://dx.doi.org/ 10.1016/j.cub.2007.07.064. Speed, M. P., & Ruxton, G. D. (2005). Aposematism: What should our starting point be? Proceedings of the Royal Society B: Biological Sciences, 272(1561), 431e438. http://dx.doi.org/10.1098/rspb.2004.2968. Speed, M. P., & Turner, J. R. G. (1999). Learning and memory in mimicry: II. Do we understand the mimicry spectrum? Biological Journal of the Linnean Society, 67(3), 281e312. http://dx.doi.org/10.1111/j.1095-8312.1999.tb01935.x. Stephens, D. W. (2007). Models of information use. In D. W. Stephens, J. S. Brown, & R. C. Ydenberg (Eds.), Foraging: Behavior and ecology (pp. 31e58). Chicago, IL: University of Chicago Press.

T. J. Polnaszek et al. / Animal Behaviour 129 (2017) 113e125 Swets, J. A. (1996). Signal detection theory and ROC analysis in psychology and diagnostics: Collected papers. Hillsdale, NJ: L. Erlbaum. Vorobyev, M., Brandt, R., Peitsch, D., Laughlin, S. B., & Menzel, R. (2001). Colour thresholds and receptor noise: Behaviour and physiology compared. Vision Research, 41(5), 639e653. http://dx.doi.org/10.1016/ S0042-6989(00)00288-1.

125

Wiley, R. H. (1994). Errors, exaggeration, and deception in animal communication. In L. Real (Ed.), Behavioral mechanisms in evolutionary ecology (pp. 157e189). Chicago, IL: University of Chicago Press. Wiley, R. H. (2013). Signal detection, noise, and the evolution of communication. In H. Brumm (Ed.), Animal communication and noise (pp. 7e30). Berlin, Germany: Springer. http://dx.doi.org/10.1007/978-3-642-41494-7_2. Retrieved from:.