Capuchin monkeys (sometimes) go when they know: Confidence movements in Sapajus apella


Cognition 199 (2020) 104237


Cognition journal homepage: www.elsevier.com/locate/cognit

Travis R. Smith (a), Audrey E. Parrish (b), Courtney Creamer (c), Mattea Rossettie (c), Michael J. Beran (c)

(a) Kansas State University, United States of America
(b) The Citadel, United States of America
(c) Georgia State University, United States of America

ARTICLE INFO

Keywords: Metacognition; Capuchin monkeys; Confidence movement; Uncertainty monitoring

ABSTRACT

To test for evidence of metacognition in capuchin monkeys (Sapajus apella), we analyzed confidence movements using a paradigm adapted from research with chimpanzees. Capuchin monkeys provide an interesting model species for the comparative assessment of metacognition as they show limited evidence of such cognitive-monitoring processes in a variety of metacognition paradigms. Here, monkeys were presented with a computerized delayed matching-to-sample (DMTS) memory test in one location but were rewarded for correct responses in a separate location. Movements could be made from one location to the other at any time, but movements between a response and reward feedback may reflect confidence in the accuracy of the response. Critically, DMTS tests included occasional “no sample” trials, which started without a sample and had a 1-s interval before the response options appeared; monkeys' performance on these trials was at chance. We predicted that monkeys would (1) perform less accurately (and less confidently) at longer retention intervals, (2) move to the dispenser early more often on trials completed correctly than incorrectly, and (3) show a relation between faster response latency and early movements. Analyses of response times and “go” or “no go” confidence movements to the reward location before feedback suggested that the monkeys were capable of monitoring confidence in their responses. However, their confidence movements were less precise and less flexible than those of chimpanzees. Overall, this paradigm can reveal potential metacognitive abilities in nonhuman animals that otherwise demonstrate these abilities inconsistently.

1. Introduction

For more than two decades, comparative psychologists and others studying cognitive processes in animals have focused on a particular aspect of cognition that is not about primary perceptual mechanisms, conceptual classifications, or memory processes, but instead is focused on the possibility that animals are aware of their knowledge states and can adjust behavior on the basis of this awareness. This is called metacognition (Dunlosky & Bjork, 2008; Flavell, 1979; Metcalfe & Shimamura, 1994; Nelson, 1992; Nelson & Narens, 1990; Schwartz, 2009). Metacognition is defined in various ways, sometimes muddying the waters of interpretation of animals' behavior in so-called metacognitive tasks. However, a growing literature suggests that, at minimum, some species (i.e., rhesus monkeys, orangutans, gorillas, rats) do learn to avoid difficult perceptual classification trials (perhaps because they are aware of the difficulty they face) by escaping those trials (e.g., Smith et al., 1995; Smith, Beran, Redford, & Washburn, 2006; Smith,



Coutinho, Church, & Beran, 2013; Smith, Redford, Beran, & Washburn, 2010; Smith, Shields, Schull, & Washburn, 1997; Suda-King, 2008; Suda-King, Bania, Stromberg, & Subiaul, 2013; Yuki & Okanoya, 2017). Some species (i.e., rhesus monkeys, baboons, pigeons, rats) re-study information they are likely to have forgotten prior to an impending memory test or avoid taking the test altogether for exactly those stimuli that are most difficult to remember (e.g., Adams & Santi, 2011; Basile, Schroeder, Brown, Templer, & Hampton, 2015; Brown, Basile, Templer, & Hampton, 2019; Brown, Templer, & Hampton, 2017; Foote & Crystal, 2012; Hampton, 2001; Malassis, Gheusi, & Fagot, 2015; Templer & Hampton, 2012; Templer, Lee, & Preston, 2017). Some species (i.e., rhesus monkeys, chimpanzees, orangutans, dogs, rats, capuchin monkeys) seek additional information before choosing a response option when they cannot possibly know the correct response without that information (e.g., Belger & Bräuer, 2018; Beran & Smith, 2011; Beran, Smith, & Perdue, 2013; Call, 2010; Call & Carpenter, 2001; Hampton, Zivin, & Murray, 2004; Kirk, McMillan, & Roberts, 2014; Marsh &

Corresponding author at: Kansas State University, 492 Bluemont Hall, 1114 Mid-Campus Dr North, Manhattan, KS 66506-5302, United States of America. E-mail address: [email protected] (T.R. Smith).

https://doi.org/10.1016/j.cognition.2020.104237 Received 20 August 2019; Received in revised form 9 February 2020; Accepted 11 February 2020 0010-0277/ © 2020 Elsevier B.V. All rights reserved.


However, removing reinforcement from the initial choice to respond or escape and reserving it for a subsequent unrelated task still did not evoke uncertainty responses to difficult or impossible to perform trials in this species (Perdue, Church, Smith, & Beran, 2015), and this did suggest a real limitation for engaging in metacognition in this species. Given the variable nature of the existing data, we are left at present with the curious case of the capuchin monkey (Smith, Beran, Couchman, Coutinho, & Boomer, 2009; Smith, Smith, & Beran, 2018), and why either (1) they are not capable of metacognitive responding, or (2) they are reluctant demonstrators of a psychological metacognitive state that they might experience. This reluctance may seem confusing given that some non-primate species have demonstrated strong metacognitive-like patterns. For example, evidence from rats shows a variety of sophisticated behavioral patterns in tasks designed to assess metacognition (e.g., Crystal & Foote, 2009; Foote & Crystal, 2007, 2012; Kirk et al., 2014; Templer et al., 2017; Yuki & Okanoya, 2017). However, the comparative literature also includes at least one other species that could be classified as either a non-metacognitive species or a reluctant metacognitive responder. Pigeons show the same inconsistency and variability of performance that capuchin monkeys have shown, with positive reports (Adams & Santi, 2011; Castro & Wasserman, 2013; Nakamura, Watanabe, Betsuyaku, & Fujita, 2011) matched by partial or full failures (Inman & Shettleworth, 1999; Iwasaki, Watanabe, & Fujita, 2013; Roberts et al., 2009; Sutton & Shettleworth, 2008) and debate about the proper interpretation of the empirical data (e.g., Smith, 2009; Sole, Shettleworth, & Bennett, 2003; Zentall & Stagner, 2010). 
And, it is important to note that we have barely begun to assess the broadness or depth of animal metacognition, with such restricted numbers of primate and non-primate species tested (see footnote 1).

Recently, we developed a new measure of metacognition that involves the use of so-called confidence movements. The original task was presented to chimpanzees (Beran et al., 2015), and it involved requiring the subjects to work on a primary task in one location and receive a reward (if earned) in another location. The basic procedure is to have the subject complete a computer task of varying difficulty across trials (e.g., a memory test with variable retention delays) and, after a response is made, to insert a delay before any feedback is given by the computer. During that delay, the subject can move (or not) towards a distant location where reward is dispensed. After the delay, feedback is given about the outcome of the trial, and then after another short delay, a food reward is dispensed (if earned). The key manipulation in this approach is that the reward is forfeited if not obtained at the moment it is dispensed. In other words, if the subject has not moved to the reward site before the reward is delivered, the reward is lost. One must be at the dispenser in time, and if one waits for feedback from the computer, the short time left to travel to the dispenser makes the trip effortful and creates a risk that the reward will be lost. Moving early to the reward dispenser, during the delay before feedback, affords plenty of time to be in place for a reward, if one ultimately is dispensed. Thus, if one has high confidence in being correct, moving early is optimal. Low confidence might evoke early moves also, but at the potential cost of wasted movement if no reward is delivered because the response was incorrect. 
Failures to move early to the reward dispenser when correct mean that the subject must move very quickly and effortfully to still obtain the reward, but if the trial is incorrect, the next trial can be started immediately at little or no movement cost. Thus, the optimal pattern is to go early to the reward dispenser when confident about the outcome of the trial, and this should generate a pattern of more early moves on

MacDonald, 2012a, 2012b; Vining & Marsh, 2015). Furthermore, some species (i.e., rhesus monkeys) show that they can provide confidence-like ratings to their own responses to tests that align with objective performance levels, including those that are made prospectively or retrospectively (e.g., Morgan, Kornell, Kornblum, & Terrace, 2014; Shields, Smith, Guttmannova, & Washburn, 2005). And, some species (i.e., rhesus monkeys, western scrub-jays) calibrate their escape responses to avoid having to complete difficult trials or their hint-seeking behaviors based on the rewards at risk and their likelihood of responding correctly (e.g., Kornell, Son, & Terrace, 2007; Watanabe & Clayton, 2016; Zakrzewski, Perdue, Beran, Church, & Smith, 2014). This growing literature continues to generate strong debate about the proper interpretation of these kinds of behaviors (Beran, Brandl, Perner, & Proust, 2012; Carruthers, 2008, 2009; Comstock & Bauer, 2018; Crystal, 2014; Crystal & Foote, 2009; Hampton, 2009; Kornell, 2009, 2014; Le Pelley, 2012; Smith, 2009; Smith, Couchman, & Beran, 2012; Smith, Zakrzewski, & Church, 2016), but there is consensus that the empirical database continues to grow in terms of positive reports of success in these kinds of tasks. Some of the strongest evidence has emerged for metacognitive monitoring within nonhuman primates, as evidenced by the many citations to papers with the great apes and monkeys in the previous paragraph. However, the breadth of assessment within the order Primates remains fairly limited, with only two of four great apes (chimpanzees and orangutans), two Old World monkey species (rhesus macaques and baboons), and one New World monkey species (capuchin monkeys) having been tested. Despite the limited variability among species tested thus far, there is an intriguing pattern of results that has emerged: capuchin monkeys only sometimes perform consistently with the great apes and Old World monkeys. 
The earliest assessments with capuchin monkeys focused on information-seeking tasks (Basile, Hampton, Suomi, & Murray, 2009; Paukner, Anderson, & Fujita, 2006; Vining & Marsh, 2015), in which capuchin monkeys sometimes searched an opaque container for food when they should (e.g., when they did not see the food baited) but also often searched when they should not have needed to do so (e.g., the container was transparent). In memory monitoring tasks, capuchin monkeys also failed to show the metacognitive-like patterns of responding shown by rhesus macaques (Fujita, 2009). Beran, Smith, Coutinho, Couchman, and Boomer (2009) presented capuchin monkeys with an absolute classification task and trained monkeys to classify stimuli as a function of the degree of pixelation that made up those stimuli (sparse vs. dense). For the metacognitive response, monkeys were presented with the option to skip to the next trial free of punishment (which was a time-out) but also at the cost of losing any positive reinforcement (food reward). Unlike rhesus monkeys and humans (e.g., Smith et al., 1997), capuchin monkeys did not adopt the metacognitive response, failing to selectively escape difficult trials. The previous psychophysical task in which monkeys were required to classify stimuli as ‘sparse’ or ‘dense’ (Beran et al., 2009) was then used in a series of additional studies to determine what might account for the seeming “divide” among the primate species tested. First, it was demonstrated that capuchin monkeys may be more inclined to make primary responses (those reinforced or punished based on correctness of the response) when the number of choice options was small, and thus the chance of succeeding was higher by chance alone. 
When tasks were adjusted to make guessing less profitable, capuchin monkeys then (sometimes) added escape responses to their behavioral repertoire and did so for exactly those trials that they should escape if difficulty was their guide (Beran, Perdue, Church, & Smith, 2016; Beran, Perdue, & Smith, 2014). This suggested that when there were only two stimuli to choose between (and guessing was 50% correct), the capuchin monkeys may have preferred a risky guess strategy over the safer trial-escape strategy, especially when the escape strategy did not benefit them with positive reinforcement. This may not reflect a lack of metacognitive capacity but rather a higher threshold for engaging in such monitoring.

1 And, there is also the concern that there might be studies with other species, or previously tested species, that failed to show a pattern suggestive of metacognition, and those studies may not be reflected in our literature. This “file drawer” problem can sometimes plague assumptions about the depth or broadness of (meta)cognitive processes in nonhuman species when there are so few data points in terms of reported species that have been tested.


yield a high rate of reinforcement may move to the dispenser erroneously following a no-sample trial rather than moving to the dispenser as a function of memory strength for the stimulus (see Hampton, 2001, for a discussion of no-sample trials in metacognitive testing). However, a metacognitive subject that instead is aware of its performance would not expect to be reinforced on these no-sample trials. Consequently, performance on no-sample trials should show low confidence in terms of limited or no movement to the reward dispenser. We predicted that capuchin monkeys would show a systematic decrease in accuracy across the increasing retention intervals on the DMTS memory test, aligned closely with the varying delay intervals (i.e., the longer the delay, the poorer the performance). We predicted a greater likelihood of making a confidence movement on correctly completed trials, but we also predicted a relationship between latency in making the memory task response and confidence movements. Specifically, if monkeys remembered a sample or recognized one stimulus as more familiar (see footnote 2), they should respond more quickly to the memory test than if they did not, and those faster responses to complete the memory test should be predictive of confidence movement on those trials. This also could show an interesting outcome. Some trials could be completed correctly only because monkeys “got lucky” after responding randomly (and perhaps took longer to do so). In those cases, response latencies to the memory test should be slower compared to correctly completed trials for which monkeys had a strong memory trace or sense of familiarity for the sample. This outcome would show that response latency was dissociated from performance but was predictive of confidence movements in a way that reflects confidence in some correctly completed trials more than in other correctly completed trials.

correct trials than on incorrect trials. Chimpanzees show exactly this pattern, across a series of tests (Beran et al., 2015). To date, they are the only nonhuman species to be given this task. There is reason to predict that capuchin monkeys might show a pattern similar to that of chimpanzees, even if in general they have a tendency to want to respond rather than escape or avoid primary responses in metacognitive tasks. In the present task, subjects must always make the primary response to the computer task (completion of the memory trial). However, the metacognitive response lies in the behavior that follows the completion of the trial – movement (or not) towards the food dispenser. It is true that one might predict overconfidence being aligned with a higher level of risk tolerance (i.e., monkeys move towards the reward dispenser prior to computer feedback regardless of performance), but even in that case, monkeys might come to learn that early movements are differentially reinforced on the basis of primary task performance. And, when confidence is low, remaining at the computer would allow for quicker re-engagement of the computer task. Thus, we modified the task of Beran et al. (2015) so that it could be used with capuchin monkeys, in an effort to determine whether they also might show a pattern of early movements that matched objective performance. We also anticipated an outcome more aligned with a metacognitive pattern given that this task uses a more ecologically relevant response of movement, rather than a more artificial response that is constrained to the computer task itself. Note that we are not arguing that the current approach is ideally suited to capuchin monkeys alone. The work with chimpanzees indicates it is likely valuable with any species that moves towards food, and that makes judgements about likelihood of food availability. 
The point here is that capuchin monkeys are, at best, “reluctant” to engage in metacognitive-like patterns of responding in previously used tasks, and so this task may provide more “scaffolding” for metacognitive patterns to emerge. And, this task engages natural responses such as moving to food as a function of whether there is the expectation of food in that location in order to gauge potential metacognitive abilities. Monkeys completed a delayed matching-to-sample (DMTS) memory test with variable retention delays and were rewarded at a distant location away from the computer. For the MTS trials, we recorded trial outcome (correct/incorrect) and latency to make a response as a function of retention interval (1, 2, 4, 7, and 10-s). We recorded their movement patterns throughout trials, to measure not only confidence but also their attentiveness to trials in the primary memory task. We also presented monkeys with a crucial trial type: the no-sample trial. In tests of metamemory in nonhuman animals, the no-sample trial is important to include because it offers a short delay duration but no possible way for the animal to recognize or know the matching stimulus. For example, subjects are not provided with a sample image prior to presentation of the match options in this trial type; thus, they must guess which match option is the correct answer. This test controls for the possibility that a subject learned that shorter retention intervals lead to reinforcement. A subject that has learned that short intervals

2. Method

2.1. Subjects

Seven adult capuchin monkeys (Sapajus apella) between the ages of 13 and 20 initially started on this task (Table 1). Monkeys Lily and Wren both failed to reliably cooperate with the testing setup and often disrupted the functioning of the apparatus; thus, their data are not included. Monkey Logan participated in the first phase of the task but did not complete the second phase for reasons unrelated to the study (i.e., social group changes led to Logan no longer voluntarily choosing to separate for testing); therefore, only data from his first phase are included. Monkeys Griffin, Nkima, Nala, and Liam completed both phases of the task. All monkeys were socially housed and voluntarily entered the testing area for peanut rewards. Following each session, regardless of performance, they were provided fresh fruit, vegetables, nuts, and processed monkey chow. They had access to water ad libitum. All monkeys had prior experience working on a computerized test system (e.g., Evans, Beran, Chan, Klein, & Menzel, 2008) that included a history of completing DMTS trials with variable delays. Monkeys typically worked inside of a test box (45.88 × 34.93 × 60.33 cm) designed to house a single monkey during an experimental session in contrast to the larger testing box that was used in the current study (see below for details). Also, all previous studies included reward delivery at the computer location and thus did not require the monkeys to move through space to earn their reward as in the current experimental setup.

Table 1 Subject information including sex, age, total trials completed, percentage of trials where the monkey remained in the middle of the cage between a DMTS response and pellet release (Zone 2), and percentage of trials where the monkeys moved prematurely during the retention interval. Lily and Wren did not complete training to participate in experimental phases 1 and 2.

Name      Sex      Age   Total trials   % Zone 2   % premature
Griffin   Male     20    680            2%         57%
Liam      Male     14    674            26%        12%
Lily      Female   20    0              N/A        N/A
Logan     Male     13    500            5%         30%
Nala      Female   16    751            52%        24%
Nkima     Male     10    745            6%         23%
Wren      Female   15    0              N/A        N/A

2 For these tests, we did not assess working memory, so there is no reason to assume rehearsal of the sample. Because samples were randomly chosen on each trial from a large set of exemplars, monkeys could rely on greater familiarity of one stimulus over the others at the response stage. Thus, the assessment would be more accurately called a measure of short-term memory or familiarity judgment, but even in those cases, there should be an effect of retention interval that allows for variability of outcomes and (potentially) differential degrees of confidence in responses by the monkeys.


Fig. 1. Photograph illustrating the apparatus setup (joystick and computer in zone 1, divider in zone 2, and pellet dispenser in zone 3) and a monkey (Nkima in this photo) from the point of view of the video camera. The three zones correspond with the zones used to code the confidence movement data.

dispensed reward, otherwise the monkeys would need to rush to the dispenser and likely risked missing the pellet altogether. Thus, moving to the dispenser in zone 3 following trial completion would ensure that they could catch the food pellet. The requirements for earning a pellet varied across experimental phases.

This study followed the protocols of the Georgia State University Institutional Animal Care and Use Committee (IACUC protocol A19042), the guidelines of the United States Department of Agriculture Animal Welfare Act, and the “Guidelines for the Use of Laboratory Animals.” Georgia State University is accredited by the Association for Assessment and Accreditation of Laboratory Animal Care (AAALAC).

2.3. Design and procedure

2.3.1. Training Phase 1

The monkeys were not familiar with this experimental arrangement in which they were required to move through space for the reward; therefore, two training phases were included to help transition them to working in the testing conditions. Training Phase 1 did not include a mid-cage divider in zone 2. This was done to encourage the monkeys to tolerate working on a task that required them to move to another area to collect earned pellets. A battery of simple tasks was used in this phase, and these tasks were familiar to the monkeys because they were the original tasks used to train the monkeys to use the joystick for computerized testing (and routinely provided to the monkeys as an enrichment task when research tasks were not being conducted). Specifically, these tasks were SIDE, CHASE, and MTS, and are described in Evans et al. (2008). Briefly, SIDE only required the monkeys to move a cursor to contact a green shaded zone on the screen to earn pellets. CHASE had circles moving around the screen and deflecting off of the boundaries (sides) of the monitor screen, and moving the cursor to contact the circles resulted in a pellet. MTS was a two-choice matching-to-sample task with no delay. The monkeys experienced trials of SIDE and CHASE prior to moving onto MTS. MTS continued until 500 trials were completed. Phase 1 continued until the monkeys completed at least one session with 100 or more MTS trials. In these trials, the monkeys had to demonstrate that they would work at the joystick and move to the dispenser to collect pellets. There was no assessment of performance in relation to early moves because these were all easy tasks that rarely involved errors.

To start a trial, the monkey needed to manipulate the joystick to move a cursor on the monitor screen, which would register a response. If correct, he/she had to travel from the joystick side of the joint box (right side from the monkey's view, zone 1), to the dispenser side of the joint box (monkey's left side, zone 3) to catch the pellet in time. There was no receptacle to catch the pellet; therefore, if the monkey did not arrive in time, he/she would forfeit the reward. No pellet would be dispensed if the trial was incorrect. Thus, on incorrect trials, it would be more energy efficient for the monkey to remain on the joystick side of the box (zone 1) to immediately begin the subsequent trial. On correct trials, being at the dispenser in time was necessary to obtain the

2.3.2. Training Phase 2

Following Phase 1, a divider was inserted from the front of the testing box into the middle of zone 2. This effectively closed off two-thirds of the box so that the monkeys had to travel around the barrier to get to the dispenser. From the monkey's perspective, he/she would have to travel from the front-right side of the box in zone 1 where they completed the computer trials (same tasks as in Phase 1), past the back center of the box to cross the divider, and then to the front-left side of the box to retrieve the food rewards from zone 3 (i.e., a curved horseshoe-like path if viewed from above). Phase 2 continued until the monkeys completed at least one session of 100 or more trials. The tasks

2.2. Apparatus

The experimental space was a 61 × 145 × 71 cm mesh box with a removable divider (removed during some phases of training) in the middle section that functioned as a wall but did not completely close off the two sides (Fig. 1). The monkeys operated the Language Research Center's modified Computerized Test System (LRC-CTS; Evans et al., 2008) to participate in this experiment. This system consisted of a personal computer with color monitor, digital joystick, and food pellet dispenser that was modified to drop the pellet reward approximately 112 cm from the joystick. The computer monitor was 17 in. Stimuli were 177 pixels high and 193 pixels wide at a screen resolution of 800 × 600. The monkeys manipulated a joystick outside of the cage by reaching through the cage mesh or through a hole in the mesh designed to allow access to the joystick. The joystick controlled a cursor on a computer monitor, and the computer was programmed to deliver 45-mg banana flavored sucrose pellets (Bioserve, Frenchtown, NJ). The dispenser interfaced to the computer using an ADU200 USB relay I/O interface. Fig. 1 shows how we divided the experimental space into three zones (not visible to the monkeys) to assist in coding the monkeys' movement behavior. Zone 1 included the joystick and computer task, zone 2 included the middle portion of the testing box, and zone 3 included the pellet dispenser where food was dispensed following a correct response.


(which led to it disappearing), and then the match stimuli were presented. One of the four comparison stimuli was randomly chosen to be the correct option; thus, the monkeys had a 25% chance of earning a pellet by guessing on these trials, as they could not know what the sample had been since no sample was presented.
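The 25% figure follows directly from having four comparison stimuli and no sample to match. A minimal Python sketch of such a trial schedule is given here for illustration only; the function and constant names are hypothetical, it is not the authors' task software, and it assumes that no-sample trials use the 1-s interval mentioned in the abstract while other trials draw a delay from the five listed retention intervals.

```python
import random

DELAYS_S = (1, 2, 4, 7, 10)      # retention intervals listed in the Method
NO_SAMPLE_RATE = 0.20            # "approximately 20%" of Phase 2 trials
N_CHOICES = 4                    # four comparison stimuli per trial
CHANCE_ACCURACY = 1 / N_CHOICES  # 0.25 when the monkey can only guess


def draw_trial(rng):
    """Return one hypothetical trial description (not the authors' format)."""
    no_sample = rng.random() < NO_SAMPLE_RATE
    # Assumption: no-sample trials use the 1-s interval from the abstract.
    delay_s = 1 if no_sample else rng.choice(DELAYS_S)
    return {
        "no_sample": no_sample,
        "delay_s": delay_s,
        "correct_position": rng.randrange(N_CHOICES),
    }


def guessing_accuracy(n_trials, seed=0):
    """Proportion correct for a simulated subject that picks a corner at random."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        trial = draw_trial(rng)
        if rng.randrange(N_CHOICES) == trial["correct_position"]:
            hits += 1
    return hits / n_trials
```

Over many simulated trials, `guessing_accuracy` settles near `CHANCE_ACCURACY`, which is the baseline against which accuracy on no-sample trials would be compared.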

again were easy, well-known tasks that rarely led to errors.

2.3.3. Training Phase 3

Once the monkeys demonstrated that they would work under Phase 2 conditions, the training tasks were replaced with a four-choice matching-to-sample (MTS) task. This task started with the presentation of the centrally located colored clipart icon (the sample) and the cursor. The clipart set had 500 items and the program selected the items at random (with replacement). Once the cursor was moved to contact the sample, four comparison stimuli appeared in the four corners of the screen and the cursor was re-centered. The cursor and sample clipart icon remained in those positions until the monkey made his/her selection. If the monkey moved the cursor to contact the comparison icon that was identical to the sample icon, then a correct choice was registered. Following a correct response, the screen cleared of all stimuli and, after 2 s, the screen background flashed between green and white coloration. After 3 s from the response (i.e., 1 s after the flashing started), a chime sounded and a pellet was dispensed simultaneously. This meant that the monkey could have as long as 3 s to move to the dispenser if it began moving as soon as the response was completed, or as short as 1 s if it waited until feedback was given. This 1-s interval was short enough that the monkey could only get to the dispenser if he/she moved directly and rapidly to it as soon as the feedback was given; a monkey that waited until after the tone sounded was unlikely to catch the pellet. With any hesitation or slowness of movement, the pellet would be forfeited. If the monkey moved the cursor to contact a comparison icon that did not match the sample, an incorrect choice was registered, and the screen cleared all stimuli for 2 s. Following the 2-s delay, the screen flashed between red and white coloration. 
After a 3-s delay (i.e., 1 s after the flashing started), a buzz sounded without the dispensing of a pellet. Phase 3 continued until monkeys completed 100 trials in a session and had > 80% accuracy on the task.
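The feedback timing described for Training Phase 3 can be summarized in a short Python sketch. Only the 2-s and 3-s values come from the text; the constant and function names are hypothetical. Departure time is used as a simplification, since the study itself scored confidence movements from video by zone.

```python
FEEDBACK_ONSET_S = 2.0  # screen begins flashing 2 s after the response
OUTCOME_S = 3.0         # chime and pellet (or buzz) 3 s after the response


def travel_window(departure_s):
    """Seconds left to reach the dispenser before the pellet drops.

    departure_s is when the monkey leaves the joystick, in seconds after
    its response. Returns 0.0 if the pellet has already dropped.
    """
    if departure_s < 0:
        raise ValueError("departure_s must be non-negative")
    return max(0.0, OUTCOME_S - departure_s)


def left_before_feedback(departure_s):
    """True if the move began before the outcome flash started."""
    return departure_s < FEEDBACK_ONSET_S
```

Leaving immediately gives `travel_window(0.0)`, the full 3 s, whereas waiting for the flash gives `travel_window(FEEDBACK_ONSET_S)`, only 1 s. That asymmetry is what makes an early move informative about confidence.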

2.4. Video coding

The monkeys' expression of confidence in their MTS responses occurred in the 2-s interval between the response and the screen-flash indicating the outcome. Traveling from the joystick to the pellet dispenser was the defined confidence movement response (i.e., a “go” response). Remaining at the joystick (i.e., a “no go” response) was considered to be a low confidence response. To measure these movements, each session was video recorded, and two observers coded the videos to determine on which trials the monkeys moved from the joystick location to the pellet-delivery location and at what point in the trial they did so. The camera was located and centered outside of the enclosure, facing towards the divider, and recorded continuously throughout the trial (see Fig. 1 for the coder's perspective). Sessions were viewed after their completion. At the moment the screen flashed, the coders recorded which of three possible zones the subject's shoulders were located in (Fig. 1). Zone 1 was the left-most part of the joint box (from the camera's perspective), where the joystick was located. Zone 2 was the middle section of the box, surrounding the divider. Zone 3 was the right-most section of the box, where the pellets were dispensed. Coders also recorded whether the monkeys moved from the joystick side of the box to the pellet side of the box during the delay interval that separated the sample and the presentation of comparison stimuli. In the no-sample trials, this interval separated the trial-start selection and the presentation of the comparison stimuli. If the subject's shoulders crossed the divider at any point during the delay interval, then it was recorded that the monkey moved early. The computer task neither detected such early movement nor adjusted the DMTS task in response to it. A natural consequence of being away from the monitor during the retention interval was occasionally missing the presentation of comparison stimuli. 
Coders recorded whether the monkeys prematurely moved during the delay interval (before the choice options had even been presented) – this would be an erroneous movement as the trial had not yet been completed and thus, no rewards were delivered at this point in the trial. These movements were included in the analysis as an index of a monkey's disposition to move around in the enclosure. For inter-rater reliability, a sample of videos was coded by an individual who was trained to code the videos but was naïve to the experimental question. Three randomly chosen sessions were scored from each of the five monkeys. Cohen's kappa (Hallgren, 2012) was used to test the reliability between the two raters. Coder interrater reliability between two raters (1400 observations) was strong for both the scoring on the premature movements (95% agreement; κ = 0.89, p < .001) and the confidence movements (93% agreement; κ = 0.87, p < .001).
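As a reading aid for the reliability statistic above, the following is a minimal sketch of how Cohen's kappa can be computed from two coders' labels for the same trials. The function name and the ten example labels are hypothetical and are not data from the study; the authors' own scripts are in the Supplementary materials.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical codes of the same trials."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equally long, non-empty lists of codes")
    n = len(rater_a)
    # Observed agreement: proportion of trials given the same code.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per code.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used one identical code throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codes for ten trials (illustration only).
coder_1 = ["go", "go", "no go", "no go", "go",
           "no go", "go", "go", "no go", "no go"]
coder_2 = ["go", "go", "no go", "go", "go",
           "no go", "go", "go", "no go", "no go"]
kappa = cohens_kappa(coder_1, coder_2)
```

In this toy example the coders agree on 9 of 10 trials and chance agreement is 0.5, which is why percent agreement and kappa are reported side by side in the text: kappa discounts the agreement expected from the marginal code frequencies alone.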

2.3.4. Experimental Phase 1

The first experimental phase tested monkey metacognition by inserting a more challenging delayed matching-to-sample (DMTS) trial designed to reduce levels of confidence necessary to assess metacognitive states. This phase was composed of up to 100-trial sessions that were identical to the procedure described in Training Phase 3, except now operated as a DMTS task. In the DMTS the sample stimulus disappeared after it was contacted, and a variable delay interval occurred before the four comparison stimuli were presented. The possible delay intervals were 1, 2, 4, 7, and 10 s. The delay interval was randomly selected on each trial. For ease of later coding of the videos, at the start of each trial, the trial number flashed on the screen for 1 s. Experimental Phase 1 continued until at least 500 trials were completed. Sessions where the monkeys did not complete at least 40 trials were not included in analysis and did not count towards the 500-trial requirement. This was because those sessions were considered to reflect low motivation on the monkey's part to engage the task.

2.3.5. Experimental Phase 2

The second experimental phase was designed to assess confidence movements on “impossible” trials where a sample was not presented and the monkey had to guess. The procedure was similar to the first experimental phase, except that: (1) each trial started with a rectangular stimulus displaying the trial number (to aid later coding of the trials) and the monkeys had to contact that stimulus with the cursor to begin the trial. Immediately after the cursor contacted the trial-start stimulus, it disappeared, and the sample was presented in the middle of the screen and remained until the monkey moved the cursor to contact it. After contacting the sample stimulus, it disappeared, and four comparison stimuli were presented in each corner of the screen. (2) Approximately 20% of the trials did not include a sample.
On no-sample trials, a 1-s delay occurred after the monkeys contacted the trial-start
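The trial structure of the two experimental phases can be sketched as a small schedule generator. This is an illustration under stated assumptions, not the task software used in the study: the text gives the delay values, the random per-trial delay selection, the 1-s delay on no-sample trials, and the approximate 20% no-sample rate, but it does not say how no-sample trials were interleaved, so independent per-trial draws are assumed here. All names are hypothetical.

```python
import random

DELAYS_S = (1, 2, 4, 7, 10)  # retention intervals listed in the text
NO_SAMPLE_PROB = 0.20        # "approximately 20%" of Phase 2 trials


def make_trial(rng, phase):
    """Return one trial specification as a dict."""
    # Assumption: no-sample trials are drawn independently on each trial.
    if phase == 2 and rng.random() < NO_SAMPLE_PROB:
        # No sample is shown; a 1-s delay precedes the comparison stimuli.
        return {"sample": False, "delay_s": 1}
    # Sample trial: the delay is selected at random on each trial.
    return {"sample": True, "delay_s": rng.choice(DELAYS_S)}


def make_session(n_trials=100, phase=2, seed=0):
    """Build a session of up to n_trials trial specifications."""
    rng = random.Random(seed)
    return [make_trial(rng, phase) for _ in range(n_trials)]


phase1_session = make_session(n_trials=100, phase=1, seed=1)
phase2_session = make_session(n_trials=1000, phase=2, seed=1)
```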

2.5. Data analysis

Sessions with 50 or fewer trials completed were not included in the analysis because this typically reflects lack of motivation by our monkeys to engage in a computerized task. Sessions were repeated until monkeys completed 5 sessions of > 40 trials in Experimental Phase 1 and also in Experimental Phase 2. Because of incomplete sessions, Griffin repeated two sessions, Nkima repeated five sessions, Logan repeated one session, Liam repeated three sessions, and Nala repeated three sessions. Table 1 shows the total number of trials completed per subject. Only the trials where the monkeys were coded to have remained at the computer station during the retention interval (i.e., no premature responses) were included in the analyses. See Supplementary materials for data including performances with premature responses.
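The trial-level inclusion rule above, together with the zone-based coding used for the Confidence model (zone 1 = “no go”, zone 3 = “go”, zone 2 excluded), can be sketched as follows. The record layout (`zone`, `premature`) and function names are hypothetical; the actual data files are in the Supplementary materials.

```python
def code_confidence(zone):
    """Map the zone coded at the screen flash to a confidence label."""
    if zone == 1:
        return "no go"  # still at the joystick
    if zone == 3:
        return "go"     # already at the pellet dispenser
    if zone == 2:
        return None     # ambiguous middle zone, dropped from the model
    raise ValueError(f"unknown zone: {zone!r}")


def analysable_trials(trials):
    """Keep trials without a premature movement and with a clear zone code."""
    kept = []
    for trial in trials:
        if trial["premature"]:
            continue  # monkey left the station during the retention interval
        label = code_confidence(trial["zone"])
        if label is None:
            continue
        kept.append({**trial, "confidence": label})
    return kept


# Hypothetical coded trials (illustration only).
coded = [
    {"trial": 1, "zone": 1, "premature": False},
    {"trial": 2, "zone": 3, "premature": False},
    {"trial": 3, "zone": 2, "premature": False},
    {"trial": 4, "zone": 3, "premature": True},
]
kept = analysable_trials(coded)
```

Treating zone 2 as “no go” instead of dropping it, the alternative the authors report in the Supplementary materials, would only require changing the `zone == 2` branch.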


Fig. 2. The proportion of correct DMTS responses as a function of the retention interval (obtained from the Sample-Accuracy model) and no sample (NS) trials (obtained from the NoSample-Accuracy model). The bars represent the group averages (fixed effects). The lines represent the individual subject estimates (random effects). Error bars represent the 95% CI for the fixed effects. Chance level of accuracy was at 0.25 in all cases.

predicted the probability of a confidence movement as a function of trial outcome and whether the trial had a sample or not (1-s delay sample trial vs. no-sample trial). The Response-Latency model analyzed the time it took to make a joystick-selection response (timed from the onset of the presentation of choice stimuli until a choice was selected). In the full model, the average response time was analyzed as a function of whether the monkey expressed a confidence movement (“go” or “no go”), the trial outcome (correct or incorrect), and the retention interval (1, 2, 4, 7, and 10 s) and their interactions. Response latencies longer than 120-s were excluded from the analysis given that these were extreme delays before responding (< 1% of the data were removed with this criterion). The model specified a gamma error distribution for the response time data to account for the positively skewed error distribution. The effect of no sample trials was evaluated with a model that only used 1-s retention interval trial data from Phase 2 and predicted the response latency as a function of confidence movement, trial outcome, and whether the trial had a sample or not (1-s delay sample trial vs. no-sample trial).
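The latency exclusion rule and the motivation for a gamma error distribution can be illustrated with a short sketch. It applies the 120-s cutoff and then computes method-of-moments gamma parameters as a quick descriptive check. This is a deliberately simpler technique than the mixed-effects gamma model fit in the paper, and the function names and example latencies are hypothetical.

```python
from statistics import mean, variance

MAX_LATENCY_S = 120.0  # latencies longer than this were excluded


def trim_latencies(latencies):
    """Drop extreme latencies and report the proportion removed."""
    kept = [x for x in latencies if x <= MAX_LATENCY_S]
    removed = 1 - len(kept) / len(latencies)
    return kept, removed


def gamma_moments(latencies):
    """Method-of-moments gamma estimates (shape, scale); not a GLMM fit."""
    m = mean(latencies)
    v = variance(latencies)
    return m * m / v, v / m


# Hypothetical response latencies in seconds (illustration only).
raw = [2.0, 3.0, 4.0, 130.0]
kept, removed = trim_latencies(raw)
shape, scale = gamma_moments(kept)
```

A gamma distribution is defined only for positive values and is right-skewed, which matches the usual shape of response-time data better than a normal error term.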

Generalized linear mixed effects models (lme4 package in R; Bates, Maechler, Bolker, & Walker, 2015) were fit to the accuracy data, confidence-movement data, and response-latency data (i.e., the time it took a monkey to choose a comparison option once those were presented). All independent variables were treated as categorical variables and were effect-coded to avoid issues with collinearity in the interaction terms. The models included fixed effects to determine the generality of the investigated factors and random effects at the subject level in order to characterize the individual differences between the monkeys and to satisfy the repeated-measures assumption (Gelman & Hill, 2007). Tukey HSD post-hoc tests (using the lsmeans package in R) obtained pair-wise comparisons between the individual levels within the factors (α = 0.05). Each model was tested against a null model (a model that excluded a factor of interest) and likelihood ratio tests were reported to characterize the significance and magnitude of the effects (Wagenmakers & Farrell, 2004). The Accuracy model evaluated trial accuracy for all of the sample trials and assessed task accuracy as a function of experimental phase (phase 1 and 2) and retention interval (delay duration in the DMTS; 1, 2, 4, 7, and 10 s). A binomial error distribution was specified for the accuracy data (1 = correct, 0 = incorrect). The model was functionally a repeated measures logistic regression with factors for task monitoring, delay, and their interactions included in the full model. The full model was compared to two null models, one model excluding the task monitoring factor and the other model excluding the delay factor. The effect of no-sample trials was evaluated by modeling trial accuracy between sample and no-sample trials. Only data from test sessions in Experimental Phase 2 with 1-s delays were evaluated. The full model predicted task accuracy against sample and no-sample trials and the task monitoring factor.
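Effect coding, as mentioned above, replaces a factor with k levels by k − 1 columns that sum to zero across levels, so that each coefficient is a deviation from the grand mean rather than from a reference level. The sketch below shows one common version (the last level coded −1 in every column, as in R's sum-to-zero contrasts). Which level the authors assigned the −1 row is not stated, so that choice is an assumption, and the function name is hypothetical.

```python
def effect_code(levels):
    """Sum-to-zero coding: k levels -> k - 1 columns per level."""
    k = len(levels)
    codes = {}
    for i, level in enumerate(levels):
        if i < k - 1:
            # Indicator column for this level.
            codes[level] = [1 if j == i else 0 for j in range(k - 1)]
        else:
            # Assumed convention: the last level is -1 in every column.
            codes[level] = [-1] * (k - 1)
    return codes


delay_codes = effect_code([1, 2, 4, 7, 10])           # retention interval (s)
outcome_codes = effect_code(["correct", "incorrect"])  # trial outcome
```

Because every column sums to zero over the levels, main-effect columns and their interaction products are far less correlated than under dummy (0/1) coding, which is the collinearity concern the text refers to.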
Confidence movements were assessed by classifying all trials where monkeys were in zone 1 (at the computer-joystick location) as “no go” trials and zone 3 (at the pellet dispenser) as “go” trials (zone 2 trials were not included in this model because they may have reflected intermediate confidence or they may have reflected a general bias towards some small amount of movement by monkeys). Table 1 shows the percentage of zone 2 trials removed. More than 25% of the data was removed from monkeys Nala and Liam; however, the results did not change if the zone 2 trials were counted as no-go trials (see Supplementary materials). The Confidence model analyzed sample trials and assessed the proportion of trials with a confidence movement as a function of the retention interval (1, 2, 4, 7, and 10 s) and trial outcome (correct or incorrect). The full model was a repeated measures logistic regression (go = 1, no go = 0) with task monitoring, retention interval, trial outcome, and their interactions as factors. Three null models predicted the confidence movements with reduced models eliminating a factor of interest. The effect of no sample trials was evaluated with a model that only used 1-s retention interval trial data from Phase 2 and

3. Results

3.1. Task accuracy

The full Accuracy model including the experimental phase factor was not superior to a reduced model excluding that factor (χ²(11) = 11.99, p = .41). The model including the delay interval factor was over 1000 times more likely than the null model (χ²(26) = 104, p < .001). Fig. 2 shows that accuracy systematically decreased across the delay intervals. Post-hoc tests revealed that accuracy was significantly higher at shorter delay intervals than longer delay intervals for the following comparisons: 1-s vs. 4-s (z = −3.17, p = .01), 1-s vs. 7-s (z = −6.02, p < .001), 1-s vs. 10-s (z = −5.91, p < .001), 2-s vs. 7-s (z = 5.13, p < .001), 2-s vs. 10-s (z = 4.80, p < .001), 4-s vs. 7-s (z = 3.28, p = .008), and 4-s vs. 10-s delays (z = 3.15, p = .01). Accuracy for no-sample trials was lower than for the 1-s delay sample trials in Experimental Phase 2 (24% vs. 81%, z = 11.20, p < .001).

3.2. Confidence movements

The Confidence model including the experimental phase factor did not account for the data better than the reduced model excluding that factor (χ²(17) = 17.47, p = .42). The model with the retention interval factor did not account for the data better than the model excluding that factor (χ²(26) = 22.39, p = .66). However, the model including the outcome factor (correct/incorrect) was over 1000 times more likely than the null model (χ²(3) = 38.76, p < .001). Fig. 3 shows that the


Fig. 3. The proportion of confidence movements (“go”) as a function of trial interval duration and no-sample trials. Bars represent the group averages (fixed effects) for correct/incorrect trial outcomes. Lines represent the individual subject estimates (random effects). Error bars represent the 95% CI of the fixed effects.

between a premature movement and a confidence movement, r (23) = 0.28, p = .16. The monkeys' tendency to go to the dispenser following an incorrect response may reflect the monkeys' overall general bias to make that movement, whereas the movements following a correct response were affected by their performance accuracy (and putatively their metacognitive awareness of their accuracy).

probability of a confidence movement was greater prior to a correct outcome compared to an incorrect outcome (z = 3.04, p = .002). There was a delay × outcome interaction where there was no difference between outcomes at the 10-s retention interval (z = −2.36, p = .01). A follow-up model comparing accuracy across delays and confidence movements (the same as the Accuracy model but with the confidence movement as a predictor; see Supplementary materials for details) revealed that accuracy was approximately 15% higher for go vs. no-go trials (z = −4.88, p < .001). There were no confidence movement by delay interactions (ps > .15), consistent with the confidence movement model's results (Fig. 3). When comparing confidence movements between sample and no-sample trials, there was no main effect of outcome (z = 1.13, p = .25) or sample (z = 0.03, p = .97). There was a significant interaction: compared to trials with samples, the monkeys in no-sample trials were less likely to make a confidence movement for a correct (i.e., correctly guessed) trial and more likely to make a confidence movement for an incorrect trial (z = 2.16, p = .03). Observing the random-effect predictions for each individual monkey revealed that each monkey had biases to move (e.g., Griffin) or stay (e.g., Nala). These biases might simply be individual differences in the monkeys' tendency to move. If this was the case, then we would predict that the proportion of premature movements to zone 3 during the retention interval would correlate with the proportion of confidence movements. A model predicting the proportion of premature movements (as a function of retention interval and trial outcome) was run and the individual differences (random effects) were correlated (Pearson correlation) against the individual differences in proportion of confidence movements for correct and incorrect responses and retention interval.
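The correlation step described above can be sketched with a plain Pearson coefficient and the t statistic used to test it. A result reported as r(23) has 23 degrees of freedom, which corresponds to 25 paired values (df = n − 2). How those 25 pairs were assembled is not spelled out in this section, so the sketch takes the paired values as given. Function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, df):
    """t statistic for testing r against zero, with df = n - 2."""
    return r * math.sqrt(df / (1 - r * r))


# t statistic implied by a correlation of 0.46 with 23 degrees of freedom.
t_stat = t_from_r(0.46, 23)
```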
For trials where the monkeys made an incorrect choice, there was a positive correlation between premature movements and confidence movements, r(23) = 0.46, p = .02. However, when the monkeys made a correct response, there was no statistical relationship

3.3. Response latencies

The full Response-Latency model was over 1000 times more likely than the models excluding the outcome factor (χ²(28) = 111.07, p < .001) and the confidence movement factor (χ²(28) = 59.93, p < .001). The full model was 60 times more likely than the reduced model excluding the retention interval factor (χ²(58) = 153.07, p < .001). There were no significant main fixed effects (ps > .08), but there were several significant interactions in the full model. The response latency tended to be shorter in trials where the monkeys made a correct response and made a confidence movement (confidence movement × outcome; z = 4.83, p < .001) and this relationship was greater in the 1-s retention interval than the 7-s retention interval (confidence movement × outcome × delay; z = −3.24, p = .001). Also, compared to the 1-s delay interval trials, the 10-s retention interval trials showed a greater tendency for there being longer response latencies prior to correct choices than incorrect choices (outcome × delay; z = 2.50, p = .01). Fig. 4 shows the response times as a function of correct/incorrect trial outcomes, go/no go confidence movements, and the retention interval. Overall, response latencies were longer prior to an incorrect response for trials where a confidence movement (i.e., “go”) was observed. Planned comparisons (only comparing latencies between correct and incorrect outcomes) revealed that for “go” trials, the response latency was shorter prior to correct (vs. incorrect) responses at 1-s (2.60 s vs. 4.06 s; z = −3.63, p < .001), 2-s (2.53 s vs. 3.83 s; z = −3.44, p < .001), and 4-s (2.45 s vs. 3.58 s; z = −3.06, p = .002) retention intervals, but not at the 7-s (2.59 s vs. 2.74 s; z = −0.47, p = .63) and 10-s (2.85 s vs. 3.45 s; z = −1.61, p = .10) retention intervals. There were no differences in response latency between correct and incorrect responses for the “no-go” trials across all retention intervals (ps > .05).

Fig. 4. The average response latency (sec) as a function of the coded confidence-movement (“go” vs. “no go”) across the retention interval and no-sample (NS) trials. Bars represent the group averages (fixed effects) for correct and incorrect trial responses. Lines represent the individual subject estimates (random effects). Error bars represent the 95% CI for the fixed effects.

For no-sample trials, there were no statistically significant differences in the main effects for go/no-go movements (z = 0.76, p = .44), outcome (z = 0.4, p = .68), and sample/no-sample trials (z = −1.12, p = .25). But there was an outcome × sample/no-sample trials interaction (z = −3.09, p = .001) where no-sample trials did not have the longer latencies prior to an incorrect choice that were observed for the sample trials.
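The χ² values reported throughout the Results come from likelihood ratio tests between a full model and a reduced model. As a hedged illustration of the arithmetic, the sketch below computes the test statistic from two log-likelihoods and its upper-tail p-value using the closed-form chi-square survival function for integer degrees of freedom. The function names and the example log-likelihoods are hypothetical; the authors' R scripts are in the Supplementary materials.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        # Even df: finite Poisson-type sum.
        total = sum(z ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-z) * total
    # Odd df: start from the df = 1 case (erfc) and add half-integer terms.
    total = sum(z ** (k - 0.5) / math.gamma(k + 0.5)
                for k in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(z)) + math.exp(-z) * total


def likelihood_ratio_test(loglik_full, loglik_reduced, df):
    """Return (chi-square statistic, p-value) for nested models."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods (illustration only).
stat, p_value = likelihood_ratio_test(-100.0, -103.0, df=2)
# The outcome-factor comparison reported as chi-square(3) = 38.76.
p_outcome = chi2_sf(38.76, 3)
```

Here `df` is the difference in the number of estimated parameters between the two models, which is why the reported degrees of freedom vary with the factor removed.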

than correctly-completed sample trials but slightly more than on incorrect sample trials. For the movement data, the monkeys did engage in a number of confidence movements in these critical no-sample trials (about half of the opportunities), reflecting a proclivity for movement across conditions which may be reflective of the behavioral repertoire of capuchin monkeys. However, it also is important to note that there was clearly the expected pattern of differential movement as a function of trial outcome if monkeys were monitoring confidence in the matching performance, with more movements following correct versus incorrect trials for sample (1-s delay) trials. The latency data (reflecting the time from the onset of the presentation of choice stimuli until a choice was selected) produced an interesting pattern that we did not expect and that is not easily explained. For the shorter retention interval (i.e., “easy”) trials, when the monkeys displayed a confidence movement, the monkeys had differentially longer response latencies (> 3.5 s) prior to an incorrect response. However, for trials without a confidence movement, the monkeys often had a short response latency (< 3 s) and trials longer than 5 s did not have latencies that differed between correct and incorrect outcomes. This pattern of responding might reflect a dissociation between trials in which the monkeys knew that they knew versus trials on which they perhaps guessed or were less certain that they were correct. The reasoning behind this claim would be as follows: The no-go trials might have produced equally fast latencies because the monkey “knew that they did not know” and just responded promptly as a guess and then remained at the station as an expression of low confidence in that guess. This conjecture is further supported by the equivalently fast response latencies for no-sample trials where accuracy was at chance.
The go trials prior to a correct outcome produced short latencies because the monkeys “knew that they knew” and had confidence in their response. The go trials prior to the incorrect outcome had a long latency because the monkeys were perhaps on the threshold of knowing and the latency reflected the consideration of their response. The absence of this effect at longer retention-interval trials might be due to the greater difficulty in the trial, eliminating the threshold sense of knowing. It must be stressed that this is not an authoritative interpretation of these data; rather, it is offered as a possible interpretation for future studies to consider when investigating the cognitive mechanisms moderating the relationship between confidence movements and response latencies. The latency data do provide one clear and important interpretative outcome: the monkeys' confidence movements were not strongly associated with short response latencies (and vice versa for long latencies). This is important because, if this pattern emerged, it might suggest an alternative (non-metacognitive) interpretation of the confidence movements. More specifically, the monkeys had long latencies because of the lack of immediate recollection of the correct stimulus item and then learned (associatively) that long latencies typically preceded an unrewarded trial and adjusted their movement behavior accordingly (Smith et al., 2012). This is an important finding in that the current paradigm not only captures instances of uncertainty (“no-go” responses) but also instances of certainty (“go” responses and response latency data described above). This distinction is important in that the majority of comparative paradigms used previously with capuchin monkeys exclusively relied upon the metacognitive monitoring of uncertainty through the use of the escape response (e.g., Beran et al., 2009) or the option to re-study information when facing difficulty (e.g., Basile et al., 2009).
Thus, paradigms that provide subjects (both animal and pre-verbal humans) with the opportunity to indicate both uncertainty and certainty in their responding provide a vital tool in the study of metacognition. In the current task, monkeys advanced fairly rapidly through training to the experimental phases once they adjusted to receiving computer rewards at a distant location. In light of this, it is important to note that we selected a primary cognitive task (DMTS) that all monkeys were highly familiar with; thus, the monkeys were not required to divide attention between learning a new task and acclimating to the novel

4. Discussion

To assess the metacognitive ability that underlies confidence ratings of nonhuman animals requires an objective metric that can demonstrate variable performance. Without variability in performance, it is difficult to assess whether confidence aligns with that performance. In our study, capuchin monkeys' performance on the delayed matching-to-sample (DMTS) task declined as a function of retention interval with more errors at progressively longer delays. For no-sample trials, performance was at chance as predicted and as necessary to evaluate confidence movements independently of brief retention intervals (further discussed below). With regard to metacognitive monitoring, in trials with a sample, monkeys were significantly more likely to move prior to receiving computerized feedback on correct trials versus incorrect trials. Thus, monkeys responded on the basis of perceived performance and anticipated outcome, and these confidence movements tracked accuracy in the present memory task. The probability of making a confidence movement for incorrect trials was correlated with the probability of making a premature movement during the retention interval (i.e., prior to making a response to the trial). Thus, movements appeared to be a joint function of the monkeys' anticipation of reward and their individual disposition to move around the test environment. We also examined confidence movements as a function of delay length for the DMTS task (1-s, 2-s, 4-s, 7-s, and 10-s). Capuchin monkeys did not show a differential effect of the delay length on go vs. no-go movements. This is an interesting difference in comparison to chimpanzees (Beran et al., 2015) and children (James et al., under review) who, like capuchins, demonstrate more confidence movements for correct trials versus incorrect trials but also show an effect of delay length with more movements for shorter delays.
This pattern of results observed with this movement-based paradigm mirrors the metacognitive literature among nonhuman primates, in which capuchin monkeys are capable of metacognitive monitoring but appear to do so with less precision and less flexibility than is seen among chimpanzees (Beran et al., 2014, 2016; Perdue et al., 2015; see Smith et al., 2009, Smith et al., 2018). This species-level difference may reflect a finer-tuned metacognitive monitoring ability in young children and the great apes versus New World monkeys. The addition of more species, especially additional New World and Old World primates, using similar paradigms is needed to fully address these differences. No-sample trials are a critical addition to a meta-memory task in that subjects cannot use the retention interval as a cue for trial difficulty and subsequent performance. Thus, a subject engaging in metamemory should respond to no-sample trials by not moving to the reward dispenser as, by definition, subjects will perform at chance level. However, a subject with a history of reinforcement for short interval trials with a sample may falsely respond to no-sample trials as if they were associated with food reward, thus erroneously moving to the reward dispenser in anticipation of a reward. Confidence movement data following these trials in the current study yielded interesting and slightly mixed results, in that monkeys moved slightly less in no-sample trials


reward delivery setup. Future studies that employ novel primary tasks will indicate whether capuchin monkeys maintain the current level of responding in the face of more difficult tasks or if metacognitive responding emerges secondary to learning the primary task. It would be an important next step to assess generalization of this form of confidence-movement response in the same way that other research teams have shown application of other measures of metacognition in monkeys (e.g., Kornell et al., 2007; Washburn, Smith, & Shields, 2006). What we can conclude from the current study is that capuchin monkeys can demonstrate differential monitoring as a function of task performance, albeit in a less precise manner than is observed with chimpanzees and children in the same paradigm. When monkeys were correct, they were more likely to move towards the reward delivery location than when they were incorrect, and they were less likely to move towards the reward for correct no-sample trials than correct sample trials. However, unlike chimpanzees, capuchin monkeys did not show fewer confidence movements on incorrect no-sample trials when the monkeys should have had low metacognitive confidence in their responses. Thus, capuchin metacognitive performances are comparatively less situationally appropriate than the performances of chimpanzees in this task. We conclude by noting an important point made by Shields et al. (2005): “A burden falls on the experimenter to arrange carefully the methods of a confidence-monitoring experiment so that animal participants will be strongly motivated to adopt a metacognitive strategy if they are able. This can be a complex matter to judge in planning an experiment, and it may be a common problem for task grammars to motivate a metacognitive strategy too weakly for the animal to struggle to adopt it” (p. 184).
We believe the confidence movement paradigm affords this motivation to animals and could perhaps be used with numerous species to broaden the assessment of metacognitive abilities in nonhuman animals.

Basile, B. M., Hampton, R. R., Suomi, S. J., & Murray, E. A. (2009). An assessment of memory awareness in tufted capuchin monkeys (Cebus apella). Animal Cognition, 12, 169–180.
Basile, B. M., Schroeder, G. R., Brown, E. K., Templer, V. L., & Hampton, R. R. (2015). Evaluation of seven hypotheses for metamemory performance in rhesus monkeys. Journal of Experimental Psychology: General, 144, 85–102.
Bates, D., Maechler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67, 1–48. https://doi.org/10.18637/jss.v067.i01.
Belger, J., & Bräuer, J. (2018). Metacognition in dogs: Do dogs know they could be wrong? Learning & Behavior, 46, 398–413.
Beran, M. J., Brandl, J., Perner, J., & Proust, J. (Eds.). (2012). Foundations of metacognition. Oxford, UK: Oxford University Press.
Beran, M. J., Perdue, B. M., Church, B. A., & Smith, J. D. (2016). Capuchin monkeys (Cebus apella) modulate their use of an uncertainty response depending on risk. Journal of Experimental Psychology: Animal Learning and Cognition, 42, 32–43.
Beran, M. J., Perdue, B. M., Futch, S. E., Smith, J. D., Evans, T. A., & Parrish, A. E. (2015). Go when you know: Chimpanzees’ confidence movements reflect their responses in a computerized memory task. Cognition, 142, 236–246.
Beran, M. J., Perdue, B. M., & Smith, J. D. (2014). What are my chances? Closing the gap in uncertainty monitoring between rhesus monkeys (Macaca mulatta) and capuchin monkeys (Cebus apella). Journal of Experimental Psychology: Animal Learning and Cognition, 40, 303–316.
Beran, M. J., & Smith, J. D. (2011). Information seeking by rhesus monkeys (Macaca mulatta) and capuchin monkeys (Cebus apella). Cognition, 120, 90–105.
Beran, M. J., Smith, J. D., Coutinho, M. V., Couchman, J. J., & Boomer, J. (2009). The psychological organization of “uncertainty” responses and “middle” responses: A dissociation in capuchin monkeys (Cebus apella). Journal of Experimental Psychology: Animal Behavior Processes, 35, 371–381.
Beran, M. J., Smith, J. D., & Perdue, B. M. (2013). Language-trained chimpanzees (Pan troglodytes) name what they have seen but look first at what they have not seen. Psychological Science, 24, 660–666.
Brown, E. K., Basile, B. M., Templer, V. L., & Hampton, R. R. (2019). Dissociation of memory signals for metamemory in rhesus monkeys (Macaca mulatta). Animal Cognition, 22, 331–341.
Brown, E. K., Templer, V. L., & Hampton, R. R. (2017). An assessment of domain-general metacognitive responding in rhesus monkeys. Behavioural Processes, 135, 132–144.
Call, J. (2010). Do apes know that they could be wrong? Animal Cognition, 13, 689–700.
Call, J., & Carpenter, M. (2001). Do apes and children know what they have seen? Animal Cognition, 4, 207–220.
Carruthers, P. (2008). Meta-cognition in animals: A skeptical look. Mind & Language, 23, 58–89.
Carruthers, P. (2009). How we know our own minds: The relationship between mindreading and metacognition. Behavioral and Brain Sciences, 32, 121–182.
Castro, L., & Wasserman, E. A. (2013). Information-seeking behavior: Exploring metacognitive control in pigeons. Animal Cognition, 16, 241–254.
Comstock, G., & Bauer, W. A. (2018). Getting it together: Psychological unity and deflationary accounts of animal metacognition. Acta Analytica, 33, 431–451.
Crystal, J. D. (2014). Where is the skepticism in animal metacognition? Journal of Comparative Psychology, 128, 152–154.
Crystal, J. D., & Foote, A. L. (2009). Metacognition in animals: Trends and challenges. Comparative Cognition and Behavior Reviews, 4, 54–55.
Dunlosky, J., & Bjork, R. A. (Eds.). (2008). Handbook of memory and metamemory. New York: Psychology Press.
Evans, T. A., Beran, M. J., Chan, B., Klein, E. D., & Menzel, C. R. (2008). An efficient computerized testing method for the capuchin monkey (Cebus apella): Adaptation of the LRC-CTS to a socially housed nonhuman primate species. Behavior Research Methods, 40, 590–596.
Flavell, J. H. (1979). Metacognition and cognitive monitoring: A new area of cognitive-developmental inquiry. American Psychologist, 34, 906–911.
Foote, A. L., & Crystal, J. D. (2007). Metacognition in the rat. Current Biology, 17, 551–555.
Foote, A. L., & Crystal, J. D. (2012). “Play it again”: A new method for testing metacognition in animals. Animal Cognition, 15, 187–199.
Fujita, K. (2009). Metamemory in tufted capuchin monkeys (Cebus apella). Animal Cognition, 12, 575–585.
Gelman, A., & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. Cambridge University Press.
Hallgren, K. A. (2012). Computing inter-rater reliability for observational data: An overview and tutorial. Tutorial in Quantitative Methods for Psychology, 8, 23–34.
Hampton, R. R. (2001). Rhesus monkeys know when they remember. Proceedings of the National Academy of Sciences, 98, 5359–5362.
Hampton, R. R. (2009). Multiple demonstrations of metacognition in nonhumans: Converging evidence or multiple mechanisms? Comparative Cognition and Behavior Reviews, 4, 17–28.
Hampton, R. R., Zivin, A., & Murray, E. A. (2004). Rhesus monkeys (Macaca mulatta) discriminate between knowing and not knowing and collect information as needed before acting. Animal Cognition, 7, 239–246.
Inman, A., & Shettleworth, S. J. (1999). Detecting metamemory in nonverbal subjects: A test with pigeons. Journal of Experimental Psychology: Animal Behavior Processes, 25, 389–395.
Iwasaki, S., Watanabe, S., & Fujita, K. (2013). Do pigeons (Columba livia) seek information when they have insufficient knowledge? Animal Cognition, 13, 211–221.
Kirk, C. R., McMillan, N., & Roberts, W. A. (2014). Rats respond for information: Metacognition in a rodent? Journal of Experimental Psychology: Animal Learning and Cognition, 40, 249–259.

CRediT authorship contribution statement

Travis R. Smith: Data curation, Formal analysis, Investigation, Methodology, Project administration, Visualization, Writing - original draft, Writing - review & editing. Audrey E. Parrish: Conceptualization, Methodology, Writing - original draft, Writing - review & editing. Courtney Creamer: Investigation, Writing - review & editing. Mattea Rossettie: Data curation, Investigation, Methodology, Project administration, Writing - original draft, Writing - review & editing. Michael J. Beran: Conceptualization, Funding acquisition, Methodology, Project administration, Resources, Software, Supervision, Writing - original draft, Writing - review & editing.

Declaration of competing interest

None.

Acknowledgements

This research was supported by the National Science Foundation Grant BCS 1552405. We thank the animal care and enrichment team at the Language Research Center.

Appendix A. Supplementary material

A supplementary DOCX file (including the R scripts and supplementary analyses) and the raw data XLS file are shared at the Open Science Framework repository: https://osf.io/45ew8.

References

Adams, A., & Santi, A. (2011). Pigeons exhibit higher accuracy for chosen memory tests than for forced memory tests in duration matching-to-sample. Learning & Behavior, 39, 1–11.


Kornell, N. (2009). Metacognition in humans and animals. Current Directions in Psychological Science, 18, 11–15.
Kornell, N. (2014). Where is the “meta” in animal metacognition? Journal of Comparative Psychology, 128, 143–149.
Kornell, N., Son, L. K., & Terrace, H. S. (2007). Transfer of metacognitive skills and hint seeking in monkeys. Psychological Science, 18, 64–71.
Le Pelley, M. E. (2012). Metacognitive monkeys or associative animals? Simple reinforcement learning explains uncertainty in nonhuman animals. Journal of Experimental Psychology: Learning, Memory, and Cognition, 38, 686–708.
Malassis, R., Gheusi, G., & Fagot, J. (2015). Assessment of metacognitive monitoring and control in baboons (Papio papio). Animal Cognition, 18, 1347–1362.
Marsh, H. L., & MacDonald, S. E. (2012a). Information seeking by orangutans: A generalized search strategy? Animal Cognition, 15, 293–304.
Marsh, H. L., & MacDonald, S. E. (2012b). Orangutans (Pongo abelii) “play the odds”: Information-seeking strategies in relation to cost, risk, and benefit. Journal of Comparative Psychology, 126, 263–278.
Metcalfe, J., & Shimamura, A. P. (Eds.). (1994). Metacognition: Knowing about knowing. Cambridge, MA: MIT Press.
Morgan, G., Kornell, N., Kornblum, T., & Terrace, H. S. (2014). Retrospective and prospective metacognitive judgments in rhesus macaques (Macaca mulatta). Animal Cognition, 17, 249–257.
Nakamura, N., Watanabe, S., Betsuyaku, T., & Fujita, K. (2011). Do birds (pigeons and bantams) know how confident they are of their perceptual decisions? Animal Behaviour, 14, 83–93.
Nelson, T. O. (Ed.). (1992). Metacognition: Core readings. Toronto: Allyn and Bacon.
Nelson, T. O., & Narens, L. (1990). Metamemory: A theoretical framework and new findings. In G. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory (Vol. 26, pp. 125–173). New York: Academic Press.
Paukner, A., Anderson, J. R., & Fujita, K. (2006). Redundant food searches by capuchin monkeys (Cebus apella): A failure of metacognition? Animal Cognition, 9, 110–117.
Perdue, B. M., Church, B. A., Smith, J. D., & Beran, M. J. (2015). Exploring potential mechanisms underlying the lack of uncertainty monitoring in capuchin monkeys. International Journal of Comparative Psychology, Article 28.
Roberts, W. A., Feeney, M. C., McMillan, N., MacPherson, K., Musolino, E., & Petter, M. (2009). Do pigeons (Columba livia) study for a test? Journal of Experimental Psychology: Animal Behavior Processes, 35, 129–142.
Schwartz, B. L. (Ed.). (2009). Applied metacognition. Cambridge: Cambridge University Press.
Shields, W. E., Smith, J. D., Guttmannova, K., & Washburn, D. A. (2005). Confidence judgments by humans and rhesus monkeys. Journal of General Psychology, 132, 165–186.
Smith, J. D. (2009). The study of animal metacognition. Trends in Cognitive Sciences, 13, 389–396.
Smith, J. D., Beran, M. J., Couchman, J. J., Coutinho, M. V. C., & Boomer, J. (2009). The curious incident of the capuchins. Comparative Cognition and Behavior Reviews, 4, 47–50.
Smith, J. D., Beran, M. J., Redford, J. S., & Washburn, D. A. (2006). Dissociating uncertainty responses and reinforcement signals in the comparative study of uncertainty monitoring. Journal of Experimental Psychology: General, 135, 282–297.
Smith, J. D., Couchman, J. J., & Beran, M. J. (2012). The highs and lows of theoretical interpretation in animal-metacognition research. Philosophical Transactions of the Royal Society B, 367, 1297–1309.
Smith, J. D., Coutinho, M. V. C., Church, B., & Beran, M. J. (2013). Executive-attentional uncertainty responses by rhesus monkeys (Macaca mulatta). Journal of Experimental Psychology: General, 142, 458–475.
Smith, J. D., Redford, J. S., Beran, M. J., & Washburn, D. A. (2010). Rhesus monkeys (Macaca mulatta) adaptively monitor uncertainty while multi-tasking. Animal Cognition, 13, 93–101.
Smith, J. D., Schull, J., Strote, J., McGee, K., Egnor, R., & Erb, L. (1995). The uncertain response in the bottlenosed dolphin (Tursiops truncatus). Journal of Experimental Psychology: General, 124, 391–408.
Smith, J. D., Shields, W. E., Schull, J., & Washburn, D. A. (1997). The uncertain response in humans and animals. Cognition, 62, 75–97.
Smith, J. D., Zakrzewski, A. C., & Church, B. A. (2016). Formal models in animal-metacognition research: The problem of interpreting animals’ behavior. Psychonomic Bulletin & Review, 23, 1341–1353.
Smith, T. R., Smith, J. D., & Beran, M. J. (2018). Not knowing what one knows: A meaningful failure of metacognition in capuchin monkeys. Animal Behavior and Cognition, 5, 55–67.
Sole, L. M., Shettleworth, S. J., & Bennett, P. J. (2003). Uncertainty in pigeons. Psychonomic Bulletin & Review, 10, 738–745.
Suda-King, C. (2008). Do orangutans (Pongo pygmaeus) know when they do not remember? Animal Cognition, 11, 21–42.
Suda-King, C., Bania, A. E., Stromberg, E. E., & Subiaul, F. (2013). Gorillas’ use of the escape response in object choice memory tests. Animal Cognition, 16, 65–84.
Sutton, J. E., & Shettleworth, S. J. (2008). Memory without awareness: Pigeons do not show metamemory in delayed matching to sample. Journal of Experimental Psychology: Animal Behavior Processes, 34, 266–282.
Templer, V. L., & Hampton, R. R. (2012). Rhesus monkeys (Macaca mulatta) show robust evidence for memory awareness across multiple generalization tests. Animal Cognition, 15, 409–419.
Templer, V. L., Lee, K. A., & Preston, A. J. (2017). Rats know when they remember: Transfer of metacognitive responding across odor-based delayed match-to-sample tests. Animal Cognition, 20, 891–906.
Vining, A. Q., & Marsh, H. L. (2015). Information seeking in capuchins (Cebus apella): A rudimentary form of metacognition? Animal Cognition, 18, 667–681.
Wagenmakers, E. J., & Farrell, S. (2004). AIC model selection using Akaike weights. Psychonomic Bulletin & Review, 11, 192–196.
Washburn, D. A., Smith, J. D., & Shields, W. E. (2006). Rhesus monkeys (Macaca mulatta) immediately generalize the uncertain response. Journal of Experimental Psychology: Animal Behavior Processes, 32, 185–189.
Watanabe, A., & Clayton, N. S. (2016). Hint-seeking behaviour of western scrub-jays in a metacognition task. Animal Cognition, 19, 53–64.
Yuki, S., & Okanoya, K. (2017). Rats show adaptive choice in a metacognitive task with high uncertainty. Journal of Experimental Psychology: Animal Learning and Cognition, 43, 109–118.
Zakrzewski, A. C., Perdue, B. M., Beran, M. J., Church, B. A., & Smith, J. D. (2014). Cashing out: The decisional flexibility of uncertainty responses in rhesus macaques (Macaca mulatta) and humans (Homo sapiens). Journal of Experimental Psychology: Animal Learning and Cognition, 40, 490–501.
Zentall, T. R., & Stagner, J. P. (2010). Pigeons prefer conditional stimuli over their absence: A comment on Roberts et al. (2009). Journal of Experimental Psychology: Animal Behavior Processes, 36, 506–509.
