Behavioural Brain Research 191 (2008) 137–140
Short communication
Impaired instrumental choice in crib-biting horses (Equus caballus)
Matthew Parker a,∗, Edward S. Redhead a, Deborah Goodwin a, Sebastian D. McBride b
a School of Psychology, University of Southampton, Highfield, Southampton, Hampshire SO17 1BJ, UK
b Institute of Rural Sciences, University of Wales, Llanbadarn Fawr, Aberystwyth, Ceredigion, SY23 3AL, Wales, UK
Article info
Article history: Received 11 October 2007; received in revised form 6 March 2008; accepted 10 March 2008; available online 16 March 2008
Keywords: Horse; Stereotypy; Striatum; Dopamine; Concurrent-chain schedules; Choice
Abstract
Horses displaying an oral stereotypy were tested on an instrumental choice paradigm to examine differences in learning from non-stereotypic counterparts. Stereotypic horses are known to have dysfunction of the dorsomedial striatum, and lesion studies have shown that this region may mediate response–outcome learning. The paradigm was specifically applied in order to examine learning that requires maintenance of response–outcome judgements. The non-stereotypic horses learned, over three sessions, to choose a more immediate reinforcer, whereas the stereotypic horses failed to do so. This suggests an initial behavioural correlate for dorsomedial striatum dysregulation in the stereotypy phenotype. © 2008 Elsevier B.V. All rights reserved.
∗ Corresponding author. Tel.: +44 23 8059 5078; fax: +44 23 8059 4597. E-mail address: [email protected] (M. Parker). doi:10.1016/j.bbr.2008.03.009
Spontaneous stereotypies are repetitive and often topographically invariant behaviours performed by a significant minority of domesticated and captive animal species [14]. Because of their relative scarcity in non-domestic feral or semi-feral populations, the presence of stereotypy is often highlighted as an indicator of poor welfare [26]. Stabled horses have been shown to display a range of oral and locomotory stereotypic behaviours, the most common of which is crib-biting [23]. Crib-biting is described as the horse grasping an object between the incisor teeth, inhaling air into the oesophagus, and contiguously emitting an audible ‘grunt’ [17].
Dopaminergic pathways within the central nervous system (CNS) have been identified as critical mediators of both the development and maintenance of psychostimulant- and environmentally induced (spontaneous) stereotypies [5]. Dysregulation of dopaminergic systems within the striatum, linked with stress-sensitisation [2,3,7,8,20], appears to be particularly pertinent to this behavioural condition. Recent work in the horse suggests that similar basal ganglia dysregulation is associated with the Equine Oral Stereotypy (crib-biting) Phenotype [16]. Crib-biting horses were reported to have significantly lower dopamine (DA) D1-like receptor subtypes in the caudate (dorsomedial striatum; DMS) and significantly higher D1-like and D2-like receptor subtypes in the nucleus accumbens (ventral striatum). Given the long-term potentiation characteristics of dopaminergic systems within the striatum [9,19], these results suggest a reduction in transmission of the medial aspect of the nigrostriatal pathway (substantia nigra to caudate/putamen) with concurrent upregulation of the mesoaccumbens pathway (ventral tegmental area to nucleus accumbens).
These results are interesting given the functional dissociation of the ventral striatum, DMS, and dorsolateral striatum (DLS; putamen) during the various stages of instrumental task learning. Specifically, the ventral striatum appears to be crucial in ‘goal-directed’ learning, possibly by mediating the effects of impending reinforcers [4,27]. The DMS has been linked to the subsequent response–outcome (R–O) process: this describes, upon acquisition of an instrumental task, the two-way dynamic of outcome and response interdependency, where the devaluation of the outcome of goal-directed behaviour affects subsequent responses when the same stimuli/states are applied [28]. Conversely, the DLS is considered to pertain to stimulus–response (S–R) associations, which are independent of outcome devaluation, are an artefact of overtraining, and are often referred to as ‘habitual’ responding [1,18,27]. It could be predicted, therefore, that the equine oral stereotypy phenotype will respond differently within the different phases of learning associated with free-operant instrumental choice, as compared to control animals.
The aim of this study, therefore, was to test this hypothesis by assessing response differences during the phase of learning (R–O) functionally associated with the DMS, which is dysregulated in the Equine Stereotypy Phenotype. This was carried out by employing a concurrent chain schedule paradigm (see Fig. 1 and Ref. [15]), which specifically offers the opportunity to examine the subject’s subjective choice regarding reinforcer
value. Within the context of concurrent chains choice, subjects are required to maintain R–O judgements in order to learn the reinforcement contingencies successfully. The procedure requires that subjects choose between two stimuli that predict different reinforcer outcomes, where one outcome will deliver reinforcement more immediately. A comparison of the concurrent chain performance of crib-biting and control horses was carried out over three 0.5-h learning sessions. It was predicted that crib-biters and non-crib-biters would differ in their discrimination between alternatives over the three sessions, with the crib-biters failing to learn the contingency effectively.
Eight horses (four crib-biting and four controls) were recruited for the study. All subjects were housed in individual stables, and were regularly given access to paddocks. All subjects were fed a diet of commercial low-energy feed, and had ad libitum access to hay and water. The stable bedding was rubber matting with additional wheat straw. The subjects’ crib-biting was assessed prior to data collection (see Ref. [11] for the detailed crib-biting screening procedure).
The operant device consisted of two lights, each comprising four multi-function LEDs, which could be off, red, green, or white. Two muzzle-plates were constructed from 15 cm × 7.5 cm steel, behind which were located two roller-actuator microswitches. The specified operation force for the switches was 100 g (∼1 N). Each depression of either muzzle-plate was programmed to log a response. The apparatus was designed around a Mitsubishi Alpha programmable logic chip (1998), and schedules were generated and response data logged via a C++ Builder programme. Data were logged into a spreadsheet via a computer in a room adjacent to the testing stable. Reinforcers were delivered via a stepper motor-controlled conveyor belt, located behind the food hopper (out of sight of the subjects).
The device was mounted on the stable wall, with the food receptor trough located below.
Horses are naturally trickle feeders, eating for between 16 and 18 h in a typical day [10,22]. Therefore, it was not considered necessary to deprive the subjects of food prior to the study, as their motivation to acquire food would be sufficiently high regardless of a period of pre-trial deprivation. The reinforcers used were 15 g of pelleted feed (Dodson and Horrell Pasture Nuts®). To our knowledge, preference assessments for this brand of pelleted feed have not been empirically tested; however, response rates during training were sufficiently high to suggest that the horses found this feed motivationally salient. In general, pelleted feeds have been shown empirically to be highly palatable [10,12].
All training sessions lasted 30 min and were carried out in an empty stable. Initially, the horses were shaped to press the muzzle-plates by a process of successive approximations. Following shaping, the two white lights were illuminated continuously, and responses were reinforced according to a continuous reinforcement (CRF) schedule and subsequently on a progressive ratio (PR) schedule (each ratio on a separate day). This was set at concurrent (conc) fixed-ratio (FR) schedules of conc FR-1 FR-1, conc FR-2 FR-2, conc FR-4 FR-4 and conc FR-8 FR-8 for each horse. The final stage was designed to ensure that both of the alternatives were sampled. Trials commenced with the red and the green lights illuminated concurrently. On each of the alternatives, 40 reinforcers were available according to a CRF schedule. Once the reinforcers from one of the alternatives were expended, the corresponding light became darkened and inoperative. The other alternative remained active until all of the 40 reinforcers had been depleted. Each horse was trained on this for eight sessions (i.e., 40 reinforcers per alternative, per session).
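The training contingencies described above can be sketched in a few lines of Python. This is only an illustrative model, not the C++ Builder programme used in the study: the class `RatioKey`, the function `sampling_session` and the key names `"red"` and `"green"` are hypothetical, and timing, lights and feed delivery are ignored. It shows an FR-n key (CRF being FR-1) and the final sampling stage, in which each key offers 40 reinforcers and becomes inoperative once they are used up.

```python
class RatioKey:
    """One muzzle-plate on a fixed-ratio (FR) schedule; FR-1 is equivalent to CRF."""

    def __init__(self, ratio, reinforcers=None):
        self.ratio = ratio              # presses required per reinforcer
        self.reinforcers = reinforcers  # None means no cap on deliveries
        self._presses = 0               # presses since the last reinforcer

    @property
    def active(self):
        # A capped key goes dark once its reinforcers are expended.
        return self.reinforcers is None or self.reinforcers > 0

    def press(self):
        """Register one press; return True if this press earns a reinforcer."""
        if not self.active:
            return False
        self._presses += 1
        if self._presses < self.ratio:
            return False
        self._presses = 0
        if self.reinforcers is not None:
            self.reinforcers -= 1
        return True


def sampling_session(presses, per_key=40):
    """Replay a sequence of key names through the final training stage.

    Both keys run on CRF with `per_key` reinforcers each; the session
    ends when both keys are depleted. Returns reinforcers delivered per key.
    """
    keys = {"red": RatioKey(1, per_key), "green": RatioKey(1, per_key)}
    delivered = {"red": 0, "green": 0}
    for name in presses:
        if keys[name].press():
            delivered[name] += 1
        if not any(key.active for key in keys.values()):
            break
    return delivered
```

For example, `RatioKey(4)` pays out on every fourth press, and a horse that pressed only the red key in `sampling_session` would stop earning feed there after 40 reinforcers, so the remaining reinforcers could only be obtained from the green key.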
Typically in concurrent chain schedule preparations, the subject is presented with two illuminated response keys in each trial (see Fig. 1 for a schematic). During the initial-link (choice phase), responses are conditionally reinforced on a predetermined variable interval (VI) schedule, by entry to the terminal-link phase. The terminal-links are mutually exclusive, in that responses on each are reinforced differentially according to exclusive fixed-interval (FI) FI schedules. Terminal-link entry is determined by the final response in the initial-link phase. For example, if the final response in the initial-links was to the left key, the left key would turn red, and the right key would be darkened. Following reinforcement, the initial-link is reinstated and the procedure is repeated. The relative value of the terminal-links is therefore operationalised as the relative rate of responses during the initial-links on each alternative.
Fig. 1. Schematic illustration of a concurrent chain trial. Initially, two white lights (W) are illuminated. During this time, subjects can respond to either key. This phase is organised according to concurrent VI VI schedules. Responding on each of the alternatives is reinforced with entry into one of two mutually exclusive terminal-links, dependent on the final response in the initial-links phase (i.e., response on the left initial-link light leads to entry into the left terminal-link [R], response on the right initial-link leads to entry into the right terminal-link [G]). In the terminal-link phase, one key is illuminated red or green, and the other becomes dark and inoperative. The terminal-link alternatives are reinforced according to differential FI FI schedules. Following reinforcement, the initial-links are reinstated [W = white light; G = green light; R = red light].
In the present study, subjects were exposed to equal initial-link VI VI schedules, and unequal terminal-link FI FI schedules, to assess sensitivity to the more immediate terminal-link reinforcer. The terminal-link schedules were counterbalanced between subjects and set at FI-20 s and FI-10 s (i.e., 2:1 ratio). FI schedules were used to ensure the terminal-link schedules were clearly differential over relatively few exposures. Previous research has shown that subjects show more extreme preference for different FI as compared to VI terminal-link schedules [6]. In total, each subject took part in three 30-min choice sessions.
All data were analysed in SPSS® 14. Only the responses from the last third of each session were analysed, to ensure that the subject was responding at a ‘steady state’. This is typical for instrumental choice procedures (e.g., see Refs. [5,15]).
Behavioural observations outside of the trial periods confirmed that the crib-biting horses all performed crib-biting regularly. In addition, it was confirmed that none of the control group performed crib-biting. No members of the control group or crib-biting group were observed to perform any other type of stereotypy during the pre-trial observations.
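As a rough illustration of how the dependent measure described above could be derived from a response log, the following Python sketch computes the proportion of initial-link responses made on the key leading to the shorter terminal-link, using only responses from the last third of a 30-min session. The `Response` record, its field names and the function names are assumptions made for this example; the format of the study's actual log is not described.

```python
from dataclasses import dataclass

SESSION_S = 30 * 60                        # 30-min choice session, in seconds
STEADY_STATE_START_S = SESSION_S * 2 / 3   # start of the last third of the session


@dataclass(frozen=True)
class Response:
    t: float     # seconds from session start (hypothetical log field)
    key: str     # "left" or "right"
    phase: str   # "initial" or "terminal"


def terminal_link(final_initial_key, fi_by_key):
    """Return the (key, FI in seconds) entered after the final initial-link response."""
    return final_initial_key, fi_by_key[final_initial_key]


def choice_proportion(log, short_key):
    """Proportion of steady-state initial-link responses on the key
    that leads to the shorter (more immediate) terminal-link."""
    counted = [r for r in log
               if r.phase == "initial" and r.t >= STEADY_STATE_START_S]
    if not counted:
        return None  # no usable responses in the last third
    return sum(r.key == short_key for r in counted) / len(counted)
```

A proportion near 0.5 would indicate indifference between the alternatives, and values above 0.5 indicate preference for the FI-10 s alternative. Because the FI assignment was counterbalanced between subjects, `short_key` would differ from horse to horse.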
Shaping took marginally longer for crib-biters (M = 45 min, S.D. = 17.3) than for controls (M = 37.5 min, S.D. = 15), but the difference was not significant, t(6) < 1. Fig. 2 displays the mean proportion (±S.E.) of on-target responses in the crib-biters and non-crib-biters during the final third of the three 0.5-h experimental sessions. As is clear, the non-crib-biters allocated relatively more responses to the shorter immediacy ratio across the three sessions, whereas the crib-biters did not.
Fig. 2. Mean (±S.E.) proportion of responses to the shorter (FI 10-s) initial-link by crib-biters and non-crib-biters in a concurrent chain paradigm. *P < 0.05.
Because of the low sample size and nature of the distribution, the data were not considered to be suitable for parametric analyses. Therefore, non-parametric tests were used, with group and trial block as between- and within-subjects independent measures, respectively, and relative response ratios in the initial-links phase as the dependent measure. Owing to the low sample size, significance values were calculated using the ‘Exact Test’: this is more robust for small sample sizes compared to the default Asymptotic Method, and it yields a lower probability of making a Type-II error [21]. Friedman tests for multiple related samples were carried out for each of the groups. For the non-crib-biters, responses on the initial-link related to the shorter terminal-link immediacy ratio increased across sessions, χ² (N = 4) = 6.5, Exact P < 0.05. However, for the crib-biters, there was no significant change in response allocation across trials, χ² (N = 4) = 0.5, Exact P > 0.05. Wilcoxon rank-sum tests were carried out between groups in all sessions. There was no significant group difference for sessions 1 or 2 (Exact Ps > 0.05); however, non-crib-biters responded more to the initial-link associated with the shorter terminal-link immediacy ratio in Session 3, W_S (4, 4) = 10, P < 0.05.
The aim of the present study was to compare the performance of crib-biters and non-crib-biters under concurrent chain schedules. Given previously identified neurophysiological differences [16], it was anticipated that crib-biters would respond differently within this learning paradigm. Specifically, it was predicted that crib-biters with a reduction in DA receptors in the DMS would be unable to maintain R–O learning in a continuously applied learning paradigm.
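With samples this small, the exact probabilities behind the Friedman and rank-sum statistics reported in the results can be checked by brute-force enumeration of the null distribution. The Python sketch below is not the SPSS procedure used in the study; it assumes no tied ranks, three sessions per horse, and that the reported rank-sum statistic is the smaller of the two rank sums. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, permutations, product


def friedman_exact_p(stat, n_subjects, k_conditions):
    """Exact P(chi-square >= stat) under the null, assuming no ties.

    Enumerates every way each subject could rank the k conditions.
    """
    n, k = n_subjects, k_conditions
    rankings = list(permutations(range(1, k + 1)))
    hits = total = 0
    for rows in product(rankings, repeat=n):
        rank_sums = [sum(col) for col in zip(*rows)]
        chi2 = Fraction(12 * sum(s * s for s in rank_sums),
                        n * k * (k + 1)) - 3 * n * (k + 1)
        total += 1
        hits += chi2 >= stat
    return Fraction(hits, total)


def rank_sum_exact_p(w, n1, n2):
    """Exact one-tailed P(W <= w) for the rank sum of the group of size n1,
    assuming no ties; doubling it gives a two-tailed value when w is the
    smaller rank sum."""
    sums = [sum(c) for c in combinations(range(1, n1 + n2 + 1), n1)]
    return Fraction(sum(s <= w for s in sums), len(sums))


if __name__ == "__main__":
    print(float(friedman_exact_p(6.5, 4, 3)))   # non-crib-biters
    print(float(friedman_exact_p(0.5, 4, 3)))   # crib-biters
    print(float(2 * rank_sum_exact_p(10, 4, 4)))  # Session 3, two-tailed
```

Under these assumptions, a Friedman statistic of 6.5 with N = 4 has an exact probability of 54/1296 (about 0.042), and a rank sum of 10 with four horses per group is the most extreme possible outcome, with a one-tailed probability of 1/70 (two-tailed about 0.029). Both are consistent with the reported P < 0.05.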
Results demonstrated that this was indeed the case, with the four crib-biters in the present study apparently unable to differentiate between FI-10 s and FI-20 s terminal-link schedules compared to control animals. This is the first study to examine the behavioural correlates of the equine stereotypy phenotype using a free-operant choice paradigm and extends previous work examining the stereotypy phenotype under simple extinction paradigms [7,8,11]. The nature of responding within the context of concurrent chain schedules is such that it requires the continuous use of R–O learning in order to learn the contingencies successfully. It seems that animals with endogenous dysregulation of the ventral striatum and DMS may find this problematic.
Concurrent chain schedule performance has also recently been examined in amphetamine-sensitised pigeons [24]. Specifically, pigeons administered d-amphetamine were found to have a dose-dependent, reduced sensitivity to delay of reinforcement, as
illustrated by indifference in choices between delayed and more immediate terminal-link schedules. It may be that in the present sample, putative neurophysiological alterations in the ventral striatum of crib-biting horses lead to a decrease in delay sensitivity within the context of the three sessions.
Research utilising post-training devaluation in rodents has shown that DA infusion into the striatum results in an accelerated dorsal shift from R–O (ventral striatum/DMS) to S–R (DLS) learning [1]. Amphetamine-sensitised rats have also been found to persist in responding for access to a devalued substrate after relatively few training sessions, an effect typically associated with overtraining [1,18]. This suggested that the sensitised (amphetamine) group shifted faster from an R–O to an S–R strategy [18]. It may be that, given the present results and in relation to the aforementioned studies, facilitated synaptic transmission within the ventral striatum as well as decreased transmission in the DMS (as proposed in the equine stereotypy phenotype [16]) mediate an accelerated shift from R–O to S–R-based processes in crib-biting horses, compared with controls, within the context of the three sessions. Furthermore, enhancement of habit formation, owing to chronic or prolonged DA agonist exposure, causes a decrease in dendritic spines of DA receptor cells in the DMS, and a significant increase in the DLS [13]. The former but not the latter (as inferred from D1-like and D2-like receptor Bmax values) has been observed in the Equine Oral Stereotypy Phenotype [16] and may reflect differences in environmentally, as opposed to psychostimulant-induced, neurophysiological changes. Stereotypic behaviour is characterised by its habitual nature [14], and enhancement of habit formation is a plausible explanation for our results. Future research may examine this more specifically, perhaps in the context of devaluation or contingency manipulation (see Refs. [18,27]).
Finally, from a functional neuroanatomical perspective, this study illustrates the potential benefit of assessing learning within the context of the Stereotypy Phenotype. Specifically, this endogenously produced, neurophysiologically aberrant phenotype (as described) may be a very useful neuroanatomical tool for complementing existing (PET, lesion, antagonist, and agonist) research as to the role of different basal ganglia regions during the learning processes, and it may help clarify some of the functional heterogeneity issues of the striatum currently being discussed [25].
References
[1] Balleine BW, Dickinson A. Goal-directed instrumental action: contingency and incentive learning and their cortical substrates. Neuropharmacology 1998;37:407–19.
[2] Cabib S, Bonaventura N. Parallel strain-dependent susceptibility to environmentally-induced stereotypies and stress-induced behavioral sensitization in mice. Physiol Behav 1997;61:499–506.
[3] Cabib S, Giardino L, Calza L, Zanni M, Mele A, Puglisi-Allegra S. Stress promotes major changes in dopamine receptor densities within the mesoaccumbens and nigrostriatal systems. Neuroscience 1998;84:193–200.
[4] Cardinal RN, Pennicott DR, Sugathapala CL, Robbins TW, Everitt BJ. Impulsive choice induced in rats by lesions of the nucleus accumbens core. Science 2001;292:2499–501.
[5] Cooper SJ, Dourish CT. Neurobiology of stereotyped behaviour. Oxford: Oxford Science Publications; 1990, 297 pp.
[6] Fantino E. Choice and rate of reinforcement. J Exp Anal Behav 1969;12:723–30.
[7] Garner JP, Mason GJ. Evidence for a relationship between cage stereotypies and behavioural disinhibition in laboratory rodents. Behav Brain Res 2002;136:83–92.
[8] Garner JP, Meehan CL, Mench JA. Stereotypies in caged parrots, schizophrenia and autism: evidence for a common mechanism. Behav Brain Res 2003;145:125–34.
[9] Gerdeman GL, Partridge JG, Lupica CR, Lovinger DM. It could be habit forming: drugs of abuse and striatal synaptic plasticity. Trends Neurosci 2003;26:184–92.
[10] Goodwin D, Davidson HP, Harris P. Foraging enrichment for stabled horses: effects on behaviour and selection. Equine Vet J 2002;34:686–91.
[11] Hemmings A, McBride SD, Hale CE. Perseverative responding and the aetiology of equine oral stereotypy. Appl Anim Behav Sci 2007;104:143–50.
[12] Hill J. Impacts of nutritional technology on feeds offered to horses: a review of effects of processing on voluntary intake, digesta characteristics and feed utilization. Anim Feed Sci Technol 2007;138:92–117.
[13] Jedynak JP, Uslaner JM, Esteban JA, Robinson TE. Methamphetamine-induced structural plasticity in the dorsal striatum. Eur J Neurosci 2007;25:847–53.
[14] Mason GJ, Rushen J. Stereotypic animal behaviour: fundamentals and applications to welfare. Oxford: CABI; 2006, 384 pp.
[15] Mazur JE. Hyperbolic value addition and general models of animal choice. Psychol Rev 2001;108:96–112.
[16] McBride SD, Hemmings A. Altered mesoaccumbens and nigro-striatal dopamine physiology is associated with stereotypy development in a nonrodent species. Behav Brain Res 2005;159:113–8.
[17] McGreevy P. Equine behavior. J Equine Vet Sci 2004;24:397–8.
[18] Nelson A, Killcross S. Amphetamine exposure enhances habit formation. J Neurosci 2006;26:3805–12.
[19] Partridge JG, Tang KC, Lovinger DM. Regional and postnatal heterogeneity of activity-dependent long-term changes in synaptic efficacy in the dorsal striatum. J Neurophysiol 2000;84:1422–9.
[20] Saka E, Goodrich C, Harlan P, Madras BK, Graybiel AM. Repetitive behaviors in monkeys are linked to specific striatal activation patterns. J Neurosci 2004;24:7557–65.
[21] Siegel S, Castellan NJ. Nonparametric statistics for the behavioral sciences. Singapore: McGraw-Hill, Inc.; 1988, 399 pp.
[22] Thorne JB, Goodwin D, Kennedy MJ, Davidson HPB, Harris P. Foraging enrichment for stabled horses: practicality and effects on behaviour. Appl Anim Behav Sci 2005;94:149–64.
[23] Waters AJ, Nicol CJ, French NP. Factors influencing the development of stereotypic and redirected behaviours in young horses: findings of a four year prospective epidemiological study. Equine Vet J 2002;34:572–9.
[24] Wei-Min T, Pitts RC, Hughes CE, McLean AP, Grace RC. Rapid acquisition of preference in concurrent chains: effects of d-amphetamine on sensitivity to reinforcement delay. J Exp Anal Behav 2008;89:71–91.
[25] Wickens JR, Budd CS, Hyland BI, Arbuthnott GW. Striatal contributions to reward and decision making: making sense of regional variations in a reiterated processing matrix. Ann N Y Acad Sci 2007;1104:192–212.
[26] Wiepkema PR. On the significance of ethological criteria for the assessment of animal welfare. In: Schmidt D, editor. Indicators relevant to farm animal welfare. The Hague, The Netherlands: Martinus Nijhoff; 1983. p. 71–9.
[27] Yin HH, Knowlton BJ. The role of the basal ganglia in habit formation. Nat Rev Neurosci 2006;7:464–76.
[28] Yin HH, Knowlton BJ, Balleine BW. Blockade of NMDA receptors in the dorsomedial striatum prevents action-outcome learning in instrumental conditioning. Eur J Neurosci 2005;22:505–12.