Behavioural Processes 64 (2003) 239–250
Role of the orbital prefrontal cortex in choice between delayed and uncertain reinforcers: a quantitative analysis
S. Kheramin a, S. Body a, M.-Y. Ho a,1, D.N. Velázquez-Martinez a,2, C.M. Bradshaw a,∗, E. Szabadi a, J.F.W. Deakin b, I.M. Anderson b
a Psychopharmacology Section, Division of Psychiatry, University of Nottingham, Room B109, Medical School, Queen’s Medical Centre, Nottingham NG7 2UH, UK
b Neuroscience & Psychiatry Unit, School of Psychiatry & Behavioural Sciences, University of Manchester, Manchester, UK
Received 23 April 2003; received in revised form 9 June 2003; accepted 11 June 2003
Abstract
‘Inter-temporal choice’ refers to choice between two or more outcomes that differ with respect to their sizes, delays, and/or probabilities of occurrence. According to the multiplicative hyperbolic model of inter-temporal choice, the value of a reinforcer increases as a hyperbolic function of its size, and decreases as a hyperbolic function of its delay and the odds against its occurrence. These functions, each of which contains a single discounting parameter, are assumed to combine multiplicatively to determine the overall value of the reinforcer. The model gives rise to a quantitative methodology for analysing inter-temporal choice, based on a family of linear null equations which describe performance under conditions of indifference, when the values of the reinforcers are assumed to be equal. This approach was used to examine the effect of lesions of the orbital prefrontal cortex (OPFC) on inter-temporal choice in rats. Under halothane anaesthesia, rats received injections of the excitotoxin quinolinate into the OPFC or sham lesions. They were trained to press two levers (A and B) for food-pellet reinforcers in discrete-trials schedules. In free-choice trials, a press on A resulted in delivery of a pellet after a delay dA with a probability P = 0.5; a press on B resulted in delivery of a pellet with a probability P = 1 after a delay dB. dB was increased progressively across successive blocks of six trials in each session, while dA was manipulated systematically across phases of the experiment. The indifference delay, dB(50) (the value of dB corresponding to 50% choice of B), was estimated for each rat in each phase. Linear functions of dB(50) versus dA were derived, and the parameters of the functions were compared between the groups. In both groups, dB(50) increased linearly with dA.
The slope of the linear function was significantly steeper in the lesioned group than in the sham-lesioned group, whereas the intercept did not differ significantly between the groups. Analysis based on the relevant null equation indicated that the lesion of the OPFC increased the rate of both delay and odds discounting. Possible implications of the results for interpreting the effects of OPFC lesions on inter-temporal choice behaviour in man are discussed. © 2003 Elsevier B.V. All rights reserved. Keywords: Inter-temporal choice; Delay discounting; Odds discounting; Orbital prefrontal cortex
∗ Corresponding author. Tel.: +44-115-970-9336; fax: +44-115-919-4473.
E-mail addresses: [email protected] (M.-Y. Ho), [email protected] (D.N. Velázquez-Martinez), [email protected] (C.M. Bradshaw).
1 Present address: Institute of Clinical Behavioral Sciences, Chang Gung University, Kwei Shan, Taiwan 333, ROC.
2 Present address: Depto. Psicofisiologia, Facultad de Psicologia, Universidad Nacional Autónoma de México, México D.F. 04510, México.
0376-6357/$ – see front matter © 2003 Elsevier B.V. All rights reserved. doi:10.1016/S0376-6357(03)00142-6
S. Kheramin et al. / Behavioural Processes 64 (2003) 239–250
1. Introduction
Animals and humans frequently make choices between reinforcers that differ in terms of their sizes, delays and/or probabilities of occurrence (inter-temporal choice). Several quantitative models have been developed to account for inter-temporal choice behaviour (Ainslie, 1975; Herrnstein, 1981; Mazur, 1987, 1997; Rachlin, 1974, 1995; Ho et al., 1997, 1999; Monterosso and Ainslie, 1999). A common feature of most of these models is that the value of a reinforcer is deemed to be a positive function of its size, and an inverse function (generally an inverse hyperbolic function) of its delay and uncertainty. It is assumed that, in an inter-temporal choice situation, the organism selects the reinforcer that has the higher overall value at the moment of choice. The multiplicative hyperbolic model proposed by Ho et al. (1997, 1999) postulates that the value of a reinforcer (V) is determined by the multiplicative combination of hyperbolic discounting functions that modulate the effects of salient features of the reinforcer, including its size (q), delay (d) and uncertainty (θ, the odds against reinforcement, where θ = [1/P] − 1):

$$V = \frac{1}{1 + Q/q} \times \frac{1}{1 + Kd} \times \frac{1}{1 + H\theta} \tag{1}$$
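Eq. (1) can be evaluated numerically as a quick check on how the three terms combine. The following is a minimal sketch only; the function name and the parameter values are hypothetical and are not taken from the experiment.

```python
def reinforcer_value(q, d, p, Q, K, H):
    """Eq. (1): V = 1/(1 + Q/q) * 1/(1 + K*d) * 1/(1 + H*theta)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    theta = 1.0 / p - 1.0                 # odds against reinforcement
    size_term = 1.0 / (1.0 + Q / q)       # rises hyperbolically with size
    delay_term = 1.0 / (1.0 + K * d)      # falls hyperbolically with delay
    odds_term = 1.0 / (1.0 + H * theta)   # falls hyperbolically with odds against
    return size_term * delay_term * odds_term


# Hypothetical discounting parameters, chosen for illustration only.
Q, K, H = 1.0, 0.2, 1.0

# Same size and delay; only the probability of delivery differs.
v_certain = reinforcer_value(q=1.0, d=5.0, p=1.0, Q=Q, K=K, H=H)
v_uncertain = reinforcer_value(q=1.0, d=5.0, p=0.5, Q=Q, K=K, H=H)
print(v_certain, v_uncertain)
```

With P = 0.5 the odds against reinforcement are θ = 1, so under these assumed parameters the uncertain reinforcer is worth half as much as the certain one.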
The discounting parameters Q, K and H define the organism’s sensitivity to reinforcer size, delay and uncertainty, respectively; they are assumed to differ between individuals, and to be sensitive to neurobiological interventions (see Ho et al., 1999). Experimental investigations of inter-temporal choice typically entail confronting an organism with repeated opportunities to choose between two reinforcers, A and B, which differ with respect to some salient feature (for example, size: qA ≠ qB). All salient features of reinforcer A (qA, dA and θA) are held constant throughout the experiment, and one feature of reinforcer B (for example, its delay, dB) is systematically varied. The resulting changes in preference, which may be expressed as proportional choice of B (% B), are assumed to reflect changes in the value of that reinforcer. Preference in inter-temporal choice situations has proved to be sensitive to a variety of neurobiological interventions. Destruction of the ascending 5-HTergic
pathways has been found to promote preference for smaller, immediate reinforcers over larger delayed reinforcers (Wogar et al., 1993; Mobini et al., 2000a,b). Similar effects have been observed following lesions of the core region of the nucleus accumbens (Cardinal et al., 2001), and the orbital region of the prefrontal cortex (Mobini et al., 2002). Inter-temporal choice behaviour is also sensitive to systemic treatment with psychoactive drugs belonging to various pharmacological classes, including ethanol (Tomie et al., 1998; Evenden and Ryan, 1999), 5-hydroxytryptamine (5-HT)1A receptor agonists and 5-HT2 receptor antagonists (Evenden and Ryan, 1999), 5-HT reuptake inhibitors (Wolff and Leander, 2002), psychostimulants (Logue et al., 1992; Richards et al., 1999; Cardinal et al., 2000), and antipsychotic drugs (Cardinal et al., 2000; Wade et al., 2000). Unfortunately, such experimentally induced changes in preference are often difficult to interpret. The problem is highlighted by Eq. (1), which postulates that the value of a reinforcer is jointly determined by several features whose influences are modulated by separate discounting parameters. Thus, a change in preference induced by a drug or a lesion generally cannot be ascribed with confidence to the alteration of one particular discounting parameter (see Ho et al., 1997, 1999). Ho et al. (1999) advocated the use of null equations based on the indifference relation VA = VB to overcome this problem. For example, applying Eq. (1) to the case of choice between two certain reinforcers (θ = 0), the indifference relation may be expanded thus,

$$\frac{q_A}{q_A + Q} \times \frac{1}{1 + K d_A} = \frac{q_B}{q_B + Q} \times \frac{1}{1 + K d_B} \tag{2}$$

Solving for dB,

$$d_B = \frac{1}{K}\left[\frac{1/(1 + Q/q_B) - 1/(1 + Q/q_A)}{1/(1 + Q/q_A)}\right] + d_A\,\frac{1 + Q/q_A}{1 + Q/q_B} \tag{3}$$
which specifies a linear relation between the indifference delay to reinforcer B (dB(50) ) and the delay imposed on reinforcer A, dA (Ho et al., 1997, 1999). It can be seen that Q features in both the slope and the intercept of this relation, whereas K features only in the intercept. A change in the slope of this function is
thus diagnostic of an alteration of Q, whereas a change in the intercept unaccompanied by a change in slope is diagnostic of an alteration of K. Mobini et al. (2000a) used Eq. (3) to reveal a selective effect of central 5-HT depletion on the rate of delay discounting (K). More recently, Kheramin et al. (2002) used the same equation to analyse the effect of an excitotoxic lesion of the orbital prefrontal cortex (OPFC) on inter-temporal choice (see below). Kheramin et al. (2002) exposed OPFC-lesioned and sham-lesioned rats to repeated choices between small and large reinforcers, A and B. A progressive delay procedure (Evenden and Ryan, 1996) was used to measure indifference delays to B (dB(50)) for a range of delays to A (dA). The slope of the fitted linear function was steeper in the lesioned rats than in the sham-lesioned rats, but the intercept did not differ between the two groups, indicating that the lesion altered both the sensitivity to reinforcer size (Q was increased) and rate of delay discounting (K was increased). The experiment reported here extended Kheramin et al.’s (2002) findings to choice between certain and uncertain reinforcers of equal size. Indifference delays to the certain reinforcer were measured for a range of delays to the uncertain reinforcer. The relevant null equation is

$$d_B = d_A (1 + H\theta_A) + \frac{H\theta_A}{K} \tag{4}$$
(Ho et al., 1999). Changes in sensitivity to reinforcer uncertainty (H) should influence both the slope and the intercept of the linear function relating dB(50) to dA , whereas changes in K should only affect the intercept.
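The roles of H and K in Eq. (4) can be checked numerically. The sketch below assumes hypothetical parameter values and helper names; it computes the slope and intercept of the indifference line and confirms, via Eq. (1) with the size term cancelled for equal-sized reinforcers, that the two reinforcers have equal value at the predicted indifference delay.

```python
def indifference_line(H, K, theta_a):
    """Slope and intercept of Eq. (4): dB = dA*(1 + H*theta_A) + H*theta_A/K."""
    slope = 1.0 + H * theta_a
    intercept = H * theta_a / K
    return slope, intercept


def value(d, theta, K, H):
    """Eq. (1) without the size term, which cancels when qA = qB."""
    return 1.0 / ((1.0 + K * d) * (1.0 + H * theta))


# Hypothetical discounting parameters; theta_A = 1 corresponds to P = 0.5.
H, K, theta_a = 1.0, 0.25, 1.0
slope, intercept = indifference_line(H, K, theta_a)

# At the predicted indifference delay the two values should coincide.
d_a = 8.0
d_b = slope * d_a + intercept
v_a = value(d_a, theta_a, K, H)   # uncertain reinforcer A
v_b = value(d_b, 0.0, K, H)       # certain reinforcer B (theta = 0)

# Doubling K alone changes only the intercept; doubling H changes both.
slope_k2, intercept_k2 = indifference_line(H, 2 * K, theta_a)
slope_h2, intercept_h2 = indifference_line(2 * H, K, theta_a)
print(slope, intercept, v_a, v_b)
```

Because the intercept is the ratio Hθ_A/K, proportional increases in H and K leave it unchanged while the slope still rises.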
2. Materials and methods
2.1. Subjects
Twenty-eight experimentally naive female Wistar rats aged approximately 4 months and weighing 250–290 g at the start of the experiment were used. They were housed individually under a constant cycle of 12 h light and 12 h darkness (lights on 06:00–18:00 h), and were maintained at 80% of their initial free-feeding body weights by providing a limited amount of standard rodent diet after each experimental session. Tap water was freely available in the home cage.
2.2. Apparatus
The rats were trained in operant conditioning chambers (Campden Instruments, Loughborough, UK) of internal dimensions 25 cm × 25 cm × 22 cm. One wall of the chamber contained a recess fitted with a hinged Perspex flap, into which a motor-operated pellet dispenser could deliver 45-mg food pellets. Apertures were situated 2.5 cm above the floor and 2.5 cm on either side of the recess; a motor-driven retractable lever could be inserted into the chamber through each aperture. Each lever could be depressed by a force of approximately 0.2 N. A 2.8-W lamp was mounted on the front panel of the chamber, 2.5 cm above each lever; a third lamp was mounted 10 cm above the central recess. The chamber was housed in a sound-attenuating chest; masking noise was provided by a rotary fan. A microcomputer (CeNeS Ltd., Cambridge, UK) programmed in ARACHNID BASIC, and located in an adjoining room, controlled the schedules and recorded the behavioural data.
2.3. Surgery
The rats received either lesions of the OPFC (n = 15) or sham lesions (n = 13). Anaesthesia was induced with halothane (4% in oxygen), and the rat positioned in a stereotaxic apparatus (David Kopf), with the upper incisor bar set 3.3 mm below the inter-aural line. Anaesthesia was maintained with 2% halothane in oxygen during surgery. A small hole was drilled in the skull over each hemisphere for microinjection of the neurotoxin into the OPFC (two injections in each hemisphere). The following coordinates (mm, measured from Bregma) were used to locate the OPFC: site (i) AP +3.7, L ±1.2, DV −4.8; site (ii) AP +3.7, L ±2.8, DV −4.4. Injections were given via a 0.3-mm diameter cannula connected by a polythene tube to a 10-µl Hamilton syringe. In the case of the lesioned group, the cannula tip was slowly lowered to the position of each site and 0.5 µl of a 0.1-M solution of quinolinic acid (2,3-pyridinedicarboxylic acid) in phosphate-buffered 0.9% NaCl (pH 7.0) was injected at a rate of 0.1 µl per 15 s.
The cannula was left in position for 3 min after completion of the injection at each site. In the case of the sham-lesioned group, the procedure was identical, except that the vehicle alone was injected.
2.4. Behavioural training Two weeks after surgery, the food deprivation regimen was started, and the rats were gradually reduced to 80% of their free-feeding body weights. They were then trained to press the lever for food-pellet reinforcers, and were exposed to a discrete-trials continuous reinforcement schedule, in which the two levers were presented in random sequence, for three sessions. Then they underwent daily training sessions under a discrete-trials schedule, as described below. With the exception of Phases 4 and 5 (see below), each session consisted of six blocks of six trials; the trials were initiated at 97.5-s intervals, irrespective of the rat’s behaviour. Each block consisted of four forced-choice trials in which a single lever was presented (lever A in two trials, and lever B in two trials, in random sequence), and two free-choice trials in which both levers were presented. The onset of each trial was signalled by illumination of the central light above the tray; 2.5 s later the lever(s) were inserted into the chamber. If a lever-press occurred, the lever was withdrawn (in free-choice trials, both levers were withdrawn), the central light was extinguished, and the light situated above the lever that had been pressed was illuminated. This light remained illuminated for the duration of the scheduled delay period (see below), and was then extinguished. If no lever-press occurred within 5 s of the lever(s) being inserted, the central light was extinguished, and the lever(s) were withdrawn (this seldom happened except during the first few training sessions). A response on lever A initiated a delay dA , at the end of which a single food pellet was delivered with a probability P = 0.5 (θ = 1). A response on lever B initiated a delay dB , at the end of which a single food pellet was invariably delivered (P = 1; θ = 0). The positions of levers A
and B (left versus right) were counter-balanced across subjects. The delays dA and dB were manipulated as follows. dA was held constant throughout each phase of the experiment, and was varied between phases (see below). dB was varied systematically across the six blocks of each session. In the first block dB was set equal to dA; in subsequent blocks, dB was increased in increments of 75%. The values of dA used in the six phases were 1, 2, 4, 8, 12 and 0.5 s. In the case of Phases 4 and 5, when dA was 8 and 12 s, respectively, the application of six increments of 75% would have resulted in dB being longer than the trial duration in the sixth block; therefore there were only five blocks of trials in these phases and the trial length was increased to 120 s. The first phase continued for 80 sessions, and the remaining phases for 40 sessions. Table 1 summarizes the delay conditions in each phase. Experimental sessions took place 7 days a week, at the same time each day, during the light phase of the daily cycle (between 07:00 and 14:00 h).
2.5. Histology
At the end of the behavioural experiment, the rats were killed by CO2 and their brains were removed and fixed in 10% formol saline for 1 week. The brains were sectioned using a freezing microtome. Coronal sections (60 µm) taken through the prefrontal region were mounted on gelatine-coated slides. The selected sections were dried in formaldehyde vapour, stained in 0.25% cresyl violet for 10 min at room temperature, dehydrated by successive immersion in 95% ethanol, 100% ethanol and xylene, and mounted with DPX. An investigator who was blind to the behavioural results performed the microscopic examination. Drawings of the extent of the lesions were superimposed on the
Table 1
Summary of the experimental conditions

Phase  dA (s)  dB (s)                                                Number of
               Block 1  Block 2  Block 3  Block 4  Block 5  Block 6  sessions
1      1       1.00     1.75     3.06     5.36     9.38     16.41    80
2      2       2.00     3.50     6.13     10.72    18.76    32.83    40
3      4       4.00     7.00     12.25    21.44    37.52    65.65    40
4      8       8.00     14.00    24.50    42.88    75.03    –        40
5      12      12.00    21.00    36.75    64.31    112.55   –        40
6      0.5     0.50     0.88     1.53     2.68     4.69     8.21     60
appropriate pages of the stereotaxic atlas of Paxinos and Watson (1998). 2.6. Analysis of data The percentage of free-choice trials in which lever B was pressed (% B) was derived for each block of trials from the pooled data from the last 20 sessions of
each phase. The indifference delay (dB(50) : the value of dB corresponding to % B = 50%) was calculated for each rat by linear interpolation between the two delays which fell on either side of % B = 50%. The indifference delays were subjected to two-factor analysis of variance (group × phase) with repeated measures on the second factor, followed by between-group comparisons in each phase of the experiment using the least
Fig. 1. Percent responding on lever B (% B) as a function of the delay to the certain reinforcer following a response on lever B (dB); points are group mean data. Each plot shows the data obtained from one phase of the experiment, in which the delay to the uncertain reinforcer (dA) was fixed at the value shown in the inset panel. The horizontal line signifies indifference (% B = 50).
significant difference test, corrected for multiple comparisons, with a significance criterion of P < 0.05. Plots of the indifference delay (dB(50) ) versus dA were derived for each rat, and linear functions were fitted by the method of least squares. The slope and intercept of the linear function were compared between the two groups using Student’s t-test. In order to compare the stability of choice behaviour between the two groups, values of dB(50) derived in successive 10-session blocks of each phase (excluding the first four blocks of Phase 1) were subjected to a three-factor analysis of variance (group × phase × session-block) with repeated measures on the second and third factors.
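The analysis steps described above (the within-session delay progression, linear interpolation of dB(50), and the least-squares fit of dB(50) against dA) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the % B values and indifference delays are invented for the example and are not data from the experiment.

```python
def block_delays(d_a, n_blocks=6, factor=1.75):
    """Delay to lever B in block n: dB = dA * factor**(n - 1), n = 1..n_blocks."""
    return [d_a * factor ** n for n in range(n_blocks)]


def indifference_delay(delays, pct_b):
    """Linear interpolation of the delay at which % B falls through 50%.

    Returns None if % B never crosses 50% within the tested delays.
    """
    for i in range(len(delays) - 1):
        p0, p1 = pct_b[i], pct_b[i + 1]
        if p0 >= 50.0 > p1:
            d0, d1 = delays[i], delays[i + 1]
            return d0 + (p0 - 50.0) * (d1 - d0) / (p0 - p1)
    return None


def fit_line(x, y):
    """Least-squares slope and intercept of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented % B values for one rat in the phase with dA = 4 s.
delays = block_delays(4.0)
pct_b = [90.0, 80.0, 65.0, 40.0, 20.0, 10.0]
d_b50 = indifference_delay(delays, pct_b)

# Invented indifference delays for one rat across the six values of dA.
d_a_values = [0.5, 1.0, 2.0, 4.0, 8.0, 12.0]
d_b50_values = [5.5, 6.5, 8.5, 12.5, 20.5, 28.5]
slope, intercept = fit_line(d_a_values, d_b50_values)
print(d_b50, slope, intercept)
```

The interpolation here is linear in dB itself, as the text describes; interpolating on a logarithmic delay axis would be a different choice and would give slightly different indifference delays.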
3. Results 3.1. Behavioural data Fig. 1 shows the group mean percentages of free-choice trials in which lever B was pressed (% B) in each block of trials in the last 20 sessions of each
phase of the experiment. In each phase, preference for lever B declined monotonically as a function of increasing delays to reinforcement following a response on that lever (dB). Visual inspection suggests that in the case of higher values of dA (Phases 3, 4 and 5), the points of intersection of the preference functions with the horizontal line corresponding to 50% choice of lever B were displaced to the right in the OPFC-lesioned group, compared to the sham-lesioned group. This trend was analysed quantitatively by comparing the intersection points (dB(50)) derived from the individual rats in the two groups (see below). Fig. 2 compares the time-course of changes in dB(50) in the two groups across the six phases of the experiment. For the purposes of this comparison, dB(50) values were derived from successive blocks of 10 sessions. Analysis of variance of these data revealed a significant main effect of phase (F(5, 130) = 53.8, P < 0.001) and a significant phase × group interaction (F(5, 130) = 2.3, P < 0.05); the main effect of session-block was not significant (F < 1), but there was a significant phase × session-block interaction
Fig. 2. Time-course of changes in the indifference point (dB(50) ) across successive phases of the experiment. Points indicate group mean values of dB(50) in successive blocks of 10 sessions (open circles: sham-lesioned group; filled circles: OPFC-lesioned group); vertical lines indicate transitions from one phase to the next, when the delay to the uncertain reinforcer (dA ) was changed.
[Fig. 3 plot. Fitted lines: OPFC lesion, intercept = 4.909, slope = 3.638, r² = 0.981; sham lesion, intercept = 4.523, slope = 2.053, r² = 0.960.]
Fig. 3. Linear indifference functions derived for the sham-lesioned group (open circles) and the OPFC-lesioned group (filled circles). Ordinate: indifference delay to the certain reinforcer (dB(50) ); abscissa: imposed delay to the uncertain reinforcer (dA ). Points are group mean data (±S.E.M.); lines are best-fit linear functions.
[Fig. 4 bar charts. Panels: SLOPE; INTERCEPT (s). Columns in each panel: SHAM LESION, OPFC LESION.]
Fig. 4. Parameters of linear indifference functions obtained from the individual rats in the sham-lesioned group (open columns) and OPFC-lesioned group (filled columns). Left-hand panel: slope; right-hand panel: intercept(s). Columns show group mean data; vertical bars indicate S.E.M. Significance of difference between groups: ∗ P < 0.05 (see text for details).
(F(15, 420) = 1.8, P < 0.05). There were no significant interactions between group and session-block (group × session-block; group × phase × session-block: F < 1). Fig. 3 shows the group mean values of dB(50) (±S.E.M.), derived from the last 20 sessions of each phase, plotted against dA. In both groups, dB(50) increased linearly with increasing values of dA (r² > 0.96 in each group). The slope of the linear function was steeper in the lesioned group than in the sham-lesioned group. Analysis of variance revealed a significant main effect of phase (F(5, 130) = 49.5, P < 0.001) and a significant phase × group interaction (F(5, 130) = 3.4, P < 0.01); the main effect of group was not significant (F(1, 28) = 3.0, 0.1 > P > 0.05). Post hoc comparisons (least significant difference test) revealed that the value of dB(50) was significantly greater in the lesioned group than in the sham-lesioned group when dA was 8 and 12 s (Phases 4 and 5).
Fig. 4 shows the mean (±S.E.M.) values of slope and intercept of the linear functions, derived for the individual rats in each group. The slope of the function was significantly greater in the lesioned group than in the sham-lesioned group (t(26) = 2.3; P < 0.05); the intercept did not differ significantly between the two groups (t(26) < 1).
3.2. Histology
Bilateral lesions were found to be accurately placed in all 15 rats that had received injections of quinolinic acid. These rats showed extensive neuronal loss and gliosis of the OPFC. In every rat in the lesioned group, the lesion included the ventral and lateral orbital regions (Paxinos and Watson, 1998); in some rats, the lesion extended to the midline, affecting the medial orbital and infralimbic cortical regions. In all cases, the dorsal and lateral cortical surfaces were spared.
Fig. 5. Diagram of the approximate area of destruction of the OPFC in the lesioned group. Drawings were made from the microscopic sections, and were superimposed on the relevant pages from Paxinos and Watson’s (1998) stereotaxic atlas. The black area represents the smallest, and the stippled area the largest extent of the lesion; the hatched area indicates a “representative” lesion.
The lesion extended rostrally to AP +4.2 in all animals. The caudal boundary was between AP +3.5 and AP +3.2 in most cases, and in no case did the lesion appear to invade the nucleus accumbens at AP +2.7 (Paxinos and Watson, 1998). The extent of the lesion is summarised in Fig. 5.
4. Discussion
Injection of the excitotoxin quinolinate produced a substantial lesion of the OPFC, similar in size and extent to that found in our previous experiments (Mobini et al., 2002; Kheramin et al., 2002). According to the regional classification recommended by Uylings and van Eden (1990) and Kesner (2000), the lesion embraced the ventral and lateral orbital prefrontal cortex, with some involvement of the medial prefrontal cortex (medial orbital and infralimbic regions). The discrete-trials schedule used in this experiment was adapted from the progressive delay schedule devised by Evenden and Ryan (1996). Our procedure differed from that described by Evenden and Ryan (1996) only in the increments in dB that were imposed in successive blocks of trials within a session. Unlike most previous experiments employing this schedule (Evenden and Ryan, 1996, 1999; Cardinal et al., 2000, 2001), we used the geometric progression, dB = dA × 1.75^(n−1), where n is the ordinal position of the block within the session. This offered the advantage of allowing the range of values of dB to be adapted to the value of dA, which, in our experiment, was manipulated systematically across phases of the experiment (see also Kheramin et al., 2002). In keeping with previous experience with this schedule (Evenden and Ryan, 1996, 1999; Cardinal et al., 2000, 2001; Kheramin et al., 2002), preference for the certain reinforcer (% B) declined progressively during the session, as the delay to that reinforcer, dB, was progressively increased. In both the lesioned and the sham-lesioned groups, the indifference delay to the certain reinforcer, dB(50), increased as a linear function of dA, in accordance with the predictions of Eq. (4). Previous applications of linear null equations to inter-temporal choice have generally addressed choice between reinforcers differing in size and delay (Mazur, 1987; Ho et al., 1997; Mobini et al., 2000a; Kheramin et al., 2002).
The present results extend their use to choice between reinforcers differing in delay and uncertainty. The notion of hyperbolic discounting of uncertainty (‘odds discounting’), assumed in Eq. (1), is supported by numerous experiments both with animals (Mazur, 1988, 1997; Mobini et al., 2000b, 2002) and with humans (Rachlin et al., 1991; Kirby and Maraković, 1996; Mitchell, 1999; Richards et al., 1999). The identity of form of the delay and odds discounting functions has led some authors to argue that delay and uncertainty may be functionally equivalent in determining inter-temporal choice (Rachlin et al., 1986; Rachlin and Raineri, 1992; Myerson and Green, 1995; Green and Myerson, 1996). In the present experiment, destruction of the OPFC led to similar changes of the delay and odds discounting parameters (see below), which might be perceived as evidence supporting a close relation between the two types of discounting. However, it should be noted that several lines of evidence support a functional distinction between delay and odds discounting. Firstly, delay and odds discounting are unequally affected by lesions of the 5-HTergic pathways in rats (Mobini et al., 2000b). Secondly, the rates of delay and odds discounting tend to be negatively correlated in human subjects (Richards et al., 1999). Thirdly, between-group differences in delay discounting in humans (smokers versus non-smokers) are not mirrored by similar differences in odds discounting (Mitchell, 1999). Finally, there is evidence that delay and odds discounting in humans may be unequally sensitive to monetary inflation (Ostaszewski et al., 1998). Comparison of the linear functions derived for the two groups revealed that the OPFC-lesioned group’s function had a significantly steeper slope than that of the sham-lesioned group; the intercepts did not differ significantly between the two groups. According to Eq.
(4), the steeper slope shown by the lesioned group implies a higher rate of odds discounting in that group compared to the sham-lesioned group. This is consistent with a previous finding by Mobini et al. (2002) that OPFC-lesioned rats tended to be more risk-averse than normal rats when making choices between small certain and large uncertain reinforcers. However, Mobini et al.’s (2002) experiment did not entail fitting a null equation to multiple indifference points, and the authors were therefore unable to establish whether the stronger risk aversion shown by the OPFC-lesioned
rats reflected a greater sensitivity to uncertainty (i.e. higher value of H) or whether the two groups differed only in their sensitivity to reinforcer size (i.e. a difference in Q). Kheramin et al. (2002) found that lesions of the OPFC resulted in increased sensitivity to the ratio of the sizes of two reinforcers, consistent with an increase in the value of Q. However, although this effect may have contributed to Mobini et al.’s (2002) findings, the present results are not subject to this explanation, firstly because the two reinforcers used in this experiment were of equal sizes, and secondly because, according to Eq. (4), an effect of the lesion on the slope of the linear function necessarily implies a change in the value of H. Inspection of Eq. (4) shows that the odds-discounting parameter H influences both the slope and the intercept of the linear indifference function: an increase in the value of H should not only steepen the slope of the function, but should, other things being equal, raise the intercept. The fact that this did not occur in the OPFC-lesioned rats in this experiment indicates that in addition to increasing the value of H, the lesion also increased the value of K, since H and K exert opposing effects on the intercept. This conclusion is in agreement with Kheramin et al.’s (2002) findings, which indicated that the OPFC lesion increased the value of K when rats chose between certain reinforcers that differed with respect to size and delay. Lesions of the prefrontal cortex have been found to impair instrumental learning in some situations (e.g. Schoenbaum et al., 1998; Morgan and LeDoux, 1999). This raises the possibility that the OPFC-lesioned group in the present experiment may have been slower than the sham-lesioned group to adapt to the change in contingencies at the start of each phase, and that this might have contributed to the different slopes of the indifference functions shown by the lesioned group. The data shown in Fig. 
2 suggest that this is unlikely to have been the case, since the changes in dB(50) across successive blocks of sessions within each phase did not differ significantly between the two groups. Another possibility is that the OPFC-lesioned rats may have been slower to perceive the change in dB across successive blocks of trials within each session. We think that this is also unlikely, since the between-group difference in dB(50) did not occur in every phase, but only in Phases 3, 4 and 5, when longer values of dA were employed.
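The inference drawn from Eq. (4) earlier in this section, that the lesion increased both H and K, can be illustrated by inverting the equation. With θA = 1 (P = 0.5), the slope is 1 + H and the intercept is H/K, so H = slope − 1 and K = H/intercept. The sketch below applies this to the group-mean fits printed in Fig. 3. It is an illustration under that assumption only: the statistical comparison in the paper was based on fits to individual rats, and the parameter estimates computed here were not reported by the authors.

```python
def recover_h_k(slope, intercept, theta_a=1.0):
    """Invert Eq. (4): slope = 1 + H*theta_A, intercept = H*theta_A / K."""
    h = (slope - 1.0) / theta_a
    k = h * theta_a / intercept
    return h, k


# Group-mean fitted parameters as printed in Fig. 3.
h_sham, k_sham = recover_h_k(slope=2.053, intercept=4.523)
h_opfc, k_opfc = recover_h_k(slope=3.638, intercept=4.909)

# Counterfactual: the lesioned group's H combined with the sham group's K.
# Eq. (4) then predicts a much larger intercept than the one observed.
intercept_if_k_unchanged = h_opfc / k_sham

print(h_sham, k_sham)
print(h_opfc, k_opfc)
print(intercept_if_k_unchanged)
```

On these group-mean figures H is roughly 1.05 in the sham-lesioned group and 2.64 in the lesioned group, and K roughly 0.23 and 0.54 per second, respectively. An increase in H alone would have raised the intercept to about 11 s, which is why the near-identical intercepts point to a concurrent increase in K.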
As in many previous investigations of inter-temporal choice (Mazur, 1987, 1997; Gibbon et al., 1988; Grace, 1994; Richards et al., 1997, 1999; Mobini et al., 2000a, 2000b, 2002; Kheramin et al., 2002), the present experiment employed explicit discriminative stimuli during the pre-reinforcer delays (illumination of a lamp above the lever that had been depressed). It is likely that these intra-delay stimuli acquired conditioned reinforcing properties (Mazur, 1995, 1997). Indeed, Mazur (1997) has proposed that preference in delayed reinforcement schedules reflects the relative conditioned reinforcing efficacy of stimuli present during the pre-reinforcer delay. Mazur’s (1997) suggestion has particular relevance for the role of the OPFC in inter-temporal choice, because there is evidence that lesions of the OPFC can disrupt performance maintained by conditioned reinforcers (Otto and Eichenbaum, 1992; Pears et al., 2001). Further experiments are needed to determine whether diminished value of the intra-delay signals may account for the increased rate of delay and/or odds discounting seen in OPFC-lesioned rats (see also Mobini et al., 2002; Kheramin et al., 2002). The present results have some unexpected implications for the role of the OPFC in inter-temporal choice behaviour. Fig. 3 shows that the indifference delays obtained from the lesioned group were significantly longer than those of the sham-lesioned group under conditions of longer delays to reinforcer A. The indifference delay is often taken as an index of tolerance of delay of reinforcement (see Monterosso and Ainslie, 1999).
On this basis, it would seem that the OPFC lesion rendered the rats more tolerant of delay than the sham-lesioned group, a conclusion that appears paradoxical in the light of the extensive literature on the clinical effects of OPFC lesions (see Lishman, 1998) and experimental evidence indicating that such lesions may impair decision making based on the long-term consequences of voluntary action (Bechara et al., 2000). From the point of view of the multiplicative hyperbolic model, the paradox is explained by the combined effect of the lesion on K and H, increases in the values of the two discounting parameters tending to work in opposition when, as in the present experiment, delay and uncertainty are pitted against one another. To conclude, inter-temporal choice entails a trade-off between two or more features of reinforcers,
S. Kheramin et al. / Behavioural Processes 64 (2003) 239–250
for example, size versus delay, or delay versus uncertainty. Proportional preference and indifference points therefore reflect the operation of more than one discounting function. A biological intervention that alters more than one of these functions may exert opposite influences on choice (for example, either promoting or restraining ‘impulsive’ choice), depending on the particular values of each feature of each reinforcer. Single indifference points do not provide an adequate basis for inferring a change in a particular discounting function; however, such inferences are feasible when null equations are applied to the data from parametric experiments. The results of the parametric experiment reported here suggest that the OPFC contributes to the regulation of both delay and uncertainty discounting.
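The opposing influences of K and H on the indifference delay can be illustrated numerically. The sketch below is not part of the original analysis. It assumes reinforcers of equal size, so that value reduces to the product of the hyperbolic delay and odds terms described in the abstract, and it takes the odds against reinforcement to be (1 − P)/P. Under those assumptions, setting the values of A and B equal gives dB(50) = (1 + Hθ)·dA + Hθ/K, where θ is the odds against A. The function names and all parameter values are hypothetical and chosen only for illustration; they are not estimates from the present data.

```python
def odds_against(p):
    """Odds against reinforcement for probability p (assumed form: (1 - p) / p)."""
    return (1.0 - p) / p


def value(delay, p, K, H):
    """Value of a fixed-size reinforcer: product of hyperbolic delay and odds terms."""
    return 1.0 / ((1.0 + K * delay) * (1.0 + H * odds_against(p)))


def indifference_delay(d_a, K, H, p_a=0.5):
    """Delay to the certain reinforcer B at which B and A have equal value.

    Obtained by solving value(d_a, p_a) == value(d_b, 1.0) for d_b:
        d_b = (1 + H * theta) * d_a + H * theta / K
    """
    theta = odds_against(p_a)
    return (1.0 + H * theta) * d_a + H * theta / K


# Hypothetical (K, H) pairs, for illustration only.
CASES = {
    "baseline": (0.1, 1.0),
    "K doubled": (0.2, 1.0),
    "H doubled": (0.1, 2.0),
    "both doubled": (0.2, 2.0),
}

for label, (K, H) in CASES.items():
    delays = [indifference_delay(d_a, K, H) for d_a in (0.0, 5.0, 10.0)]
    print(label, [round(d, 2) for d in delays])
```

In this sketch, raising K alone shortens the indifference delay and raising H alone lengthens it. When both are raised in the same proportion, the intercept Hθ/K is unchanged while the slope 1 + Hθ increases, so the difference from baseline appears only as dA grows. A single indifference point therefore cannot separate the two parameters, whereas the slope and intercept of the linear function can.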
Acknowledgements
This work was supported by a grant from the Wellcome Trust to CMB and ES (University of Nottingham), and JFWD and IMA (University of Manchester). DNV-M was supported by grants from CONACYT (#37066-H) and Universidad Nacional Autónoma de México DGAPA (#229998). We are grateful to Ms. Victoria Pincott and Mr. R.W. Langley for skilled technical help.
References
Ainslie, G.W., 1975. Specious reward: a behavioral theory of impulsiveness and impulse control. Psychol. Bull. 82, 463–492.
Bechara, A., Damasio, H., Damasio, A.R., 2000. Emotion, decision making and the orbitofrontal cortex. Cereb. Cortex 10, 295–307.
Cardinal, R.N., Robbins, T.W., Everitt, B.J., 2000. The effects of d-amphetamine, chlordiazepoxide, α-flupenthixol and behavioural manipulations on choice of signalled and unsignalled delay of reinforcement in rats. Psychopharmacology 152, 362–375.
Cardinal, R.N., Pennicott, D.R., Sugathapala, C.L., Robbins, T.W., Everitt, B.J., 2001. Impulsive choice induced in rats by lesions of the nucleus accumbens core. Science 292, 2499–2501.
Evenden, J.L., Ryan, C.N., 1996. The pharmacology of impulsive behaviour in rats: the effects of drugs on response choice with varying delays of reinforcement. Psychopharmacology 128, 161–170.
Evenden, J.L., Ryan, C.N., 1999. The pharmacology of impulsive behaviour in rats. VI. The effects of ethanol and sedative serotonergic drugs on response choice with varying delays of reinforcement. Psychopharmacology 146, 413–421.
Gibbon, J., Church, R.M., Fairhurst, S., Kacelnik, A., 1988. Scalar expectancy and choice between delayed rewards. Psychol. Rev. 95, 102–114.
Grace, R.C., 1994. A contextual model of concurrent chains choice. J. Exp. Anal. Behav. 61, 113–129.
Green, L., Myerson, J., 1996. Exponential versus hyperbolic discounting of delayed outcomes: risk and waiting times. Am. Zool. 36, 496–505.
Herrnstein, R.J., 1981. Self-control as response strength. In: Bradshaw, C.M., Szabadi, E., Lowe, C.F. (Eds.), Quantification of Steady-State Operant Behaviour. Elsevier, Amsterdam, pp. 3–20.
Ho, M.-Y., Bradshaw, C.M., Szabadi, E., 1997. Choice between delayed reinforcers: interaction between delay and deprivation level. Q. J. Exp. Psychol. 50B, 193–202.
Ho, M.-Y., Mobini, S., Chiang, T.-J., Bradshaw, C.M., Szabadi, E., 1999. Theory and method in the quantitative analysis of “impulsive choice” behaviour: implications for psychopharmacology. Psychopharmacology 146, 362–372.
Kesner, R.P., 2000. Subregional analysis of mnemonic functions of the prefrontal cortex in the rat. Psychobiology 8, 219–228.
Kheramin, S., Body, S., Ho, M.-Y., Velazquez-Martinez, D.N., Bradshaw, C.M., Szabadi, E., Deakin, J.F.W., Anderson, I.M., 2002. Effects of quinolinic acid-induced lesions of the orbital prefrontal cortex on inter-temporal choice: a quantitative analysis. Psychopharmacology 165, 9–17.
Kirby, K.N., Maraković, A., 1996. Delay-discounting probabilistic rewards: rates decrease as amounts increase. Psychon. Bull. Rev. 3, 100–104.
Lishman, W.A., 1998. Organic Psychiatry, 3rd ed. Blackwell Science, Oxford.
Logue, A.W., Tobin, H., Chelonis, J.J., Wang, R.Y., Geary, N., Schachter, S., 1992. Cocaine decreases self-control in rats: a preliminary report. Psychopharmacology 109, 245–247.
Mazur, J.E., 1987. An adjusting procedure for studying delayed reinforcement. In: Commons, M.L., Mazur, J.E., Nevin, J.A., Rachlin, H.C. (Eds.), Quantitative Analyses of Behavior: The Effect of Delay and Intervening Events on Reinforcement Value, vol. 5. Erlbaum, Hillsdale, NJ, pp. 55–73.
Mazur, J.E., 1988. Choice between small certain and large uncertain reinforcers. Anim. Learn. Behav. 16, 199–205.
Mazur, J.E., 1995. Conditioned reinforcement and choice with delayed and uncertain primary reinforcers. J. Exp. Anal. Behav. 63, 139–150.
Mazur, J.E., 1997. Choice, delay, probability, and conditioned reinforcement. Anim. Learn. Behav. 25, 131–147.
Mitchell, S.H., 1999. Measures of impulsivity in cigarette smokers and non-smokers. Psychopharmacology 146, 455–464.
Mobini, S., Chiang, T.-J., Al-Ruwaitea, A.S.A., Ho, M.-Y., Bradshaw, C.M., Szabadi, E., 2000a. Effects of central 5-hydroxytryptamine depletion on inter-temporal choice: a quantitative analysis. Psychopharmacology 149, 313–318.
Mobini, S., Chiang, T.-J., Ho, M.-Y., Bradshaw, C.M., Szabadi, E., 2000b. Effects of central 5-hydroxytryptamine depletion on sensitivity to delayed and probabilistic reinforcement. Psychopharmacology 152, 390–397.
Mobini, S., Body, S., Ho, M.-Y., Bradshaw, C.M., Szabadi, E., Deakin, J.F.W., Anderson, I.M., 2002. Effects of lesions of the orbitofrontal cortex on sensitivity to delayed and probabilistic reinforcement. Psychopharmacology 160, 290–298.
Monterosso, J., Ainslie, G.W., 1999. Beyond discounting: possible experimental models of impulse control. Psychopharmacology 146, 339–347.
Morgan, M.A., LeDoux, J.E., 1999. Contribution of ventrolateral prefrontal cortex to the acquisition and extinction of conditioned fear in rats. Neurobiol. Learn. Mem. 72, 244–251.
Myerson, J., Green, L., 1995. Discounting of delayed rewards: models of individual choice. J. Exp. Anal. Behav. 64, 263–276.
Ostaszewski, P., Green, L., Myerson, J., 1998. Effects of inflation on the subjective value of delayed and probabilistic rewards. Psychon. Bull. Rev. 5, 324–333.
Otto, T., Eichenbaum, H., 1992. Complementary roles of the orbital prefrontal cortex and perirhinal-entorhinal cortices in an odor-guided delayed-nonmatching-to-sample task. Behav. Neurosci. 106, 762–775.
Paxinos, G., Watson, C., 1998. The Rat Brain in Stereotaxic Coordinates. Academic Press, New York.
Pears, A., Parkinson, J.A., Everitt, B.J., Roberts, A.C., 2001. Effects of orbitofrontal cortex lesions on responding with conditioned reinforcement. Brain Cogn. 47, 44–46.
Rachlin, H., 1974. Self-control. Behaviorism 2, 94–107.
Rachlin, H., 1995. Self-control: beyond commitment. Behav. Brain Sci. 18, 109–159.
Rachlin, H., Raineri, A., 1992. Irrationality, impulsiveness and selfishness as discount reversal effects. In: Loewenstein, G., Elster, J. (Eds.), Choice Over Time. Russell Sage, New York.
Rachlin, H., Logue, A.W., Gibbon, J., Frankel, M., 1986. Cognition and behavior in studies of choice. Psychol. Rev. 93, 33–45.
Rachlin, H., Raineri, A., Cross, D., 1991. Subjective probability and delay. J. Exp. Anal. Behav. 55, 233–244.
Richards, J.B., Mitchell, S.H., de Wit, H., Seiden, L.S., 1997. Determination of discount functions in rats with an adjusting amount procedure. J. Exp. Anal. Behav. 67, 353–366.
Richards, J.B., Sabol, K.E., de Wit, H., 1999. Effects of methamphetamine on the adjusting amount procedure: a model of impulsive behavior in rats. Psychopharmacology 146, 432–439.
Schoenbaum, G., Chiba, A.A., Gallagher, M., 1998. Orbitofrontal cortex and basolateral amygdala encode expected outcomes during learning. Nat. Neurosci. 1, 155–159.
Tomie, A., Aguado, A.S., Pohorecky, L.A., Benjamin, D., 1998. Ethanol induces impulsive-like responding in a delay-of-reward operant choice procedure: impulsivity predicts autoshaping. Psychopharmacology 139, 376–382.
Uylings, H.B.M., van Eden, C.G., 1990. Qualitative and quantitative comparison of the prefrontal cortex in rat and in primates, including humans. In: Uylings, H.B.M., van Eden, C.G., De Bruin, J.P.C., Corner, M.A., Feenstra, M.G.P. (Eds.), Progress in Brain Research, vol. 85. Elsevier, Amsterdam, pp. 31–62.
Wade, T.R., de Wit, H., Richards, J.B., 2000. Effects of dopaminergic drugs on delayed reward as a measure of impulsive behavior in rats. Psychopharmacology 150, 90–101.
Wogar, M.A., Bradshaw, C.M., Szabadi, E., 1993. Effects of lesions of the ascending 5-hydroxytryptaminergic pathways on choice between delayed reinforcers. Psychopharmacology 113, 239–243.
Wolff, M.C., Leander, J.D., 2002. Selective serotonin reuptake inhibitors decrease impulsive behavior as measured by an adjusting delay procedure in the pigeon. Neuropsychopharm. 27, 421–429.