Available online at www.sciencedirect.com
Behavioural Processes 78 (2008) 217–223
Short communication
Rapid acquisition in concurrent chains: Effects of initial-link duration
Darren R. Christensen ∗, Randolph C. Grace ∗
University of Canterbury, Department of Psychology, Private Bag 4800, Christchurch, New Zealand
Received 15 September 2007; accepted 9 January 2008
Abstract
Pigeons were trained in a concurrent chains procedure in which the terminal-link schedules in each session were either fixed-interval (FI) 10 s FI 20 s or FI 20 s FI 10 s, as determined by a pseudorandom binary series. The initial-link was a variable-interval (VI) 10-s schedule. Training continued until initial-link response allocation stabilized about midway through each session and was sensitive to the terminal-link immediacy ratio in that session. The initial-link schedule was then varied across sessions between VI 0.01 s and VI 30 s according to an ascending and descending sequence. Initial-link response allocation was a bitonic function over the full range of durations. Preference for the FI 10-s terminal-link at first increased as programmed initial-link duration varied from 0.01 to 7.5 s, and then decreased as initial-link duration increased to 30 s. The bitonic function poses a potential challenge for existing models for steady-state choice, such as delay-reduction theory (DRT) [Fantino, E., 1969. Choice and rate of reinforcement. J. Exp. Anal. Behav. 12, 723–730], which predict a monotonic function. However, an extension of Grace and McLean’s [Grace, R.C., McLean, A.P., 2006. Rapid acquisition in concurrent chains: evidence for a decision model. J. Exp. Anal. Behav. 85, 181–202] decision model predicted the bitonic function, and may ultimately provide an integrated account of choice in concurrent chains under both steady-state and dynamic conditions.
© 2008 Published by Elsevier B.V.
Keywords: Concurrent chains; Initial links; Acquisition; Reinforcer immediacy; Key peck; Pigeons
∗ Corresponding authors: D.R. Christensen, R.C. Grace.
0376-6357/$ – see front matter © 2008 Published by Elsevier B.V. doi:10.1016/j.beproc.2008.01.006

1. Introduction
Most previous studies on concurrent chains have used steady-state procedures in which training continues with a given pair of terminal-link schedules until response allocation in the initial links has stabilized. Typically this requires 20 or more sessions, after which the terminal-link schedules are changed and training begins in a new condition (e.g., Fantino and Davison, 1983; Grace, 2002). However, recent studies have shown that subjects’ response allocation can adjust rapidly when terminal-link schedules are changed frequently across sessions (Grace et al., 2003; Grace and McLean, 2006). For example, Kyonka and Grace (2007) exposed pigeons to a concurrent chains procedure in which the terminal-link schedules in each session were either fixed-interval (FI) 10 s FI 20 s or FI 20 s FI 10 s, determined by a pseudorandom binary series, and the initial-link was a variable-interval (VI) 8-s schedule that arranged equal access to the terminal links. After pigeons had received about 50 sessions of training, response allocation stabilized about midway through each session, showing strong sensitivity to the terminal-link delays in the current session with virtually no influence of prior sessions.
Grace and McLean (2006) proposed a model to account for performance in such rapid acquisition experiments in which the terminal links are frequently changed. They assumed that a process similar to categorical discrimination underlies the acquisition of choice. According to their model, when reinforcement is obtained in a terminal-link, subjects make a ‘decision’ whether the delay to food in the terminal-link was short or long relative to the history of reinforcer delays experienced in both terminal links. If the delay was short, the response strength for the associated initial-link increases, whereas if the delay was long the response strength for the initial-link decreases. Reinforcement history is represented by a log normal distribution with a mean equal to the log geometric mean of the prior delays associated with both terminal links (the ‘criterion’), and standard deviation σ. This contrasts with theories that assume separate ‘memories’ for delays from each alternative (e.g., Gallistel and Gibbon, 2000; Gibbon et al., 1988). The probability of a ‘short’ decision is determined by calculating the complement of the cumulative distribution function for reinforcement
Fig. 1. Preference for the shorter (FI 10 s) terminal-link predicted by the extension of Grace and McLean’s (2006) decision model, as the initial-link duration increases from 5 to 30 s. See text for more explanation.
history, with the percentage of values that are greater than the delay giving the probability that the delay is judged ‘short’. Grace and McLean showed that the decision model provided a good account of their data, and was able to predict initial-link responding that approximated generalized matching or categorical, ‘all or none’ preference. When σ was relatively large, log response allocation was a linear function of the log immediacy ratio (generalized matching), whereas log response allocation was a nonlinear function of the log immediacy ratio (categorical preference) when σ was relatively small. One limitation of Grace and McLean’s model is that it does not address effects of initial-link duration on response allocation. Such effects are well known in steady-state research: preference between a constant pair of terminal links becomes less extreme as the initial-link duration is increased (e.g., Fantino, 1969). Models for choice, such as delay-reduction theory (DRT; Fantino and Romanowich, 2007; Squires and Fantino, 1971), the contextual choice model (CCM; Grace, 1994) and the hyperbolic value-added model (HVA; Mazur, 2001) all predict that preference decreases towards indifference as initial-link duration increases. However, relatively little is known about effects of initial-link duration on acquisition of preference (but cf. Berg and Grace, 2006), and no previous experiments using procedures in which terminal-link schedules are changed unpredictably across sessions have manipulated initial-link duration. A simple way to extend Grace and McLean’s (2006) model to include effects of initial-link duration is to view reinforcement history as a series of transitions between stimuli correlated with reward, of which the onset of the terminal links and reinforcer delivery are only two examples. 
Specifically, we assume that the reinforcement history distribution includes delays between onset of the initial links and terminal-link entry, as well as between terminal-link onset and food delivery. As a result, the criterion (i.e., the mean of the reinforcement history distribution) varies directly with initial-link duration. Fig. 1 shows representative predictions of the decision model for preference between FI 10 s and FI 20 s terminal links, as the initial-link duration
increases from 5 s to 30 s.¹ Over most of the range of initial-link durations – between 10 s and 30 s – the predicted preference becomes less extreme as the initial links increase. The reason is that the relative probability of a ‘short’ decision for the terminal links, that is, p(‘short’|10 s)/p(‘short’|20 s), decreases as the initial-link duration increases (for a detailed explanation of the model’s predictions, see Appendix A). As a result, including the time spent in the initial links in the calculation of the criterion allows the decision model to account for the initial-link effect. However, Fig. 1 also shows that for short initial-link durations, there is a range over which preference becomes less extreme as the initial links decrease. This decrease occurs because the relative probability of a ‘short’ decision for the FI 10-s terminal-link decreases as the criterion becomes less than the terminal-link delays. Thus, the overall function relating preference to initial-link duration is bitonic. This bitonic function is robust with respect to variation in the parameter values, and represents a novel prediction of the decision model: existing models for steady-state choice such as CCM (Grace, 1994), HVA (Mazur, 2001), and DRT (Fantino, 1969) predict that choice is a monotonic decreasing function of initial-link duration.
The plan of the current study was to train pigeons in a rapid-acquisition concurrent chains procedure with a constant pair of terminal-link schedules (FI 10 s and FI 20 s). Once pigeons were showing sensitivity to the immediacy ratio in the current session, we varied the initial-link schedule across sessions, between 0.01 s and 30 s, in ascending and descending series.
The major question was whether the stable level of response allocation reached within sessions would vary with initial-link duration, and whether that function would be monotonic decreasing (as predicted by current models for steady-state choice), or a bitonic function as predicted by the extension of the Grace and McLean decision model.

¹ For the predictions in Fig. 1, parameter values for the decision model were σ = 0.15, MaxRS = 1.00, MinRS = 0.01.

2. Materials and methods
2.1. Subjects
Six pigeons of mixed breed, numbered 181–186, served as subjects and were maintained at 85% of their free-feeding weight ± 15 g through appropriate post-session feeding. Subjects were housed individually in a vivarium with a 12 h:12 h light/dark cycle (lights on at 06:00 h), with water and grit freely available in the home cages. All pigeons were experienced with a variety of experimental procedures.
2.2. Apparatus
Four standard three-key operant chambers, 32 cm deep × 34 cm wide × 34 cm high, were used. The keys were 21 cm above the floor and arranged in a row. In each chamber a house light located above the center key provided general illumination, and a grain magazine had an aperture centered 6 cm above the floor. The magazine was illuminated when wheat was made available. A force of approximately 0.15 N was necessary to operate each key. Each chamber was enclosed in a sound-attenuating box, and ventilation and white noise were provided by an attached fan. Experimental events were controlled and data recorded through a microcomputer and MED-PC® interface located in an adjacent room.
2.3. Procedure
All pigeons started training immediately in a concurrent chains procedure. The house light provided general illumination at all times except during reinforcer delivery. With few exceptions, sessions were run daily and at approximately the same time (12:00 h). Sessions ended after 72 initial- and terminal-link cycles or 70 min, whichever occurred first. At the start of a cycle, the side keys were illuminated white to signal the initial links. An entry was assigned pseudorandomly to the left or right terminal-link with the constraint that in every six cycles, three entries occurred to each terminal-link. An initial-link response produced an entry into a terminal-link provided that: (a) it was made to the pre-selected key; (b) an interval selected from the initial-link schedule had timed out; and (c) a 1-s changeover delay (COD) was satisfied, i.e., at least 1 s had elapsed following a changeover to the side for which terminal-link entry was arranged. The COD was reset prior to each cycle, so that the first response in each cycle was considered a changeover to that alternative. A single VI schedule operated during the initial links. The initial-link VI schedule contained 12 intervals constructed from an exponential progression (Fleshler and Hoffman, 1962).
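As an illustration of the exponential progression cited above, the following Python sketch generates 12 intervals for a VI 10-s schedule. It uses the form of the Fleshler and Hoffman (1962) progression that is commonly given in the literature, with the mean interval as the scale factor. The function name `fleshler_hoffman` is hypothetical, and the exact constants used in the present experiment are not reported, so this is an assumption-laden sketch and not the authors' implementation.

```python
import math

def fleshler_hoffman(mean_interval, n=12):
    """Return n intervals (ascending) whose arithmetic mean is mean_interval.

    ASSUMPTION: commonly cited form of the progression,
    t_i = mean * [1 + ln n + (n - i) ln(n - i) - (n - i + 1) ln(n - i + 1)].
    """
    intervals = []
    for i in range(1, n + 1):
        k = n - i
        # k * ln(k) is taken as 0 when k == 0 (its limiting value)
        term_k = k * math.log(k) if k > 0 else 0.0
        term_k1 = (k + 1) * math.log(k + 1)
        intervals.append(mean_interval * (1 + math.log(n) + term_k - term_k1))
    return intervals

# Hypothetical example: the VI 10-s initial-link schedule of Phase 1
vi10 = fleshler_hoffman(10.0, n=12)
```

In a session, a list like `vi10` would be sampled without replacement, separately for left and right terminal-link cycles.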
Separate lists of intervals were maintained for cycles in which the left or right terminal-link had been selected, and were sampled without replacement so that all 12 intervals would be used three times for both the left and right terminal links each session. When a terminal-link was entered, the color of the side key was changed (left key to red, right key to green) while the other key was darkened. Terminal-link responses were reinforced according to FI schedules. The terminal-link schedules were always FI 10 s and FI 20 s, but the location of the richer schedule varied across sessions according to a 31-step pseudorandom binary series similar to that used by Hunter and Davison (1985). When a response was reinforced all lights in the chamber were extinguished, the grain magazine raised and illuminated for 3 s, and then the next cycle began. The experiment consisted of two phases, which varied only in terms of the initial-link schedule. In Phase 1, the initial-link schedule was always VI 10 s. The first phase was terminated individually for each pigeon when regression analyses showed that response allocation during the last 20 sessions showed strong sensitivity to the immediacy ratio in the current session (i.e., a0 in Eq. (1) >1.5), and negligible position bias (log b in Eq. (1) <0.10). Training consisted of 150, 193, 92, 193, 78, and 87 sessions for pigeons 181–186, respectively.
In Phase 2, the initial-link schedule changed across sessions according to an ascending and descending sequence. The sequence varied from 0.01 to 30 s in 17 equally spaced steps. The schedule values were: 0.01, 1.88, 3.75, 5.63, 7.50, 9.38, 11.25, 13.13, 15.00, 16.90, 18.75, 20.63, 22.50, 24.38, 26.25, 28.13, and 30 s. Because the initial-link schedule had been VI 10 s in Phase 1, training began in the middle of the descending (9.38, 7.5, 5.63, etc.) or ascending (11.25, 13.13, 15.00, etc.) sequence, counterbalanced across birds. When the limit of either sequence was reached, the direction was reversed and the other sequence began with the same values in opposite order (e.g., ..., 24.38, 26.25, 28.13, 30, 28.13, 26.25, 24.38, etc.). For the sake of convenience, 0.01 s and 30 s were assigned to the descending and ascending sequences, respectively. Statistical analyses which compared the ascending and descending sequences were based only on those values which were common to both sequences (i.e., 0.01 s and 30 s were excluded). Training in Phase 2 continued until all pigeons had completed the full ascending and descending sequences at least two times each. Because some pigeons completed training in Phase 1 earlier than others, the number of sessions in Phase 2 varied. The total number of sessions completed in Phase 2 was 75, 79, 166, 80, 165, and 167 sessions for Pigeons 181–186, respectively.

3. Results
To assess the relationship between response allocation and the immediacy ratios in the current and prior sessions, we used a generalized-matching model:
\[
\log\frac{B_{0L}}{B_{0R}} = a_0 \log\frac{1/D_{0L}}{1/D_{0R}} + a_1 \log\frac{1/D_{1L}}{1/D_{1R}} + a_2 \log\frac{1/D_{2L}}{1/D_{2R}} + \cdots + \log b, \tag{1}
\]
where B and D refer to initial-link response rate and terminal-link delay, respectively, subscripted for choice alternative (L and R) and lag (0–4; 0 = current session). The parameters $a_0, \ldots, a_4$ quantify sensitivity to reinforcer immediacy (i.e., reciprocal of delay) at each lag, and log b is a bias parameter. Eq. (1) was applied to the data for individual subjects from the last 30 sessions of Phase 1, and all of the sessions from Phase 2. The upper panel of Fig. 2 shows results when responses during the second half of each session were analyzed with Eq. (1). For all subjects in both phases, lag 0 coefficients were positive and statistically significant, and none of the higher lag coefficients reached significance. A repeated-measures analysis of variance (ANOVA) found a significant effect of lag, F(4, 20) = 110.88, p < .001, but the effects of phase and the lag × phase interaction did not reach significance. Averaged across subjects, lag 0 sensitivity to immediacy was 2.42 (S.E. = 0.26) in Phase 1 and 2.35 (S.E. = 0.20) in Phase 2. This shows that the responding in the second half of the sessions was determined by the immediacy ratio in the current session, with little or no effect of immediacy ratios from previous sessions. To examine the acquisition of preference within sessions, Eq. (1) was applied to data from each of the six blocks of cycles.
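The lagged regression in Eq. (1) can be sketched in Python as follows. Everything in this example is hypothetical: the 31-step binary series is produced by a simple 5-bit shift register (the series actually used is described only as similar to that of Hunter and Davison, 1985), the log response ratios are simulated without noise from made-up coefficients, lags wrap around the series for simplicity, and ordinary least squares is assumed as the fitting method. The point is only to show how the predictors are built and that the sensitivities and bias are recoverable.

```python
import math

def msequence(n_steps=31):
    """Hypothetical 31-step pseudorandom binary series (5-bit shift register)."""
    bits = [1, 0, 0, 0, 0]
    while len(bits) < n_steps:
        bits.append(bits[-5] ^ bits[-3])  # s[n] = s[n-5] XOR s[n-3]
    return bits[:n_steps]

def solve(A, y):
    """Solve the square system A x = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

# Log immediacy ratio when the left terminal link is FI 10 s: log[(1/10)/(1/20)]
LOG_RATIO = math.log10(20 / 10)
series = msequence()
x = [LOG_RATIO if bit else -LOG_RATIO for bit in series]

TRUE_A = [2.4, 0.1, 0.0, 0.0, 0.0]  # made-up sensitivities a_0..a_4
TRUE_LOG_B = 0.05                   # made-up bias

n = len(x)
# One row per session: lag 0-4 predictors (wrapping around) plus an intercept
X = [[x[(t - lag) % n] for lag in range(5)] + [1.0] for t in range(n)]
y = [sum(a * xi for a, xi in zip(TRUE_A, row[:5])) + TRUE_LOG_B for row in X]

fit = ols(X, y)  # [a_0, a_1, a_2, a_3, a_4, log b]
```

With real data, `y` would hold the observed log initial-link response ratios for each session, and the fitted lag 0 coefficient corresponds to the sensitivity values reported in the text.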
Fig. 2. The upper panel shows average sensitivity coefficients for lag 0 through lag 4 log immediacy ratios, for both Phase 1 (filled symbols) and Phase 2 (unfilled symbols). The lower panel shows average lag 0 sensitivity coefficients determined separately for each block of 12 trials within sessions, for both Phase 1 (filled symbols) and Phase 2 (unfilled symbols). The dashed line represents zero sensitivity. Bars indicate ±1 S.E.
The lower panel of Fig. 2 shows the results averaged across subjects. A repeated-measures ANOVA found a significant effect of block, F(5, 25) = 106.14, p < .001, but the effects of phase and the block × phase interaction did not reach significance. Fig. 2 shows that response allocation reached stability approximately halfway through the session, consistent with previous studies using similar rapid-acquisition procedures (Kyonka and Grace, 2007, 2008; Grace et al., 2003). Averaged across subjects, the lag 0 sensitivity to immediacy in the second half of the session was 2.43 (S.E. = 0.26) in Phase 1 and 2.36 (S.E. = 0.20) in Phase 2. These results show that initial-link response allocation during the last 30 sessions of Phase 1 and throughout Phase 2 showed strong sensitivity to the immediacy ratio in the current session, consistent with prior studies. Because the initial-link schedule was changing systematically in Phase 2, this suggests that there was no effect of varying the initial-link schedule from VI 0.01 to VI 30 s on overall sensitivity compared with the VI 10-s schedule used in Phase 1.
Fig. 3. The upper panel shows obtained average initial-link duration plotted as a function of the programmed value, for both ascending (filled symbols) and descending (unfilled symbols) sequences. The bottom panel shows obtained average initial-link response allocation for the FI 10 s terminal links from the second half of the session, plotted as a function of programmed initial-link duration for both ascending (filled symbols) and descending (unfilled symbols) sequences.
The upper panel of Fig. 3 shows the obtained average time spent in the initial links per cycle as a function of programmed initial-link duration in Phase 2, averaged across subjects. Overall, obtained initial-link duration increased linearly with programmed duration: the equation y = 0.87x + 8.36 accounted for over 99% of the variance in the data pooled across both series. A repeated-measures ANOVA confirmed the effect of programmed initial-link value, F(14, 70) = 410.82, p < .001, but found that obtained initial-link durations were greater overall for the ascending than the descending sequence, F(1, 5) = 7.33, p < .05. The interaction was not significant. This shows that obtained initial-link duration increased linearly with programmed duration, and that there was a minimum averaged obtained duration of approximately 8.36 s. This minimum duration occurred because of the dependent scheduling arrangement used for the initial links, which required the pigeon to make responses to the non-preferred alternative.
The major goal of the present research was to determine whether response allocation depended on initial-link duration. For this analysis we used data from the second half of each session, when response allocation had stabilized. To analyze data from sessions with FI 10 s FI 20 s and FI 20 s FI 10 s on a common scale, we subtracted bias estimates (log b in Eq. (1)) from the log initial-link response ratio for each session, and then took the absolute value. For each pigeon, data were averaged across replications of each programmed initial-link schedule value in both sequences. The bottom panel of Fig. 3 shows the resulting average log response allocation as a function of programmed initial-link duration, for both ascending and descending sequences. Response allocation was a bitonic function of initial-link duration for both sequences. A repeated-measures ANOVA found a significant effect of initial-link duration, F(14, 70) = 16.94, p < .001, and a significant interaction between initial-link duration and sequence, F(14, 70) = 3.49, p < .001. The main effect of sequence was not significant. Planned polynomial contrasts found significant linear, quadratic, and cubic trends for initial-link duration, F(1, 5) = 133.61, p < .001, F(1, 5) = 7.90, p < .05, and F(1, 5) = 36.60, p < .01, respectively. The linear trend represents the classic ‘initial-link effect’: response allocation became less extreme overall as initial-link duration increased. The quadratic and cubic components confirm that the nonmonotonicity of the function in the bottom panel of Fig. 3 was significant. As programmed initial-link duration increased from 0.01 s, response allocation became more extreme at first, but then decreased as initial-link duration continued to increase. The cubic component resulted from the minimum preference during the descending sequence being reached at 24.38 s. The significant interaction between initial-link duration and sequence occurred because for relatively long initial-link durations, response allocation during the ascending sequence tended to be greater than during the descending sequence, whereas the opposite was obtained for relatively short durations.
In effect, the bitonic function for the ascending sequence as a whole was shifted to the right compared with the descending sequence. This horizontal displacement suggests that a hysteresis or lag effect was present. For relatively long initial-link durations, the more extreme preferences for the ascending compared to descending sequence reflected the impact of initial-link durations from the preceding sessions, which were relatively shorter for the ascending sequence. Conversely, for relatively short initial-link durations, the less extreme preference evident in the ascending series could have occurred because the initial-link durations from preceding sessions were shorter. Thus, although there was no effect of the immediacy ratio from prior sessions on the sensitivity to reinforcer immediacy, there was an effect of initial-link duration.

4. Discussion
Our goal was to explore how response allocation in a rapid acquisition concurrent chains procedure varied when initial-link duration was changed according to an ascending and descending sequence. In particular, we were interested to test whether the relationship between response allocation and initial-link duration was monotonic, as predicted by steady-state models for choice (Fantino, 1969; Grace, 1994; Mazur, 2001), or bitonic, as predicted by an extension of Grace and McLean’s (2006) decision model (see Fig. 1). Results showed that the preference for the FI 10 s terminal-link increased for programmed initial-link durations in the range of approximately 0.01–7.5 s, and decreased from 7.5 to 30 s (see Fig. 3, lower panel). Thus the relationship between response allocation and initial-link duration was bitonic, consistent with Grace and McLean’s decision model but contrary to existing models for steady-state choice.
The bitonic relationship is explained by Grace and McLean’s (2006) decision model as the result of changes in a criterion value. According to their model, preference develops in concurrent chains as subjects make ‘decisions’ after reinforcement is delivered in a terminal-link about whether the preceding delay was short or long, relative to the history of delays that they had previously experienced. The criterion is the mean of the distribution used to compute the probability of a ‘short’ decision, and was calculated as the average of the log terminal-link delays in Grace and McLean’s (2006) original model. Here, we have assumed that the criterion is determined by the intervals between all stimuli correlated with reinforcement. This means that delay between initial-link onset and terminal-link entry contributes to the criterion. Because the terminal-link schedules were constant throughout the experiment, the criterion thus varied directly with initial-link duration. The strength of preference predicted by the decision model is determined by the relative probability of a ‘short’ decision for the terminal links. This ratio is greatest when the criterion is near the geometric mean of the terminal-link delays. Consequently, the decision model attributes the bitonic relationship in Fig. 3 to a reduction in ability to discriminate the shorter terminal-link delay as the criterion becomes short or long relative to the terminal-link midpoint.
Although the programmed initial-link duration varied from 0.01 to 30 s, the obtained initial-link durations must determine the criterion and hence preference according to the decision model. However, programmed rather than obtained values were used as the independent variable to test the relationship between preference and initial-link duration, because obtained initial-link duration depends on the subjects’ behavior and thus is problematic as an independent variable. The upper panel of Fig. 3 shows that obtained initial-link duration increased linearly with the programmed duration; however, the minimum value was 8.36 s, much longer than the programmed value of 0.01 s. This is a consequence of the scheduling procedure used to equate terminal-link entries over the session: because the initial-link that would result in a terminal-link entry was preselected on each trial, any time that subjects spent responding on the other alternative after the initial-link schedule had elapsed would increase the obtained initial-link duration beyond the programmed value. Moreover, the stronger the preference in a given session, the more likely it was that subjects would be responding on the preferred alternative when a terminal-link was arranged for the other schedule, which could produce an artifactual relationship between response allocation and obtained initial-link duration. However, note that the programmed duration which produced the maximum preference – 7.5 s – was associated with an obtained duration of approximately 15 s (see Fig. 3). Because this is close to the geometric mean of the terminal links (14.14 s), the resulting criterion (also near the geometric mean) would be
expected to generate a near-maximal preference according to the decision model.
It is important to note that the downturn in preference at short programmed initial-link durations could not have resulted from a ‘win stay’ strategy in which entry to the richer terminal-link might have been produced by a single response in the initial cycle following a reinforcer in the same terminal-link. Because the COD was reset at the beginning of each cycle, there was a minimum of 1 s between the first response in a cycle and that which produced terminal-link entry. To examine whether there was evidence for sequential dependency in the location of the first response in each cycle, we conducted analyses (not reported here) of the probability of responding to the richer alternative conditional on the location of the previous terminal-link. The probability that the first response in a cycle was made to the richer alternative did not depend on the location of the previous terminal-link, or on the initial-link duration.
Because these results are novel and appear to challenge such well-established models as delay-reduction theory (DRT; Squires and Fantino, 1971), the contextual choice model (CCM; Grace, 1994) and the hyperbolic value-added model (HVA; Mazur, 2001), all of which have substantial empirical support, it is important to test their generality. In particular, future research needs to determine whether similar results would be obtained under steady-state conditions, with independent initial links and other terminal-link schedules. The former manipulation is especially important because independent initial links permit response allocation to go exclusively in favor of one alternative, avoiding possible ceiling effects associated with dependent scheduling. If the bitonic function in Fig. 3 is shown to have generality, then Grace and McLean’s (2006) decision model would have an important advantage over existing models.
More broadly, the promise of the decision model is that it may be able to provide an integrated account of choice in concurrent chains under both dynamic and steady-state conditions.

Appendix A. Predictions of the extended decision model
The basic assumption of the model is that after reinforcement is delivered in a terminal-link, subjects make a ‘decision’ (categorical discrimination) about whether the preceding delay was short or long relative to a criterion. If the delay was short, the strength of responding to the initial-link that led to the terminal-link increases; if the delay was long, response strength decreases. Changes in response strength are made according to a linear-operator rule:
\[
RS_{N+1} = RS_N + p_s\,(\mathrm{MaxRS} - RS_N)\,\Delta - (1 - p_s)\,(RS_N - \mathrm{MinRS})\,\Delta, \tag{A1}
\]
in which RS is response strength (subscripted N + 1 or N to indicate trial number), $p_s$ is the probability of the delay being judged short, MaxRS and MinRS are the maximum and minimum response strengths, and Δ is a learning rate parameter. The asymptotic response strength (i.e., after repeated exposure to the same delay) may be obtained by setting Δ = 1 and performing some algebraic manipulation:
\[
RS_{\mathrm{asymp}} = p_s\,\mathrm{MaxRS} + (1 - p_s)\,\mathrm{MinRS}. \tag{A2}
\]

Fig. A1. Log initial-link response allocation (filled squares, right axis) predicted by the extended decision model as a function of initial-link duration. Data points indicated by ×’s and +’s show the probabilities of a terminal-link delay judged short ($p_s$, left axis) relative to the criterion for the FI 10 and FI 20 schedules, respectively.
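The relation between the linear-operator rule in Eq. (A1) and the asymptote in Eq. (A2) can be checked numerically with a short Python sketch. The function names, the starting response strength, the value of $p_s$ and the learning rate below are arbitrary illustrative choices and are not taken from the experiment.

```python
def update(rs, ps, delta, max_rs=1.0, min_rs=0.01):
    """One application of the linear-operator rule, Eq. (A1)."""
    return rs + ps * (max_rs - rs) * delta - (1 - ps) * (rs - min_rs) * delta

def asymptote(ps, max_rs=1.0, min_rs=0.01):
    """Asymptotic response strength, Eq. (A2)."""
    return ps * max_rs + (1 - ps) * min_rs

# Illustrative values only: start at RS = 0.5 with p_s = 0.7 and delta = 0.1
ps = 0.7
rs = 0.5
for _ in range(200):
    rs = update(rs, ps, delta=0.1)

# With delta = 1 a single update lands on the asymptote from any starting value
one_step = update(0.2, ps, delta=1.0)
```

Because each update moves RS a fraction Δ of the way toward the value in Eq. (A2), repeated application with any Δ between 0 and 1 approaches the same asymptote.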
Eq. (A2) shows that the predicted asymptotic response strength is a weighted average of MaxRS and MinRS, with the weights given by the probabilities that the delay is judged short or long, respectively. The initial-link response allocation predicted by the model is calculated as the ratio of the response strengths for the two alternatives:
\[
\frac{B_L}{B_R} = \frac{RS_{\mathrm{asymp}L}}{RS_{\mathrm{asymp}R}} = \frac{p_{sL}\,\mathrm{MaxRS} + (1 - p_{sL})\,\mathrm{MinRS}}{p_{sR}\,\mathrm{MaxRS} + (1 - p_{sR})\,\mathrm{MinRS}}, \tag{A3}
\]
where B is initial-link response rate and subscripts L and R indicate the choice alternatives. The model assumes that the preceding delay to terminal-link reinforcement is compared to the history of delays between stimulus transitions in the procedure. These delays include those between initial-link onset and terminal-link entry, as well as those between terminal-link entry and reinforcement. All delays are scaled logarithmically. The history of delays is represented as a normal distribution with a mean equal to the average of all intervals between stimulus transitions, and standard deviation σ. We refer to the mean as the criterion, and σ is a parameter in the model. The probability that the preceding delay is judged as ‘short’ is then calculated as the probability that a randomly selected delay from the distribution is greater than the preceding delay:
\[
p_s = 1 - \Phi(\log D, \log C, \sigma), \tag{A4}
\]
where Φ is the cumulative normal distribution with mean log C (criterion) and standard deviation σ, evaluated for the preceding delay (log D) (Fig. A1). Fig. A1 shows how the extended decision model predicts that response allocation is a bitonic function of initial-link duration.
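Eqs. (A2)–(A4) can be combined into a short Python sketch of the extended model's prediction. Several details are assumptions made only for illustration and are not specified in the text: logarithms are taken to base 10, and the criterion is computed as the unweighted mean of the log initial-link duration and the two log terminal-link delays. The function names are hypothetical. Under these assumptions the predicted function is bitonic, but the initial-link duration at which preference peaks need not match Fig. 1 or Fig. A1.

```python
import math
from statistics import NormalDist

def p_short(delay, log_criterion, sigma):
    """Eq. (A4): probability that a delay is judged 'short'."""
    return 1.0 - NormalDist(log_criterion, sigma).cdf(math.log10(delay))

def log_preference(initial_link, short=10.0, long=20.0,
                   sigma=0.2, max_rs=1.0, min_rs=0.01):
    """Predicted log response ratio in favor of the shorter terminal link."""
    # ASSUMPTION: criterion = unweighted mean of the three log10 intervals
    log_criterion = (math.log10(initial_link)
                     + math.log10(short) + math.log10(long)) / 3
    strengths = []
    for delay in (short, long):
        ps = p_short(delay, log_criterion, sigma)
        strengths.append(ps * max_rs + (1 - ps) * min_rs)  # Eq. (A2)
    return math.log10(strengths[0] / strengths[1])         # log of Eq. (A3)

# Illustrative initial-link durations (s): very short, intermediate, long
pref_very_short = log_preference(0.1)
pref_intermediate = log_preference(1.0)
pref_long = log_preference(30.0)
```

In this sketch preference for the FI 10 s alternative is weaker at both extremes than at the intermediate duration, which is the qualitative bitonic pattern the appendix describes.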
Displayed are the probabilities that reinforcer delays associated with the FI 10 s and FI 20 s terminal links are judged ‘short’ ($p_s$) as a function of initial-link duration. For this analysis, MaxRS and MinRS were set equal to 1 and 0.01, respectively, and σ = 0.2. However, it is important to note that the bitonic function in Fig. A1 is obtained for a wide range of parameter values. As the initial-link duration decreases from 30 s, the criterion decreases, and hence $p_s$ for both terminal links decreases. The predicted response allocation for the FI 10 s alternative becomes more extreme – the well-known ‘initial-link effect’ – because $p_s$ decreases relatively more rapidly for the FI 20 s alternative. However, when the initial-link duration is short (∼10 s), $p_s$ decreases less rapidly for the FI 20 s alternative, producing a downturn in the predicted log response ratio because $p_s$ is now decreasing more rapidly for the FI 10 s alternative. As a result, the extended decision model predicts that response allocation is a bitonic function of initial-link duration.

References
Berg, M.R., Grace, R.C., 2006. Initial-link duration and acquisition of preference in concurrent chains. Learn. Behav. 34, 50–60.
Fantino, E., 1969. Choice and rate of reinforcement. J. Exp. Anal. Behav. 12, 723–730.
Fantino, E., Davison, M., 1983. Choice: some quantitative relations. J. Exp. Anal. Behav. 40, 1–13.
Fantino, E., Romanowich, P., 2007. The effect of conditioned reinforcement rate on choice: a review. J. Exp. Anal. Behav. 87, 409–421.
Fleshler, M., Hoffman, H.S., 1962. A progression for generating variable-interval schedules. J. Exp. Anal. Behav. 5, 529–530.
Gallistel, C.R., Gibbon, J., 2000. Time, rate, and conditioning. Psychol. Rev. 107, 289–344.
Gibbon, J., Church, R.M., Fairhurst, S., Kacelnik, A., 1988. Scalar expectancy theory and choice between delayed rewards. Psychol. Rev. 95, 102–114.
Grace, R.C., 1994. A contextual model of concurrent chains choice. J. Exp. Anal. Behav. 61, 113–129.
Grace, R.C., 2002. Acquisition of preference in concurrent chains: comparing linear-operator and memory-representational models. J. Exp. Psychol.: Anim. Behav. Process. 28, 257–276.
Grace, R.C., Bragason, O., McLean, A.P., 2003. Rapid acquisition of preference in concurrent chains. J. Exp. Anal. Behav. 80, 235–252.
Grace, R.C., McLean, A.P., 2006. Rapid acquisition in concurrent chains: evidence for a decision model. J. Exp. Anal. Behav. 85, 181–202.
Hunter, I., Davison, M., 1985. Determination of a behavioral transfer function: white-noise analysis of session-to-session response-ratio dynamics on concurrent VI schedules. J. Exp. Anal. Behav. 43, 43–59.
Kyonka, E.G.E., Grace, R.C., 2007. Rapid acquisition of choice and timing in pigeons. J. Exp. Psychol.: Anim. Behav. Process. 33, 392–408.
Kyonka, E.G.E., Grace, R.C., 2008. Rapid acquisition of preference in concurrent chains when alternatives differ on multiple dimensions of reinforcement. J. Exp. Anal. Behav. 89, 49–69.
Mazur, J.E., 2001. Hyperbolic value addition and general models of animal choice. Psychol. Rev. 108, 96–112.
Squires, N., Fantino, E., 1971. A model for choice in simple concurrent and concurrent chains schedules. J. Exp. Anal. Behav. 15, 27–38.