JOURNAL OF MATHEMATICAL PSYCHOLOGY 8, 109-138 (1971)

Scalar Timing and Semi-Markov Chains in Free-Operant Avoidance¹

JOHN GIBBON

New York State Psychiatric Institute, New York, New York 10032 and Columbia University, New York, New York 10027
Scalar timing is proposed as the basic latency mechanism underlying asymptotic free-operant avoidance performance. Timing in free-operant schedules results in a semi-Markov chain, in which transition times may depend on the state to be entered as well as the state occupied. Results for finite chains on asymptotic state occupancy probabilities are summarized, and an explicit solution for the mean first passage time matrix is derived. Applications of these results using the scalar property provide a first order description of mean interresponse and intershock time functions for a variety of cued and uncued free-operant schedules. Occasional deviant performances appear to result from the standard scalar timing mechanism with infrequent random breakdowns.
Two basic issues in theoretical accounts of avoidance behavior are acquisition of responding and parametric control of maintained responding. This paper proposes a quantitative account of parametric control. The data subsumed by the theory are taken from free-operant avoidance experiments, since operant techniques are the ones which have been used to explore parametric manipulation at asymptote. The paper is presented in three sections. First, an extremely simple assumption about timing behavior is proposed and evaluated in the light of the relevant data in the literature. The operation of timing in free-operant avoidance schedules results in a stochastic process which is a generalization of Markov chains, and these chains are analyzed in the next section. The analysis is not restricted to the application at hand, and is of general theoretical interest. One of the fruits of this analysis is to exhibit an intimate relationship between response rate and response probability, measures which are sometimes incorrectly assumed to covary in the substantive literature. Finally, applications of the theory to uncued and cued avoidance are examined.

¹ Reported in part in a paper entitled “An elastic stochastic timer and detection in free-operant avoidance” read at the Mathematical Psychology Meetings, Ann Arbor, 1969. The research was supported by NIH grant GRS 5-SO-I-FRO 5650 and MH-07279. Thanks are due to A. Clymer, G. Neffinger, J. P. Towey, and J. Horn for assistance in running experiments and analyzing data.
I. TIMING

The basic timing proposal is that (1) at asymptote each avoidance response reflects a variable, conservative, idiosyncratic estimate of the time when the next shock is due, and (2) time estimates are scale transforms of a single stochastic process.

Temporal discriminations and their role in free-operant avoidance are extensively documented (Anger, 1963; Sidman, 1966). It is not clear, at present, how such discriminations are built up. Presumably, the asymptotic temporal discrimination reflects a balance between greater reinforcement for responding close to shock-due time and punishment for failing to respond in time. The way in which temporal discriminations are acquired, and idiosyncratic differences in their asymptotic form, are not critical to the present analysis.

Avoidance responses are thought of as “safe” (i.e., conservative) “estimates” of the time when the next shock is programmed if a response does not intervene. These estimates may be “accurate” in the sense that responding tends to be concentrated close to shock-due time; or they may be quite “inaccurate” in the sense that response times may be highly variable. For example, responding might be uniformly distributed over the pre-shock interval. Either sort of timing may still conform to the second part of the timing proposal, the scalar property.

The scalar property states that estimates are produced by a single stochastic process which is simply scale transformed for different values of shock-due time. In Fig. 1 a density corresponding to a unit timer has been sketched, and below it the timer corresponding to estimates of two units. Shocks occur in the unit case when estimates exceed 1, that is, with probability $q = 1 - F(1)$.

FIG. 1. Probability densities corresponding to estimates of 1 (upper panel), and 2 (lower panel) time units. Shock probabilities (shaded areas) are equal.

When shocks are due at double this time, the transformed timer simply “stretches” the axis so that shock probability remains the same. In general,

$$F_r(rt) = F(t), \tag{1a}$$

and

$$f_r(rt) = \frac{1}{r}\,f(t), \tag{1b}$$
where $F_r(t)$, $F(t)$, and $f_r(t)$, $f(t)$ are the distribution functions and densities corresponding to estimates of $r$ and unity respectively.² Equations (1) represent an extremely simple and very strong Weber-like³ assumption about timing which is immediately testable. In the standard uncued free-operant avoidance situation shocks are programmed at a constant time from a response and a (possibly different) constant time from a shock, and timing is assumed to begin with responses and shocks. In the classical and free-operant cued avoidance situation, timing is assumed to begin with the onset of the warning stimulus. In both cued and uncued avoidance, the tails of the timing distributions are of course not observable unless shocks are omitted, and data are presented in the form of latency distributions,

$$L_r(rt) = \frac{F_r(rt)}{F_r(r)} = \begin{cases} \dfrac{F(t)}{1 - q}, & 0 < t < 1, \\[1ex] 1, & t \ge 1, \end{cases} \tag{2a}$$

and

$$l_r(rt) = \begin{cases} \dfrac{f(t)}{r(1 - q)}, & 0 < t < 1, \\[1ex] 0, & t \ge 1, \end{cases} \tag{2b}$$
which are simply timing distributions truncated at shock delivery. Order statistics, means, and standard deviations are then proportional to the corresponding parameters of the truncated unit timer. Letting $t_c(r)$ be the number for which $L_r(t_c(r)) = c$,

$$t_c(r) = r\,t_c(1), \tag{3a}$$

$$\mu(r) = r\mu, \tag{3b}$$

$$\sigma(r) = r\sigma, \tag{3c}$$
where $\mu$ and $\sigma$ are the mean and standard deviation of the unit latency distribution.

² Lebesgue measure is assumed throughout this paper. All distributions are defined over the positive half line, with finite means and variances, and continuous densities, except for the degenerate distribution function, $U_a(t)$, with unit jump at $t = a$.

³ See, e.g., Norman (1966, Eq. 6).

Fig. 2 presents data from several different experiments. The Kamin study is the earliest parametric data in classical cued avoidance. Data points are averages of median latencies of dogs in a shuttle task situation. The Low and Low, and Anderson data are mean latencies of rats on criterion trials in a shuttle task, and the Hyman data are mean latencies in the warning signal, averaged over three monkeys,⁴ on a free-operant cued avoidance schedule. In the Hyman study each subject ran at all warning signal durations, while in the other studies data points are group averages with each group run at a single value. The straight lines were drawn by eye and appear to be a reasonable description of the data. Intercepts above the origin may represent motor time in executing the response.

FIG. 2. Latency of avoidance responding as a function of warning signal duration (sec). See text for explanation of linear functions.

⁴ One other subject in Hyman’s study showed strong order effects and has been omitted from the average.

A much more stringent test of the scalar timing assumption is provided by an examination of the form of latency distributions obtained at different shock-due values. Fig. 3 presents interresponse time (IRT) distributions and shock probabilities in an uncued free-operant avoidance task obtained from two rats studied in this laboratory, and from the only other study in the literature (Verhave, 1959) reporting IRT distributions for a single subject at several different response-shock values ($r$). For the data from this laboratory, the shock-shock clock, which programs the time between shocks when no responses are made, was held constant at 5 seconds ($s = 5$ sec), and values of $r$ were run for a minimum of 60 hours each. Data shown are for the last 20 hours at each $r$ value. The bin sizes are proportional to $r$, and so scalar timing should superimpose relative frequencies. Agreement is reasonably good for the data from this laboratory (with the possible exception of $r = 25$, particularly for rat 2-R), and shock probabilities ($q$) are very close. There is more variability in the Verhave data; however, these data will be seen later to be appropriately scalar in the mean.
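The proportionality stated in Eqs. 3 can be checked numerically. The following is a minimal sketch under assumptions that are not in the text: the unit timer is taken to be gamma distributed (shape 4, mean 0.8), and the function names `truncated_latencies` and `summary` are hypothetical. Estimates of $r$ are obtained by scaling unit estimates, and only estimates that precede shock-due time are kept as observed latencies.

```python
import random
import statistics

def truncated_latencies(r, n=20000, seed=1):
    """Sample estimates of shock-due time r; keep those that precede the shock.

    Assumption: the unit timer is gamma(shape=4, scale=0.2); the source
    does not commit to any particular form.
    """
    rng = random.Random(seed)
    unit = [rng.gammavariate(4.0, 0.2) for _ in range(n)]
    estimates = [r * x for x in unit]             # scale transform (Eq. 1a)
    latencies = [t for t in estimates if t < r]   # truncation at shock (Eq. 2)
    q = 1.0 - len(latencies) / n                  # proportion of trials ending in shock
    return latencies, q

def summary(r):
    """Shock probability plus mean, SD, and median of the observed latencies."""
    lat, q = truncated_latencies(r)
    return {
        "q": q,
        "mean": statistics.fmean(lat),
        "sd": statistics.pstdev(lat),
        "median": statistics.median(lat),
    }

for r in (1.0, 10.0, 20.0, 40.0):
    s = summary(r)
    # Dividing by r recovers the unit-timer values at every r (Eqs. 3a-3c).
    print(r, round(s["q"], 4), s["mean"] / r, s["sd"] / r, s["median"] / r)
```

Because the same seed is reused, every line of output describes the same underlying unit sample, so the normalized mean, standard deviation, and median agree across values of $r$ up to rounding, and the shock proportion does not change.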
FIG. 3. Relative frequency distributions of IRT’s recorded in fifths (left two panels) and tenths (Verhave) of shock-due time ($r$). Panels: rat 4-O, rat 2-R, and Verhave (1959), rat 1; abscissa: proportion of $r$. Open points at 1.0 for rats 4-O and 2-R are shock probabilities in the $r$ clock. The number of responses per distribution for rats 4-O and 2-R exceeds 4000.
FIG. 4. Conditional probability distributions, $P(R \mid t) = (\text{frequency of R's in bin } n)/(\text{frequency of R's in bins } k \ge n)$. Ordinate values are normalized for shock-due time (as in Fig. 3). Panels: the two rats from this laboratory ($r$ = 25, 15, 10; $s$ = 5) and Anger (1963), rat 1 ($r = 15$, $s = 10$; R-R, S-R, S-S-R, and S-S-S-R functions); abscissa: proportion of $r$ or $s$.
Under the scalar assumption, shock-response latencies should have the same functional form as response-response times. Data are sparse on this point, but the available evidence indicates that timing is not identical in the two clocks. Most of the difference appears to lie in a high frequency of short post-shock latencies, so that when distributions are plotted in conditional density form (IRT per opportunity) much of the difference disappears. In Fig. 4, conditional density plots are presented for the two rats described earlier, along with data reported by Anger (1963) for a response-shock value of 15 seconds and a shock-shock value of 10 seconds. Post-shock functions for this rat are presented separately for responding after 1, 2, and 3 shocks in a row. The functions for the two clocks are roughly similar when plotted in this form, particularly as shock-due time is approached. For the purposes of the later analysis it is not critical that timing be identical in both clocks, but it is important that it be scalar over changes of clock settings for each clock separately. There does not appear to be any evidence in the literature on whether changes in $s$ produce scalar shock-response latencies, and it will be assumed that the IRT pattern is typical. The relative invariance of functional form over successive shocks indicates stationarity for the process in this clock. There is evidence that stationarity is not obtained for successive responses in the $r$ clock (Wertheim, 1965), and so the scalar timer for IRT’s must be viewed as a pooled distribution form. One way in which a mixture with the scalar property might arise will be considered later. The evidence cited seems sufficiently strong to warrant adopting the scalar timing assumption as the basic latency mechanism underlying asymptotic avoidance performance. However, nearly all of the data in free-operant avoidance experiments are reported in rate, rather than distribution form.
Prediction of response and shock rates requires the analysis of semi-Markov chains presented next.
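The conditional (IRT per opportunity) form used for Fig. 4 can be computed directly from binned frequencies. The sketch below is illustrative: the function name `irts_per_opportunity` and the counts are hypothetical, and it assumes the denominator for bin $n$ includes bin $n$ itself together with all later bins.

```python
def irts_per_opportunity(counts):
    """Conditional probability of a response in each bin, given that the
    trial has lasted at least until the start of that bin.

    counts: frequencies per bin in order of increasing time. The last entry
    may hold the trials that ran to shock-due time.
    """
    remaining = sum(counts)          # trials still in progress at bin onset
    result = []
    for c in counts:
        result.append(c / remaining if remaining > 0 else float("nan"))
        remaining -= c               # these trials have ended
    return result

# Hypothetical frequencies in fifths of r, with shocks counted in the last bin.
counts = [50, 100, 250, 400, 150, 50]
for n, p in enumerate(irts_per_opportunity(counts), start=1):
    print(n, round(p, 3))
```

Plotting these values against the proportion of shock-due time gives the kind of function that is compared across the two clocks.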
II. SEMI-MARKOV CHAINS
Semi-Markov chains are generalizations of Markov chains introduced independently by Smith and Levy in 1954. Applications occur in such diverse fields as counter theory (Pyke, 1958), business management (Howard, 1963), and ethology (Cane, 1959). History, credits, and several new results for chains with a finite number of states are presented in two review papers by Pyke (1961a, b).

Two-State Chain

By way of introduction to general N-state chains the simple 2-state chain for uncued free-operant avoidance is presented first. Operation of the response and shock clocks are the states in the chain, and transitions occur with fixed probabilities and fixed or variable times depending on the transition made. The chain is diagrammed in Fig. 5. In the response-shock clock (state R), subjects recycle the clock with probability $1 - q_R$ at variable times ($t_{RR} < r$). Subjects get shocked when they allow $r$ units of time to elapse without a response ($t_{RS} = r$), and this event occurs with probability $q_R$. The shock-shock clock (state S) recycles with probability $q_S$ every $s$ units of time ($t_{SS} = s$). When a response occurs in the S clock ($t_{SR} < s$), a transition is made back to the R state, and this event occurs with probability $1 - q_S$. The embedded 2-state chain has constant transition probabilities, but the transition times depend on the state to be entered as well as the state occupied, and the system is not, in general, Markov in time.⁵

FIG. 5. State diagram for the semi-Markov chain corresponding to the standard uncued free-operant avoidance schedule.

Procedurally, the system defined by Fig. 5 is a discrete trials procedure with no intertrial interval, in which trials begin and end with responses or shocks, and have variable durations. In the context of Fig. 5, response and shock rates are the reciprocals of the mean recurrence time for each state. In Markov chain terminology, the mean (unconditional) interresponse and intershock times are given by the main diagonal of the mean first passage time matrix. In this example, mean first passages are obtainable directly from the tree measure using the latency distributions (Eqs. 2) for the variable transitions and taking expectations over latencies and tree paths. However, the tree measure becomes cumbersome for chains with more than two states, and more general methods exist which exhibit important relationships between the probability of state occupancy and first passages.

In the next section the general chain is analyzed. The development follows that of Pyke (1961b) and Howard (1963). The major relationships are summarized and the formulation for mean first passages, given by Pyke in limit form, is made explicit.
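The two-state chain of Fig. 5 is simple enough to simulate, which gives a direct check on the statement that response and shock rates are reciprocals of the mean recurrence times. The sketch below rests on an assumption that is not in the text: the unit timer is uniform on (0, 1.25) in both clocks, so that $q_R = q_S = 0.2$ and the truncated unit mean is 0.5. The name `simulate_two_state` is hypothetical.

```python
import random

def simulate_two_state(r, s, n_transitions=200_000, width=1.25, seed=7):
    """Monte Carlo run of the R/S chain.

    Returns (mean interresponse time, mean intershock time), each estimated
    as total elapsed time divided by the number of entries into that state.
    Assumption: unit estimates are uniform on (0, width) in both clocks.
    """
    rng = random.Random(seed)
    state = "R"                       # start in the response-shock clock
    elapsed = 0.0
    entries = {"R": 0, "S": 0}        # counts of responses and shocks
    for _ in range(n_transitions):
        due = r if state == "R" else s             # shock-due time in this clock
        estimate = due * rng.uniform(0.0, width)   # scale-transformed unit estimate
        if estimate < due:
            elapsed += estimate                    # response precedes the shock
            state = "R"
        else:
            elapsed += due                         # shock delivered when due
            state = "S"
        entries[state] += 1
    return elapsed / entries["R"], elapsed / entries["S"]

m_rr, m_ss = simulate_two_state(r=20.0, s=5.0)
print(f"mean interresponse time ~ {m_rr:.2f}, mean intershock time ~ {m_ss:.2f}")
```

With these settings the estimates should fall close to 12.75 and 51, the values that the closed forms for the two-state chain (Eqs. 27) give for $r = 20$, $s = 5$, $q = 0.2$, and a truncated unit mean of 0.5.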
N-State Chains

A semi-Markov chain is uniquely determined by the matrix of transition distributions, $Q(t) = \{Q_{ij}(t)\}$, defined by

$$Q_{ij}(t) = c_{ij}L_{ij}(t), \qquad i, j = 1, 2, \ldots, N, \tag{4}$$

where $i$, $j$ are states in the system, $L_{ij}(t)$ is the distribution function governing the transition $i \to j$, and $c = \{c_{ij}\}$ is the matrix of transition probabilities for the embedded discrete Markov chain. For the 2-state chain in Fig. 5, $L_{RR}$ and $L_{SR}$ are truncated timing distributions, and $L_{RS}$ and $L_{SS}$ are step functions with unit jump at $r$ and $s$ respectively. The definition given by Eq. 4 means that, given that state $i$ was entered at time $x = 0$, $Q_{ij}(t)$ is the probability that the next transition is to state $j$ at time $x < t$. The unconditional “wait” in state $i$ then has distribution function,

$$H_i(t) = \sum_{j=1}^{N} Q_{ij}(t).$$

The probability of a transition from state $i$ (possibly back to state $i$) before $t$ is the $i$th row sum of $Q(t)$.

The asymptotic properties of the system are obtained from two closely related functions, $P_{ij}(t)$ and $G_{ij}(t)$. $P_{ij}(t)$ is the probability that state $j$ is occupied at time $t$, given that state $i$ was entered at time zero, and $G_{ij}(t)$ is the distribution function for first passage times. That is, $G_{ij}(t)$ is the probability that the system, initially in state $i$, enters state $j$ at least once before $t$.⁶ A basic recursive relation on $P_{ij}(t)$ is

$$P_{ij}(t) = \delta_{ij}[1 - H_i(t)] + \sum_{k=1}^{N}\int_0^t q_{ik}(x)\,P_{kj}(t - x)\,dx, \tag{5}$$

where $\delta_{ij} = 1, 0$ for $i = j$ and $i \ne j$ respectively, and $q_{ik}(t)$ is the density of $Q_{ik}(t)$. Equation 5 may be justified on probabilistic grounds. When $i \ne j$, at least one transition must occur for the system to end up in state $j$. If the first transition is to state $k$ (possibly $= j$) at time $x < t$, then in the remaining time, $t - x$, the system must end up in state $j$. The convolution integral gives the probability of this sequence of events for all $x < t$.⁷ When $i = j$ the additional possibility exists of not leaving state $j$ before $t$.

⁵ A chain which is Markov in time has constant (instantaneous) conditional transition probabilities which depend at most on the state occupied. This results in exponentially distributed transition times (see, e.g., Feller, 1966, p. 316).

⁶ Formally, these functions are defined by the stochastic processes, $\{X_t\}$, $\{Y_j(t)\}$, where $X_t$ records the state occupied at time $t$ and $Y_j(t)$ is the counting process which cumulates entrances into state $j$. Then with the convention that the time origin is restricted to transition instants, $P_{ij}(t) = P(X_t = j \mid X_0 = i)$ and $G_{ij}(t) = P(Y_j(t) > 0 \mid X_0 = i)$.

⁷ It should be noted that for constant transition times, $T$, from $i$ to $k$ (e.g., R → S in Fig. 5), $q_{ik}(t) = Q'_{ik}(t)$ is degenerate and the convolution integral in (5) becomes $c_{ik}P_{kj}(t - T)$ for $t > T$, and zero otherwise.

A similar recursive argument provides a relationship between $P_{ij}(t)$ and $G_{ij}(t)$,

$$P_{ij}(t) = \delta_{ij}[1 - H_j(t)] + \int_0^t g_{ij}(x)\,P_{jj}(t - x)\,dx. \tag{6}$$
The probability of being in $j$ at time $t$ is constructed from a first passage to $j$ before $t$, and the probability of ending up in $j$ in the remaining time.

The relations (5) and (6) are not easy to solve in their present form, but are quite tractable with Laplace transform methods. The following notation for Laplace transforms will be used, preserving the distinction between upper and lower case letters for distribution functions and densities: $v^*(\lambda) = \int_0^\infty e^{-\lambda t}v(t)\,dt$, $\lambda > 0$. The convolution properties of Laplace transforms (see, e.g., Feller, 1966, pp. 407-419) are, in this notation, for $U$, $V$, $W$ distribution functions (possibly defective, as, for instance, $Q_{ij}(\infty) = c_{ij}$): if

$$U(t) = \int_0^t v(x)\,W(t - x)\,dx,$$

then

$$U^*(\lambda) = v^*(\lambda)\,W^*(\lambda) = \frac{u^*(\lambda)}{\lambda}.$$

Note also that for $\bar U(t) = 1 - U(t)$, $\bar U^*(\lambda) = (1/\lambda)[1 - u^*(\lambda)]$ generates the moments of $u(t)$ in “tails” form (Feller, 1966, p. 412; McGill, 1963, p. 353). Taking Laplace transforms of (5) and (6) we have

$$P^*_{ij}(\lambda) = \frac{1}{\lambda}[1 - h_i^*(\lambda)]\,\delta_{ij} + \sum_{k=1}^{N} q^*_{ik}(\lambda)\,P^*_{kj}(\lambda), \tag{5a}$$

and

$$P^*_{ij}(\lambda) = \frac{1}{\lambda}[1 - h_j^*(\lambda)]\,\delta_{ij} + g^*_{ij}(\lambda)\,P^*_{jj}(\lambda). \tag{6a}$$

Multiplying through by $\lambda$, (5a) and (6a) take on the matrix forms,

$$p^*(\lambda) = I - h^*(\lambda) + q^*(\lambda)\,p^*(\lambda), \tag{7}$$

and

$$p^*(\lambda) = I - h^*(\lambda) + g^*(\lambda)\,p_d^*(\lambda), \tag{8}$$

where $I$ is the identity matrix, and $h^*(\lambda)$ and $p_d^*(\lambda)$ are the diagonal matrices, $h^*(\lambda) = \{\delta_{ij}h_i^*(\lambda)\}$, $p_d^*(\lambda) = \{\delta_{ij}p^*_{ij}(\lambda)\}$. The solution of (7) is evidently

$$p^*(\lambda) = [I - q^*(\lambda)]^{-1}[I - h^*(\lambda)], \tag{9}$$

and solving (7) and (8) for $g^*(\lambda)$ gives

$$g^*(\lambda) = q^*(\lambda)\,p^*(\lambda)\,[p_d^*(\lambda)]^{-1}, \tag{10}$$

when the indicated inverses exist. But if $A$, $B$ are matrices with nonzero diagonals, $A_d, B_d > 0$, then $[(AB_d)_d]^{-1} = B_d^{-1}A_d^{-1}$. Since $I - h^*(\lambda)$ is diagonal, substitution of (9) into (10) gives

$$g^*(\lambda) = q^*(\lambda)\,[I - q^*(\lambda)]^{-1}\,\{[I - q^*(\lambda)]^{-1}_d\}^{-1}. \tag{11}$$
Thus (9), (10), (11) are satisfied if $I - q^*(\lambda)$ has an inverse. This fact is easily established by a discrete Markov chain construction. Note that $Q_{ij}(\infty) = c_{ij}$, so that, for fixed $\lambda > 0$, $q^*_{ij}(\lambda) < c_{ij}$, for $c_{ij} > 0$. Thus the row sums of $q^*(\lambda)$ are less than one. $q^*(\lambda)$ therefore qualifies as a transition submatrix of transient states in an absorbing chain, and the inverse $[I - q^*(\lambda)]^{-1}$ is then guaranteed as the fundamental matrix of the absorbing chain (Kemeny and Snell, 1960, p. 45; see also Feller, 1964, Eq. 3.5 for this solution in a countably infinite state space).

Equation 9 is Theorem 4.1 and (10) and (11) are Theorem 4.2 of Pyke (1961b). They express the basic relationships between first passages and occupancy probabilities. From (10) one would expect an intimate relation between the moments of $G(t)$ and the limiting behavior of $P(t)$. This turns out to be true, but the results are not immediate, as $p^*(0)$ is not directly obtainable from (9) in its present form. For the semi-Markov chains considered here, the embedded discrete chains are regular (see, e.g., Kemeny and Snell, 1960, Chap. IV), and so have a limiting probability vector, $\alpha = \{\alpha_j\}$. Howard (1963) has shown that for these chains $P_{ij}(\infty)$ is expressible in terms of $\alpha$. His development will be summarized and then used to obtain an explicit solution for mean first passage times.

Asymptotic State Probabilities
The basic Tauberian theorem (Feller, 1966, pp. 418-420) gives $P(\infty) = p^*(0)$. Now the mean unconditional “wait” in each state is given by

$$\bar\mu = \lim_{\lambda \to 0}\frac{1}{\lambda}[I - h^*(\lambda)], \tag{12}$$

where $\bar\mu$ is the diagonal matrix with entries

$$\bar\mu_i = \sum_{j=1}^{N} c_{ij}\mu_{ij},$$

and $\mu_{ij}$ is the mean of $L_{ij}(t)$. Thus, recasting the limit of (9) as $\lambda \to 0$, we have,

$$P(\infty) = \left[\lim_{\lambda \to 0}\lambda[I - q^*(\lambda)]^{-1}\right]\bar\mu, \tag{13}$$

and the problem is reduced to obtaining the indicated limit. Let

$$a(\lambda) = \lambda[I - q^*(\lambda)]^{-1}.$$

Multiplying on the right by $I - q^*(\lambda)$,

$$a(\lambda) = \lambda I + a(\lambda)\,q^*(\lambda). \tag{14}$$

Since $q^*(0) = Q(\infty) = c$, the limit in (13) must be a solution of

$$a(0) = a(0)\,c. \tag{15}$$
Since $\alpha$ is the unique probability vector for which $\alpha = \alpha c$, $a(0)$ must have rows proportional to $\alpha$. But each row of $P(\infty)$ is a probability distribution, so

$$k_i\sum_{j=1}^{N}\alpha_j\bar\mu_j = 1,$$

where $k_i$ is the proportionality constant for the $i$th row. Thus these constants are independent of $i$, and $P(\infty)$ has elements,

$$P_{ij}(\infty) = \frac{\alpha_j\bar\mu_j}{\sum_{k=1}^{N}\alpha_k\bar\mu_k}, \tag{16}$$

independent of $i$. This result is given by Howard (1963, Eq. 29) and developed in essentially the same way for discrete semi-Markov chains with denumerable state space by Anselone (1960, Theorems 2 and 4). Equation 16 has an important intuitive interpretation. It is a generalization of the similar result for the embedded chain, and states that the long run proportion of time spent in each state is independent of where the system starts. Equivalently, the asymptotic probability of observing the system in any state is the asymptotic probability for the discrete chain, weighted by the expected sojourn time in that state, and then normalized to sum to one.

Mean First Passage Times

The mean recurrence times may now be obtained in terms of the limiting probabilities. From the diagonal entries of Eq. 8, we have immediately

$$I - g_d^*(\lambda) = [I - h^*(\lambda)][p_d^*(\lambda)]^{-1}. \tag{17}$$

Dividing by $\lambda$ and taking limits as $\lambda \to 0$,

$$m_d = \bar\mu\,\pi_d^{-1} = b\,\alpha_d^{-1}, \tag{18}$$

where $m_d$ is the diagonal of the mean first passage time matrix, $m$; $\pi_d$ is the diagonal matrix of limiting state probabilities, $\pi_j = P_{ij}(\infty)$; $b$ is the (inverse) norming constant, $b = \sum_{j=1}^{N}\alpha_j\bar\mu_j$; and $\alpha$ is written in diagonal form, $\alpha_d = \{\delta_{ij}\alpha_j\}$. Equation 18 is an explicit form of a result due to Smith (1955, Theorem 5; see also Pyke, 1961b, Corollary 7.1). The mean recurrence time for any state evidently depends only on the weighted sum of sojourn times, $b$, and on the limiting state probability, not directly on the mean wait in that state. This is a result of the symmetry of recurrences as opposed to first passages, and it will be seen that (18) is a consequence of a more general recursive relation on $m$.
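Equations 16 and 18 translate into a few lines of arithmetic once the embedded transition matrix and the mean transition times are known. The sketch below is illustrative only: the function names are hypothetical, the limiting vector of the embedded chain is found by simple power iteration (one of several possible methods), and the numerical values for the two-state chain of Fig. 5 are invented.

```python
def stationary(c, iters=2000):
    """Limiting vector alpha of a regular embedded chain (alpha = alpha c),
    approximated by repeated multiplication from a uniform start."""
    n = len(c)
    alpha = [1.0 / n] * n
    for _ in range(iters):
        alpha = [sum(alpha[i] * c[i][j] for i in range(n)) for j in range(n)]
    return alpha

def occupancy_and_recurrence(c, mu):
    """Return (pi, m_d).

    c[i][j]  : embedded transition probabilities
    mu[i][j] : mean transition time for i -> j
    pi       : limiting state probabilities (Eq. 16)
    m_d      : mean recurrence times (Eq. 18)
    """
    n = len(c)
    alpha = stationary(c)
    mu_bar = [sum(c[i][j] * mu[i][j] for j in range(n)) for i in range(n)]  # mean waits
    b = sum(alpha[j] * mu_bar[j] for j in range(n))                         # norming constant
    pi = [alpha[j] * mu_bar[j] / b for j in range(n)]
    m_d = [b / alpha[j] for j in range(n)]
    return pi, m_d

# Hypothetical two-state avoidance chain: rows and columns ordered R, S.
r, s, q, mu_unit = 20.0, 5.0, 0.2, 0.5
c = [[1 - q, q], [1 - q, q]]
mu = [[r * mu_unit, r], [s * mu_unit, s]]
pi, m_d = occupancy_and_recurrence(c, mu)
print("state probabilities:", pi)
print("mean recurrence times:", m_d)
```

Note that `m_d[j]` equals `mu_bar[j] / pi[j]`, which is the first equality in Eq. 18; the code uses the second form, $b/\alpha_j$, to make the independence from the mean wait in state $j$ visible.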
It is straightforward, but tedious, to show that the off-diagonal elements of $g^*(\lambda)$ are given by

$$I + g^*(\lambda) - g_d^*(\lambda) = [I - q^*(\lambda)]^{-1}[I - g_d^*(\lambda)]. \tag{19}$$

The deduction follows from substitution of (7), (9), and (17) into the right side of (10). Subtracting (19) from the unit matrix, $\mathbf{1}$, with all elements unity, and adding $\lambda I$, we have,

$$\mathbf{1} - I + \lambda I - [g^*(\lambda) - g_d^*(\lambda)] = \mathbf{1} + \lambda I - [I - q^*(\lambda)]^{-1}[I - g_d^*(\lambda)]. \tag{20}$$

The left side matrix is now in a form in which division by $\lambda$ and passage to the limit gives $I + m - m_d$. However, the matrix on the right is still indeterminate in the limit. Note, however, that the row sums of $q^*(\lambda)$ are just $h^*(\lambda)$, so that $q^*(\lambda)\mathbf{1} = h^*(\lambda)\mathbf{1}$. Using this fact, premultiplication of (20) by $I - q^*(\lambda)$ gives

$$[I - q^*(\lambda)]\,d(\lambda) = [I - h^*(\lambda)]\mathbf{1} + \lambda[I - q^*(\lambda)] - [I - g_d^*(\lambda)], \tag{21}$$

where $d(\lambda)$ is the matrix on the left of (20). Now we may divide by $\lambda$ and take limits,

$$[I - c][I + (m - m_d)] = \bar\mu\mathbf{1} + I - c - m_d, \tag{22}$$

whence

$$(I - c)(m - m_d) = \bar\mu\mathbf{1} - m_d, \tag{22a}$$

or

$$m = \bar\mu\mathbf{1} + c(m - m_d). \tag{22b}$$

Equations 22 are the analogues for semi-Markov chains of similar expressions in discrete chain theory not involving $\bar\mu$. It is easily shown that a solution of (22b) is unique (a proof follows directly that in Kemeny and Snell, 1960, p. 79), and the solution for $m_d$ obtained in (18) is an immediate consequence of multiplying (22b) through by the limiting matrix, $\mathbf{1}\alpha$ (the matrix with each row equal to $\alpha$). This results in

$$\mathbf{1}\alpha\,m_d = \mathbf{1}\alpha\,\bar\mu\mathbf{1},$$

and the matrix on the right has all elements $= b$, which proves (18).

Recently Kielson (1969) has obtained a single inverse solution for off-diagonal elements of $m$.⁸ The solution may be written,

$$m = (\mathbf{1}y_d - y + \mathbf{1})\,b\,\alpha_d^{-1}, \tag{23}$$

where $y = z(c - b^{-1}\bar\mu\mathbf{1}\alpha)$, and $z = (I - c + \mathbf{1}\alpha)^{-1}$ is the fundamental matrix for the discrete chain (Kemeny and Snell, 1960, p. 75). We show that (23) satisfies (22b). From (23),

$$m - m_d = (\mathbf{1}y_d - y + \mathbf{1} - I)\,b\,\alpha_d^{-1},$$

since $m_d = b\alpha_d^{-1}$ (18). Multiplying by $c$ on the left and substituting the proposed solution (23),

$$c(m - m_d) = m - [c - (I - c)y]\,b\,\alpha_d^{-1}.$$

Thus (22b) is satisfied if

$$[c - (I - c)y]\,b\,\alpha_d^{-1} = \bar\mu\mathbf{1}. \tag{24}$$

But the fundamental matrix, $z$, has the property that $(I - c)z = I - \mathbf{1}\alpha$ (Kemeny and Snell, 1960, p. 75). The matrix in brackets may therefore be written,

$$c - (I - c)y = c - (I - \mathbf{1}\alpha)(c - b^{-1}\bar\mu\mathbf{1}\alpha).$$

Multiplying out the matrices on the right and rearranging gives

$$c - (I - c)y = (b^{-1}\bar\mu + I - \mathbf{1}\alpha\,b^{-1}\bar\mu)\,\mathbf{1}\alpha. \tag{25}$$

Now substituting (25) into the left side of (24),

$$[c - (I - c)y]\,b\,\alpha_d^{-1} = \bar\mu\mathbf{1} + \mathbf{1}b - \mathbf{1}\alpha\bar\mu\mathbf{1}.$$

But the rightmost matrix has all entries $= b$, and so (24) is satisfied.

⁸ Kielson’s work appeared while this paper was under review. An alternative solution, originally proposed here, is $m^j = (I - c^j)^{-1}\bar\mu^j$, where $m^j$, $\bar\mu^j$ are the column vectors, $(m_{ij})$, $(\bar\mu_i)$, $i \ne j$; and $c^j$ is the minor of $c$ obtained by deleting the $j$th row and $j$th column. For the counterpart of this analysis in discrete chain theory see Kemeny and Snell (1960, pp. 121-123).

The solution (23) holds also when the chain is Markov in time, with $\mu_{ij} = \bar\mu_i$, all $j$. When all transition times are unity, the semi-Markov chain becomes a regular discrete chain, $\bar\mu = I$, $b = 1$, and $y = z - I$. The solution (23) then reduces to $m = (\mathbf{1}z_d - z + I)\,\alpha_d^{-1}$ for the discrete chain (Kemeny and Snell, 1960, p. 79).

This concludes our analysis of semi-Markov chains. The results have been restricted to first order properties. At present, quantities such as the variance of first passage times and serial correlation statistics (which should depend only on the corresponding parameters of $Q(t)$) appear to require a knowledge of $q^*(\lambda)$ for computation from (11). The studies we will consider here, however, have not reported variances and so the above results will suffice for the data at hand.
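The claim that (23) satisfies (22b) can also be checked numerically on a small chain. The sketch below is a plain-Python illustration, not part of the original analysis: the helper names, the 3-state transition matrix, and the mean transition times are all hypothetical, and the limiting vector is obtained by power iteration.

```python
def matmul(a, b):
    """Product of two small dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def inverse(a):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(a)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for k in range(n):
            if k != col:
                f = aug[k][col]
                aug[k] = [v - f * w for v, w in zip(aug[k], aug[col])]
    return [row[n:] for row in aug]

def stationary(c, iters=2000):
    """Limiting vector of a regular embedded chain by power iteration."""
    n = len(c)
    alpha = [1.0 / n] * n
    for _ in range(iters):
        alpha = [sum(alpha[i] * c[i][j] for i in range(n)) for j in range(n)]
    return alpha

def mean_first_passage(c, mu):
    """Mean first passage matrix from Eq. 23; also returns the mean waits."""
    n = len(c)
    alpha = stationary(c)
    mu_bar = [sum(c[i][j] * mu[i][j] for j in range(n)) for i in range(n)]
    b = sum(alpha[j] * mu_bar[j] for j in range(n))
    # z = (I - c + 1 alpha)^-1, the fundamental matrix of the embedded chain
    z = inverse([[float(i == j) - c[i][j] + alpha[j] for j in range(n)] for i in range(n)])
    # y = z (c - mu_bar 1 alpha / b)
    y = matmul(z, [[c[i][j] - mu_bar[i] * alpha[j] / b for j in range(n)] for i in range(n)])
    # Eq. 23 elementwise: m_ij = (y_jj - y_ij + 1) b / alpha_j
    m = [[(y[j][j] - y[i][j] + 1.0) * b / alpha[j] for j in range(n)] for i in range(n)]
    return m, mu_bar

def residual_22b(c, mu_bar, m):
    """Largest absolute error in m = mu_bar 1 + c (m - m_d)."""
    n = len(c)
    return max(abs(m[i][j] - mu_bar[i] - sum(c[i][k] * m[k][j] for k in range(n) if k != j))
               for i in range(n) for j in range(n))

# Hypothetical 3-state chain.
c = [[0.5, 0.3, 0.2], [0.4, 0.1, 0.5], [0.6, 0.3, 0.1]]
mu = [[2.0, 5.0, 1.0], [3.0, 4.0, 6.0], [1.0, 2.0, 8.0]]
m, mu_bar = mean_first_passage(c, mu)
print("residual of (22b):", residual_22b(c, mu_bar, m))
```

Since the solution of (22b) is unique, a residual at the level of rounding error indicates that the matrix returned by `mean_first_passage` is the mean first passage time matrix for the example.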
III. APPLICATIONS
In this section, the results obtained above are applied to parametric data in the avoidance literature. Uncued avoidance schedules are analyzed first, then two cued avoidance schedules are considered. Finally, a qualitative approach to the problem of sequential and temporal dependencies is presented in a discretized time context.

Uncued Avoidance

Parametric manipulations of the clock settings, $r$ and $s$, have been studied by several investigators and are known to produce large changes in response and shock rates (see, e.g., Sidman, 1966). It turns out that much of this change in rate may be understood in terms of the scalar timing property alone, and that no additional assumption about changes in response probability, or “strength,” is required.

Mean first passages for the 2-state chain have a particularly simple form. $\alpha$ is well known to be given by

$$\alpha_j = \frac{c_{ij}}{c_{ij} + c_{ji}}, \qquad i \ne j = 1, 2, \tag{26a}$$

and the solution (23) may then be written

$$m_{ij} = \frac{c_{ii}}{c_{ij}}\,\mu_{ii} + \mu_{ij}, \qquad i \ne j = 1, 2. \tag{26b}$$

Now using (26a) mean recurrence times (18) may be put in a form with an intuitively appealing interpretation. Note that from (26a), $c_{ji}m_{ij} = (\alpha_i/\alpha_j)\,\bar\mu_i$. Thus expanding (18) for this case gives

$$m_{jj} = c_{jj}\mu_{jj} + c_{ji}(\mu_{ji} + m_{ij}), \qquad i \ne j = 1, 2. \tag{26c}$$
Equations 26 have a simple interpretation. The mean time to go from one state to the other is the (geometric) mean number of recyclings in the starting state times the mean recycle time, plus the mean time for the final transition to the other state. The mean recurrence time for either state is the mean recycle time weighted by its probability, plus the mean time to enter the other state and then return, weighted by its probability. Using the scalar property and the notation of Fig. 5, mean recurrence times for the R and S states are

$$m_{RR} = r[1 - (1 - \mu_R)(1 - q_R)] + s[1 - (1 - \mu_S)(1 - q_S)]\,\frac{q_R}{1 - q_S}, \tag{27a}$$

and

$$m_{SS} = \frac{1 - q_S}{q_R}\,m_{RR}, \tag{27b}$$
where $\mu_R$, $\mu_S$ are the (possibly different) means for the unit truncated timers in each clock. When postshock and post response timing distributions are scale transforms of a single unit timer, $q_R = q_S$ and $\mu_R = \mu_S$. Equations 27 mean that $m_{RR}$ and $m_{SS}$ are proportional to $r$ and $s$; they define planes in the positive quadrant of the $r$, $s$, $m_{jj}$ space which pass through the origin. Appropriate limiting behavior for good and poor avoidance in either clock is clear in the following obvious limits:

$$m_{RR} = \begin{cases} r\mu_R & \text{for } q_R = 0,\ q_S < 1, \\ r + s\mu_S & \text{for } q_R = 1,\ q_S = 0, \\ \infty & \text{for } q_R > 0,\ q_S = 1, \end{cases} \tag{28a}$$

$$m_{SS} = \begin{cases} \infty & \text{for } q_R = 0,\ q_S < 1, \\ r + s\mu_S & \text{for } q_R = 1,\ q_S = 0, \\ s & \text{for } q_R > 0,\ q_S = 1. \end{cases} \tag{28b}$$
The top rows of equations 28 represent perfect avoidance in the $r$ clock, in which case shocks never occur. The second rows are the case in which the embedded chain is periodic, and so neither state recycles and $m_{RR} = m_{SS}$. The bottom rows are the case in which the shock state is absorbing.

The range of $r$ and $s$ values over which (27) might be expected to hold is restricted by at least two considerations. First, for very large clock settings, one might expect scalar timing to break down for the same reasons (whatever they may be) that Weber’s Law breaks down. In the studies we will consider, this limit does not appear to have been exceeded. A second and fundamental consideration has been noted and documented by Sidman (1953) in his original investigation. When $r$ is small relative to $s$, shocks delivered by the $r$ clock appear to act as punishers of the preceding behavior, so that in the limit as $r$ is decreased towards 0, responding is eliminated. Equation 27a, of course, predicts the reverse; response rate is inversely related to $r$. This punishment effect requires the introduction of an additional process and will be treated in a later paper.⁹

Studies following Sidman’s original investigation have generally used equal clock values, and varied both clocks simultaneously. The punishment effect does not occur in this situation. When $r = s$, (27) reduces to proportionality in one variable. Since there are then only two constants, we may without loss of generality assume them to be $q$ and $\mu$ for a single unit timer. Equation 27 is then
$$m_{RR} = \frac{ra}{1 - q}, \qquad m_{SS} = \frac{ra}{q}, \tag{29}$$

where $a = 1 - (1 - \mu)(1 - q)$.

⁹ Reported in part in the paper cited in Footnote 1 (m.s. in preparation).
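The closed forms for the two-state chain are easy to evaluate. The sketch below codes Eq. 27 with the limiting cases of Eq. 28 handled explicitly, and then shows the proportionality that results when $r = s$. The function name `mean_recurrence` and all parameter values are hypothetical.

```python
import math

def mean_recurrence(r, s, q_r, q_s, mu_r, mu_s):
    """Return (m_RR, m_SS), the mean interresponse and intershock times of Eq. 27.

    q_r, q_s   : shock probabilities in the R and S clocks
    mu_r, mu_s : means of the truncated unit timers in each clock
    """
    wait_r = r * (1.0 - (1.0 - mu_r) * (1.0 - q_r))   # mean wait in the R clock
    wait_s = s * (1.0 - (1.0 - mu_s) * (1.0 - q_s))   # mean wait in the S clock
    if q_r == 0.0:        # perfect avoidance: shocks never occur (top rows of Eq. 28)
        return wait_r, math.inf
    if q_s == 1.0:        # shock state absorbing (bottom rows of Eq. 28)
        return math.inf, s
    m_rr = wait_r + wait_s * q_r / (1.0 - q_s)        # Eq. 27a
    m_ss = m_rr * (1.0 - q_s) / q_r                   # Eq. 27b
    return m_rr, m_ss

# With r = s and a single unit timer, both means are proportional to r.
for r in (10.0, 20.0, 40.0):
    m_rr, m_ss = mean_recurrence(r, r, 0.2, 0.2, 0.5, 0.5)
    print(r, m_rr / r, m_ss / r)
```

On double log coordinates such functions are straight lines with unit slope, and their vertical separation depends only on $q$, since $m_{SS}/m_{RR} = (1 - q)/q$ when the two clocks share one timer.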
In Fig. 6 mean interresponse and intershock time data from several studies are plotted on double log coordinates. Straight lines with unit slope have been fitted by eye to each set of data and parameter values for each subject extracted via (29). Proportionality is virtually forced for the response data of these subjects, and is a reasonable description of the shock data, which is based on a smaller sample. The
FIG. 6. Mean interresponse times (filled points) and intershock a function of r = S. The data for Y = s = 60 for monkey 518 (square an original and redetermination value.
times (open points) points) are averages
as of
parameter estimation is of course rough, but a qualitative point is worth comment. Low q values (producing wide separations between the response and shock functions) are associated with low p values. A positive correlation of this sort is consistent with our timing proposal, since one might expect a high proportion of overestimations to be associated with a large concentration of mass in the distribution close to shock-due time. In fact, it is tempting to speculate that unit timers for different individuals may themselves be scale transforms of a single density. In this case the correspondence noted would be perfect, with an individual’s tolerance for shock, q, uniquely determining the associated p. Deviations-Scalar
Timing
with Breakdowns
Not all of the subjects in the experiments listed in Fig. 6 show perfect scalar timing. Deviations generally may be characterized as showing more efficient shock avoidance at longer r = s intervals. In terms of the functions in Fig. 6, deviant $m_{SS}$ and $m_{RR}$ curves appear concave upward and downward, respectively. Subjects get fewer shocks and make more responses as r increases than would be expected from scalar timing alone.

A modification of the scalar timer may be devised which handles this deviation. Suppose that on each trial subjects attempt to estimate time in the usual way, but that their scalar timer is subject to occasional failures which are independent of the time since the beginning of the trial. When breakdowns occur, subjects reset the timer with a response and begin a new trial. Anthropomorphically, subjects may be thought of as resetting their timer whenever they become "worried" or "confused" as to how far into the trial they have gone. Breakdowns are then distributed exponentially, and response-response times on any trial are distributed as the scalar timer or its exponential competitor, whichever occurs first. The tails distribution function for the modified timer is simply the probability that neither system has produced a response by time t,

$$1 - V_r(t) = e^{-\beta t}\,[1 - F_r(t)], \tag{30a}$$
where $V_r(t)$ is the distribution function for the modified estimator of r, $1/\beta$ is the mean of the breakdown distribution, and $F_r(t)$ is the distribution function for the scalar estimator of r. Shock probability is then

$$1 - V_r(r) = q e^{-\beta r}, \tag{30b}$$
where $q = 1 - F(1)$ as in Fig. 1. Clearly (30b) results in the desired sort of deviation, since shock probability decreases with increasing r. A direct test of (30b) requires monitoring failure rates in the two clocks, and these data are not generally reported. However, a test may be constructed from (27b), which follows directly from (18) and does not depend on the scalar assumption. If timing is the same in both clocks when r = s, then the breakdown model has $q_R = q_S = q e^{-\beta r}$ and (27b) may be recast as

$$\frac{m_{RR}}{m_{RR} + m_{SS}} = q_R = q e^{-\beta r}. \tag{30c}$$
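The step to (30c) can be spelled out as follows. This is a sketch that assumes, as in the semi-Markov analysis, that mean recurrence times are inversely proportional to the stationary probabilities $\alpha_R$ and $\alpha_S$ of the embedded chain. When $q_R = q_S$, both rows of the embedded transition matrix are $(1 - q_R,\; q_R)$, so that $\alpha_R = 1 - q_R$ and $\alpha_S = q_R$.

```latex
\begin{align*}
\frac{m_{RR}}{m_{SS}} &= \frac{\alpha_S}{\alpha_R} = \frac{q_R}{1 - q_R}, \\[4pt]
\frac{m_{RR}}{m_{RR} + m_{SS}} &= \frac{q_R}{q_R + (1 - q_R)} = q_R = q e^{-\beta r}.
\end{align*}
```

Nothing in this step uses the scalar property; only the equality of the two shock probabilities is needed.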
The left side of (30c) is simply the ratio of shocks to responses plus shocks, or trials, and should remain constant if timing is scalar. In the experiment by Clark and Hull (1966) three subjects were studied, only one of which appeared to use scalar timing exclusively (Fig. 6). In Fig. 7, the relation (30c) is plotted for the three subjects on semilog coordinates. The straight lines were fitted by eye and appear to be a reasonable description of the two deviant performances. The dashed horizontal line represents the value of q used to fit the scalar subject's data in Fig. 6. Additional subjects in the study by Hake (1968), other than the one plotted in Fig. 6, show the same trend toward decreasing shock probability with increasing r; however, the data are much more variable and a clear exponential form for (30c) is not readily discernible.
FIG. 7. Log shock probability as a function of r = s for three rats. See text for the explanation of the formula on the ordinate.
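The lines in Fig. 7 were fitted by eye. As an alternative illustration (not the procedure used in the paper), the Python sketch below fits the semilog relation $\ln[m_{RR}/(m_{RR} + m_{SS})] = \ln q - \beta r$ by ordinary least squares. The function name and the synthetic data are assumptions made for the example.

```python
import math


def fit_breakdown(r_values, shock_probs):
    """Least-squares line through (r, ln shock probability); returns (q, beta)."""
    n = len(r_values)
    ys = [math.log(v) for v in shock_probs]
    mean_r = sum(r_values) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_r) ** 2 for x in r_values)
    sxy = sum((x - mean_r) * (y - mean_y) for x, y in zip(r_values, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_r
    return math.exp(intercept), -slope  # q is the r = 0 intercept, beta the decay rate


# Synthetic, noise-free points from hypothetical values q = 0.25, beta = 0.05.
rs = [10.0, 20.0, 30.0, 40.0, 60.0]
probs = [0.25 * math.exp(-0.05 * r) for r in rs]
q_hat, beta_hat = fit_breakdown(rs, probs)
```

A subject timing in purely scalar fashion would give a slope near zero, so the fitted beta serves as a rough index of how far a performance departs from scalar timing.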
The breakdown model will not be studied in much detail here, except to obtain an expression for the mean of the truncated modified timer. Even this is not an immediate consequence of (30a), and turns out to depend on the exact form of the scalar timer. The mean of the truncated system may be obtained in tails form from Eqs. 30a and 30b,

$$\mu(r) = \int_0^r \left[1 - \frac{V_r(t)}{V_r(r)}\right] dt, \tag{31}$$

where the integrand is the tails distribution function of the truncated process. Equation 31 may be recast in terms of the truncated scalar unit timer $L(x)$, $0 < x < 1$, by substituting (2a) into (30a),

$$1 - V_r(t) = e^{-\beta t}\left([1 - q]\left[1 - L\!\left(\frac{t}{r}\right)\right] + q\right), \qquad 0 < t < r. \tag{32}$$

Now, with the change of variable $x = t/r$, integrating (31) gives

$$\mu(r) = [1 - q e^{-\beta r}]^{-1}\left[(1 - q)\, r\, \phi^*(\beta r) + \frac{q}{\beta}\,(1 - e^{-\beta r}) - r q e^{-\beta r}\right], \tag{33}$$
where $\phi^*(\lambda)$ is the Laplace transform of $\phi(x) = 1 - L(x)$, $0 < x < 1$. Using the fact that

$$\phi^*(\lambda) = \frac{1}{\lambda}\,[1 - l^*(\lambda)],$$

some algebraic manipulation in (33) results in

$$\mu(r) = \frac{1}{\beta} - [1 - q e^{-\beta r}]^{-1}\left[r q e^{-\beta r} + \frac{1 - q}{\beta}\, l^*(\beta r)\right]. \tag{34}$$
In the above form, it is clear that

$$\lim_{r \to \infty} \mu(r) = \frac{1}{\beta}, \tag{35}$$

since terms in the rightmost factor vanish. The limit (35) means that as r grows large, the breakdown process dominates the mean. The approach to this limit is from below, which makes intuitive sense since one expects the modified system to have a smaller mean than either of the components alone. This intuition may be shown to hold true for the scalar component also by expanding $l^*(\beta r)$ in series in (34).

The analogue, in the modified process, of the expressions relating $m_{RR}$ and $m_{SS}$ to r for scalar timing, i.e., Eqs. 29, may now be obtained by using (34) in the solution (26b) for $m_{jj}$. Setting r = s, $q_R = q_S = q e^{-\beta r}$, and $\mu_{RR} = \mu_{SS} = \mu(r)$, some algebraic rearrangement gives

$$\begin{aligned}
m_{RR} &= \frac{1}{\beta}\left[1 - \frac{(1 - q)\, l^*(\beta r)}{1 - q e^{-\beta r}}\right], \\
m_{SS} &= \left[\frac{1}{q e^{-\beta r}} - 1\right] m_{RR},
\end{aligned} \tag{36}$$
and again it is clear that $m_{RR} \to 1/\beta$ as $r \to \infty$, while $m_{SS}$ is unbounded with increasing r. This limiting behavior of $m_{RR}$ is intuitively appropriate since $\mu(r) \to 1/\beta$, and the contribution to $m_{RR}$ of occasional entrances into the shock clock vanishes with shock probability (30b).

An explicit expression for the functions $m_{jj}$ requires a knowledge of the form of the unit scalar timer or its Laplace transform. In the absence of evidence on a particular shape for the scalar timer, a three-parameter approximation for Eqs. 36 may be obtained by assuming no variance in $l(t)$. Letting $L(t)$ be the step function with unit jump at p, $l^*(\beta r) = \exp(-\beta r p)$. The theoretical curves in Fig. 8¹⁰ were obtained using this approximation. The data are for the deviant subjects in the Clark and Hull study, with the q and β parameter values extracted from the lines in Fig. 7. The values of p were obtained by requiring that $m_{jj}$ pass through the data points at r = 20. The dashed line in each panel is the $m_{RR}$ asymptote. The theory provides an exceedingly good description of the data for these subjects; however, the generality of the breakdown model is questionable, since attempts to fit (36) to additional subjects in the Hake study have not been very successful. It is not clear at present to what degree the zero variance approximation for $l^*(\beta r)$ restricts the usefulness of (36).

¹⁰ The double log coordinates have no special theoretical merit here, except that they encompass functions ranging over about two orders of magnitude. Also they visually correct for increasing variance with means, which is a property of first passages with scalar latencies.

FIG. 8. Mean interresponse and intershock times versus r = s for two rats showing deviations from scalar timing. (Data from Clark and Hull, 1966. Panel annotations: RA-62, q = .222; RA-58, β = .0450, q = .220, p = .298.)

Escape-Avoidance

Another situation in which the punishment effect does not occur results from the standard free-operant schedule by eliminating the shock-shock clock. Shock remains on continuously until terminated by a response. Assuming that escape responses always (eventually) occur, the chain in Fig. 5 has $q_S = 0$, and the escape latency, $t_{SR}$, is defined over the positive half line with expectation $\mu_{SR} = \mu_S$. Mean recurrence times (26b) then become

$$m_{RR} = \bar{\mu}_R + q_R\,\mu_{SR} \tag{37a}$$

and

$$m_{SS} = (1/q_R)\, m_{RR}. \tag{37b}$$
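The escape-avoidance means can be evaluated numerically for the breakdown model. The Python sketch below rests on several stated assumptions: (30b) in the form $q e^{-\beta r}$; (34) in the form $\mu(r) = 1/\beta - [1 - q e^{-\beta r}]^{-1}[r q e^{-\beta r} + ((1-q)/\beta)\, l^*(\beta r)]$; the zero-variance choice $l^*(\beta r) = \exp(-\beta r p)$; and (37) in the form $m_{RR} = \bar{\mu}_R + q_R \mu_{SR}$, $m_{SS} = m_{RR}/q_R$. Function and argument names are hypothetical.

```python
import math


def shock_probability(r, q, beta):
    """Assumed Eq. 30b: probability that a trial of length r ends in shock."""
    return q * math.exp(-beta * r)


def truncated_mean(r, q, beta, p):
    """Assumed Eq. 34 with the zero-variance choice l*(beta r) = exp(-beta r p)."""
    l_star = math.exp(-beta * r * p)
    q_r = shock_probability(r, q, beta)
    return 1.0 / beta - (r * q_r + (1.0 - q) * l_star / beta) / (1.0 - q_r)


def escape_avoidance_times(r, q, beta, p, mu_sr):
    """Assumed Eqs. 37a and 37b: (m_RR, m_SS) for the escape-avoidance chain."""
    q_r = shock_probability(r, q, beta)
    # Unconditional mean wait in the R state: a response, or a timeout at r.
    mean_wait_r = (1.0 - q_r) * truncated_mean(r, q, beta, p) + q_r * r
    m_rr = mean_wait_r + q_r * mu_sr
    return m_rr, m_rr / q_r
```

For large r the first returned value approaches 1/β, in line with the limit (35), while the second grows without bound as shock probability vanishes.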
Scalar timing in (37) would produce linearity in r. Unfortunately the only parametric data on this schedule come from a single subject in Sidman's original study. This animal was studied at a broad range of r values, and his data deviate systematically from linearity in the direction predicted by the breakdown model. In Fig. 9 these data are plotted with the theoretical curves given by Eqs. 37, where $q_R$ and $\mu_{RR} = \mu(r)$ are obtained from Eqs. 30b and 34, respectively, for the breakdown model. The zero variance approximation for $l^*(\beta r)$ was used in these functions, with the exponential component determined by assuming that $m_{RR} \approx 1/\beta$ at r = 150. The fit of data to theory is reasonably good for the response data and not very good for the shock data. An examination of the $q_R$ plot (30c) suggests that this subject may have been using two different breakdown distributions: the one proposed in Fig. 9 for r > 15, and one with a smaller mean at the shorter r values. One of the results of this difficulty is that the mean escape latency used to fit the data is unreasonably large. This parameter is increasingly important in the $m_{jj}$ functions as r becomes small, since $m_{SS} \to \mu_{SR}$ and $m_{RR} \to q\mu_{SR}$ as $r \to 0$.

FIG. 9. Mean interresponse and intershock times versus r for one subject on free-operant escape-avoidance. (Data from Sidman, 1953, Rat 41; s = 0.)

It is difficult to assess the adequacy of the theory for the escape-avoidance schedule since data for only one subject are available. The breakdown model was developed as an adjunct to scalar timing to describe deviant performances, so that the adequacy of the theoretical description in Fig. 9 carries less weight than the similar functions for the r = s schedule (Fig. 8), for which there is strong evidence of exclusively scalar timing (Fig. 6).

Cued Avoidance

When warning signals are introduced into the uncued avoidance situation, the result is a 3-state chain. Two sorts of chain may be distinguished, which differ with respect to whether or not the s clock recycles in the warning stimulus. In Fig. 10 two free-operant chains are diagrammed along with the chain for classical discrete trial avoidance.

Consider the classical paradigm first. Trials begin with the onset of the warning stimulus, state W. If no response is made within w seconds, a shock occurs (state S) and an intertrial interval of s seconds is initiated, followed again by W. When a response occurs in the warning signal in time to avoid shock ($t_{WR} < w$), a transition is made to the R state and the following intertrial interval is equal to the S → W interval plus the amount of time "saved" by responding ($w - t_{WR}$). Rate measures are not applicable in this situation¹¹ since subjects only control transitions from state W, and all the relevant data are contained in the latency distribution, $L_{WR}$, and the probability of avoiding, $1 - q_W$. The classical paradigm has been included to illustrate its close relationship to the free-operant cases. The central difference is that the free-operant schedules allow for three different kinds of trial, one for each state, just one of which (that beginning with the warning signal) is the same as a classical avoidance trial.

FIG. 10. State diagrams of semi-Markov chains corresponding to free-operant and classical cued avoidance schedules. (Panels: free-operant chains A and B; classical chain.)

The free-operant cases differ with respect to transitions from the S state. Chain A programs shock at a fixed time, w, after the onset of the warning stimulus, state W, if no response occurs. Shocks initiate a shock-shock clock which delivers shocks every s seconds in the presence of the warning signal until a response occurs. A response in either the signal-shock clock or the shock-shock clock terminates the warning signal and initiates a response-signal clock which may be recycled as in the uncued case, and which times out with the onset of the warning stimulus. Schedule B is the same as A except that shocks terminate the warning signal, and the S state clock times out with the onset of the next warning signal when no response is made.

These schedules have important differences, which unfortunately are obscured by reporting response rate data only. For instance, one might expect estimates of shock-due time to have a different form, or at least a lower q, than estimates of the time for warning signal onset. Then for scalar timing, schedule A should have $q_W = q_S < q_R$, while schedule B should have $q_W < q_S = q_R$. Even for nonscalar timing, one might expect these relations to hold when w = s = r. The chains in Fig. 10 are special cases of the general three-state chain in which all transitions are possible. Other cases have also been studied.
For example, the "anxiety avoidance" schedule (Sidman and Boren, 1957) results from B by eliminating the W → R transition, that is, by making responses in W ineffective. The signal then cues unavoidable shock, and avoidance behavior is restricted to responding in the response-signal clock. Another variant of interest which has not been studied results from B by eliminating the R → R transition. Such a schedule would share features of both classical and free-operant procedures. Schedules A and B have been selected for study here, as they appear to be the only variants for which parametric data are available on changes in the clock settings, r, w, and s.

Shock rates have not been reported for chains A and B, and so the analysis will be restricted to obtaining an expression for $m_{RR}$. With rows and columns ordered R, W, S, the c matrices for the two chains are

$$c_A = \begin{pmatrix} 1 - q_R & q_R & 0 \\ 1 - q_W & 0 & q_W \\ 1 - q_S & 0 & q_S \end{pmatrix}, \qquad c_B = \begin{pmatrix} 1 - q_R & q_R & 0 \\ 1 - q_W & 0 & q_W \\ 1 - q_S & q_S & 0 \end{pmatrix}.$$

¹¹ In any case, the embedded discrete chain is periodic with period 2, and so the semi-Markov analysis which assumes a regular embedded chain does not apply.
Since there is only one possible sort of response latency in each clock, denote its mean by $\mu_j(x_j)$, where $x_j = r, w, s$ for j = R, W, S, respectively. The mean unconditional wait in each state is, with this notation,

$$\bar{\mu}_j(x_j) = (1 - q_j)\,\mu_j(x_j) + q_j x_j, \qquad j = R, W, S, \tag{38}$$

which holds for both chains A and B. When timing is scalar within each clock (but not necessarily across clocks), (38) becomes

$$\bar{\mu}_j(x_j) = x_j\,[(1 - q_j)\,p_j + q_j] = x_j\,\bar{p}_j, \qquad j = R, S, W, \tag{38a}$$
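where $p_j$ is the mean latency in state j for the unit timer. Now the mean interresponse time for either chain is given by (18) as

$$m_{RR} = \frac{1}{\alpha_R} \sum_{j} \alpha_j\, \bar{\mu}_j(x_j), \qquad j = R, W, S, \tag{39}$$

when the $\alpha_j$ exist. But $c^2$ for chain A, and $c^3$ for chain B, are positive, so both chains are regular and the $\alpha_j$ may be found by solving a row of $\alpha c = \alpha$. The solution for both chains has the form

$$\alpha_j = \frac{K_j}{\sum_i K_i}, \qquad i, j = R, W, S, \tag{40}$$

where the $K_j$ are constants assigned differently for the A and B chains. For j = R, W, S,

$$K_j = \begin{cases} 1 - q_S, \quad q_R(1 - q_S), \quad q_R q_W; & \text{A-chain} \\ 1 - q_W q_S, \quad q_R, \quad q_R q_W; & \text{B-chain.} \end{cases} \tag{41}$$

The scalar assumption (38a) and the solution (40) may be substituted into (39), giving

$$m_{RR} = r\,\bar{p}_R + w\,\bar{p}_W\,\frac{K_W}{K_R} + s\,\bar{p}_S\,\frac{K_S}{K_R}. \tag{42}$$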
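The stationary solution and the resulting mean interresponse time can be checked numerically. The Python sketch below assumes embedded matrices, ordered R, W, S, in which the S row is $(1 - q_S, 0, q_S)$ for chain A and $(1 - q_S, q_S, 0)$ for chain B. It further assumes the constants of (41) and the form $m_{RR} = r\bar{p}_R + w\bar{p}_W K_W/K_R + s\bar{p}_S K_S/K_R$ for (42). The function names and the power-iteration solver are illustrative choices, not part of the original analysis.

```python
def c_matrix(chain, q_r, q_w, q_s):
    """Embedded transition matrix over states (R, W, S) for schedule 'A' or 'B'."""
    s_row = [1 - q_s, 0.0, q_s] if chain == "A" else [1 - q_s, q_s, 0.0]
    return [[1 - q_r, q_r, 0.0], [1 - q_w, 0.0, q_w], s_row]


def k_constants(chain, q_r, q_w, q_s):
    """Unnormalized stationary weights (K_R, K_W, K_S) as assumed for Eq. 41."""
    if chain == "A":
        return [1 - q_s, q_r * (1 - q_s), q_r * q_w]
    return [1 - q_w * q_s, q_r, q_r * q_w]


def stationary(c, iterations=500):
    """Approximate alpha satisfying alpha c = alpha by repeated multiplication."""
    alpha = [1.0 / 3] * 3
    for _ in range(iterations):
        alpha = [sum(alpha[i] * c[i][j] for i in range(3)) for j in range(3)]
    return alpha


def mean_interresponse_time(chain, clocks, q, p):
    """Assumed Eq. 42: clocks = (r, w, s); q and p are per-state, in R, W, S order."""
    k = k_constants(chain, *q)
    p_bar = [(1 - qj) * pj + qj for qj, pj in zip(q, p)]  # Eq. 38a
    return sum(x * pb * kj for x, pb, kj in zip(clocks, p_bar, k)) / k[0]
```

Comparing `stationary(c_matrix(...))` with the normalized output of `k_constants` gives a quick consistency check on the two chain assignments.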
Mean interresponse time is proportional in r, w, and s, similarly to the result for r and s in the two-state chain (27a). Ulrich, Holz, and Azrin (1964) have investigated schedule A for the case in which s was held constant, s = 5 sec, and w and r varied such that their sum was constant at 20 sec (r = 20 - w). These restrictions in Eq. 42 give $m_{RR}$ as a linear function of r. Data at a variety of r values were reported for only one subject, and these are replotted here in Fig. 11. Linearity is well substantiated for this animal. The scalar parameters have not been extracted from the line, since a two-parameter fit requires assuming the same unit timer for all three clocks, an unlikely assumption since it is well known that many more responses occur in, than out of, the warning signal. In fact, without shock rate data, many features of the performance remain indeterminate. For instance, linearity is also produced by assuming that the scalar property holds in one of the r, w clocks, and that latency is constant in the other.

FIG. 11. Mean interresponse time for one rat as a function of the response-signal duration (r). The response-shock duration (r + w) was held constant at 20 sec. (Data from Ulrich et al., 1964, Rat S-179.)
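The linearity in r can be made explicit. The following sketch assumes (42) in the form $m_{RR} = r\bar{p}_R + w\bar{p}_W K_W/K_R + s\bar{p}_S K_S/K_R$ and the A-chain constants $K_R = 1 - q_S$, $K_W = q_R(1 - q_S)$, $K_S = q_R q_W$. The symbol T, standing for the fixed sum r + w (20 sec in this experiment), is introduced here only for illustration.

```latex
\begin{align*}
m_{RR} &= r\,\bar{p}_R + (T - r)\,\bar{p}_W\,\frac{K_W}{K_R} + s\,\bar{p}_S\,\frac{K_S}{K_R} \\[4pt]
       &= r\left(\bar{p}_R - q_R\,\bar{p}_W\right)
          + \left[\,T\,q_R\,\bar{p}_W + s\,\bar{p}_S\,\frac{q_R\,q_W}{1 - q_S}\right].
\end{align*}
```

Slope and intercept each combine several unit-timer constants, so a single fitted line cannot separate them without shock rate data.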
Schedule B has been studied by several investigators, but parametric data have only recently been available (Hyman, 1969). Hyman ran monkeys at a variety of r = s ("safe") times for each of three warning signal durations. Four subjects were run, three of which appeared to be relatively free of order effects, and their latency data in the warning stimulus were considered earlier (Fig. 2) and found to be approximately linear in the mean. For this case it was suggested that the intercept in the latency data reflected mean execution time for the response. In Fig. 12 data have been averaged over subjects, and linear functions have been plotted for each warning signal duration assuming that the same fixed motor time (M) is required in both the r = s and w clocks. The linear functions are instances of Eq. 42 using the B chain assignment (41), with the adjustment in (38) that $\mu_j(x_j) = M + x_j p_j$, with M and $p_W$ determined from the latency data in Fig. 2. This adjustment means that the intercept of the $m_{RR}$ plane at r = w = 0 is the constant, M. This is reflected in the r = 0 intercepts of the functions in Fig. 12, which themselves are linear in w with intercept at M. The data are well described by the theory except at the longest safe duration, r = 40. The deviation at this value is in the direction predicted by the breakdown model, and the dashed functions are the predictions for w = 2 and w = 20 assuming a very slow breakdown component operating in the r clock. The qualitative observation frequently reported for cued avoidance, that the warning signal comes to control most of the responding, is reflected in these functions by the high warning signal probability ($q_R = .74$) and low shock probability ($q_W = .02$). Shock rates were not reported as functions of r; however, they were reported to be generally lower than one every four hours. The parameters used in Fig. 12 predict somewhat higher shock rates except at long r values. It is not clear whether this represents a real discrepancy without parametric data on intershock times.

FIG. 12. Mean interresponse time averaged over three monkeys as a function of the "safe" (r = s) time duration, at each of three signal-shock durations (w).

Discretized Time

In the standard uncued avoidance situation (Fig. 5) suppose that interresponse times (R → R latencies) are collected in N time bins of width r/N, and that shock-response times are also distributed in N categories s/N wide. Identify as states in the system the occurrence of a response in any of the 2N categories, with shock occurrence as an additional state. The c matrix then takes the form,
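$$
c \;=\;
\begin{array}{ll|c|ccc|ccc}
 & & 0 & 1 & \cdots & N & N+1 & \cdots & 2N \\ \hline
\text{Shock} & 0 & c_{00} & c_{01} & \cdots & c_{0N} & 0 & \cdots & 0 \\ \hline
 & 1 & c_{10} & 0 & \cdots & 0 & c_{1,N+1} & \cdots & c_{1,2N} \\
S \to R \text{ latencies} & \vdots & \vdots & \vdots & & \vdots & \vdots & & \vdots \\
 & N & c_{N0} & 0 & \cdots & 0 & c_{N,N+1} & \cdots & c_{N,2N} \\ \hline
 & N+1 & c_{N+1,0} & 0 & \cdots & 0 & c_{N+1,N+1} & \cdots & c_{N+1,2N} \\
R \to R \text{ latencies} & \vdots & \vdots & \vdots & & \vdots & \vdots & & \vdots \\
 & 2N & c_{2N,0} & 0 & \cdots & 0 & c_{2N,N+1} & \cdots & c_{2N,2N}
\end{array}
$$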
Shocks are assumed to regenerate the system, so that the $c_{0j}$, j = 0, 1, ..., N, are constant over successive shocks. They are then defined by

$$c_{00} = 1 - F_s(s), \qquad c_{0j} = F_s(js/N) - F_s([j - 1]\,s/N), \quad j = 1, 2, \ldots, N,$$

where $F_s$ is the timing distribution function for the shock-shock clock. The remainder of the c matrix is constructed so as to allow for two sorts of nonstationarity in the response-shock clock. There is evidence that IRT's following shocked IRT's (S → R latencies) distribute differently than those following unshocked IRT's (Gibbon and Hunt, 1968; Wertheim, 1965). Dependencies have also been demonstrated for timing after responding at different proximities to shock-due time (Wertheim, 1965). To accommodate both sorts of dependency, let $F_r(t; S \to R, i)$, i = 1, 2, ..., N, represent the distribution function for timing following an S → R latency in category i, and similarly for $F_r(t; R \to R, i)$, i = N + 1, ..., 2N. The dependence of these functions on i is an approximation to continuity, and it is assumed that N is large enough so that differential dependencies within a category are negligible. The S → R rows of c are then defined by
$$c_{i0} = 1 - F_r(r; S \to R, i), \qquad c_{ij} = F_r([j - N]\,r/N; S \to R, i) - F_r([j - N - 1]\,r/N; S \to R, i),$$

i = 1, 2, ..., N; j = N + 1, N + 2, ..., 2N. The R → R rows are defined similarly in terms of $F_r(t; R \to R, i)$, i = N + 1, N + 2, ..., 2N.
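The construction of c can be sketched in code. The Python example below is a minimal illustration under two assumptions: each timer is supplied as a distribution function of relative time u = t/(clock setting), with value 0 at u = 0, and the kth R → R bin of width r/N maps to column N + k. The function name and the three callable arguments are hypothetical.

```python
def discretized_c(n, f_shock, f_after_shock, f_after_response):
    """Build the (2N+1) x (2N+1) matrix c for the discretized chain.

    f_shock(u): shock-shock timer CDF at fraction u of s (u = 1 is shock-due time).
    f_after_shock(u, i), f_after_response(u, i): response-shock timer CDFs at
    fraction u of r, given that the preceding latency fell in category i.
    """
    size = 2 * n + 1
    c = [[0.0] * size for _ in range(size)]
    # Row 0: intervals that begin with a shock and run in the s clock.
    c[0][0] = 1.0 - f_shock(1.0)
    for j in range(1, n + 1):
        c[0][j] = f_shock(j / n) - f_shock((j - 1) / n)
    # Rows 1..2N: intervals that begin with a response and run in the r clock.
    for i in range(1, size):
        cdf = f_after_shock if i <= n else f_after_response
        c[i][0] = 1.0 - cdf(1.0, i)
        for k in range(1, n + 1):  # k-th bin of width r/N maps to column N + k
            c[i][n + k] = cdf(k / n, i) - cdf((k - 1) / n, i)
    return c


# Hypothetical timers: uniform latencies with shock probabilities 0.2 and 0.1.
example = discretized_c(4, lambda u: 0.8 * u,
                        lambda u, i: 0.8 * u, lambda u, i: 0.9 * u)
```

Supplying the timers in relative time makes the matrix independent of r and s by construction, which corresponds to the scalar case; non-scalar timers would need the clock settings passed in explicitly.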
For the discretized time situation it is assumed that the mean transition time into any of the 2N distributional categories equals the value of the category midpoint. That is,

$$\mu_{00}(s) = s; \qquad \mu_{0j}(s) = \frac{s}{N}\left(j - \tfrac{1}{2}\right), \quad 0 < j \le N;$$

and

$$\mu_{i0}(r) = r, \quad i > 0; \qquad \mu_{ij}(r) = \frac{r}{N}\left(j - N - \tfrac{1}{2}\right), \quad i > 0,\; N < j \le 2N.$$

With these conventions, the asymptotic R → R latency distributions (Figs. 3 and 4) are then given by the mixture

[Eq. 43: display not recoverable from the source]
where $P_R(j; r)$ is the probability of observing a post-response latency in category j (or a post-response shock, when j = 0) unconditional on which of the response states begins the interval. If the timing distributions, $F_r(t; S \to R, i)$ and $F_r(t; R \to R, i)$, are all scalar in r, then the c matrix is independent of r, and so is α. Now by their definitions, $\bar{\mu}_j(r) = r\bar{p}_j$ for j ≠ 0, so that in this case (43) becomes, according to (16),

$$P_R(j; r) = P_R(j; 1), \tag{44}$$
which is the scalar property in the r clock. Equation 44 is of course not a surprise since all the dependencies were assumed scalar. A more interesting question, and one unanswered at present, is whether scalar dependencies are a necessary condition for the mixture (43) to be scalar. In the cases we consider here, states in the system are all (in principle) observable, so that (43) may be directly computed from c and p. The more frequent application of mixtures like (43) involves several distinct but, in practice, indistinguishable states and assumptions about c are then tested by their fit with the observed mixture (see, for example, Cox, 1963, or the summary in Cox and Lewis, 1966, pp. 194-204).
CONCLUDING REMARKS

The theoretical account of avoidance which has been proposed here is unusual on several counts. First, while it purports to describe learned behavior, nothing substantive about the details of the learning process is proposed. Free-operant avoidance (and to some extent free-operant schedules generally) has the property that control over when the reinforcing stimuli are presented is shared by the subject and the experimenter. Subjects receive maintenance shocks whenever they, so to say, "need" them. Under these conditions asymptotic behavior must represent a balance between successive reconditioning and extinction of avoidance behavior. The thrust of the account proposed here is that whatever the details of the reconditioning and extinction processes are, the equilibrium eventually reached must be scalar or nearly scalar. The discretized time analysis represents one approach to the equilibrium problem.

Another feature of the present account is that much of the mathematics is in some sense unnecessary. To the extent that the proposal is correct, response and shock rates are necessary consequences of the basic Q(t) matrix, and theoretically need not be reported at all. At the very least, data on latency and shock probability in each clock, and the reciprocals of rates, deserve observation. On the other hand, the semi-Markov chain analysis is not restricted to the application at hand, and represents a generalization of chainlike latency mechanisms frequently assumed to operate between observable stimulus and response events (e.g., McGill and Gibbon, 1965). In the application put forth here, semi-Markov chains represent a marriage between free-operant and discrete trial procedures, and may be prototypical of processes which embed discrete events in continuous time.
REFERENCES

ANDERSON, N. H. Variation of CS-US interval in long-term avoidance conditioning in the rat with wheel turn and with shuttle tasks. Journal of Comparative and Physiological Psychology, 1969, 68, 100-106.
ANGER, D. The role of temporal discriminations in the reinforcement of Sidman avoidance behavior. Journal of the Experimental Analysis of Behavior, 1963, 6 (suppl.), 477-506.
ANSELONE, P. M. Ergodic theory for discrete semi-Markov chains. Duke Mathematical Journal, 1960, 27, 33-40.
CANE, V. R. Behavior sequences as semi-Markov chains. Journal of the Royal Statistical Society, 1959, B21, 36-58.
CLARK, F. C., AND HULL, L. D. Free-operant avoidance as a function of the response-shock = shock-shock interval. Journal of the Experimental Analysis of Behavior, 1966, 9, 641-647.
COX, D. R. Some models for series of events. Bulletin of the International Statistical Institute, 1963, 40, 737-746.
COX, D. R., AND LEWIS, P. A. W. The statistical analysis of series of events. London: Methuen, 1966. Pp. 194-204.
FELLER, W. On semi-Markov processes. Proceedings of the National Academy of Sciences, 1964, 51, 653-659.
FELLER, W. An introduction to probability theory and its applications. Vol. II. New York: Wiley, 1966.
GIBBON, J., AND HUNT, H. F. "Buried" discriminations in the acquisition of free-operant avoidance in rats. American Psychologist, 1968, 23, 882. (Abstract)
HAKE, D. F. Actual versus potential shock in making shock situations function as negative reinforcers. Journal of the Experimental Analysis of Behavior, 1968, 11, 385-403.
HOWARD, R. A. Semi-Markovian decision processes. Bulletin of the International Statistical Institute, 1963, 40, 625-652.
HYMAN, A. Two temporal parameters of free-operant discriminated avoidance in the rhesus monkey. Journal of the Experimental Analysis of Behavior, 1969, 12, 641-648.
KAMIN, L. J. Traumatic avoidance learning: the effects of CS-US interval with a trace-conditioning procedure. Journal of Comparative and Physiological Psychology, 1954, 47, 65-72.
KEMENY, J. G., AND SNELL, J. L. Finite Markov chains. Princeton: Van Nostrand, 1960.
KEILSON, J. On the matrix renewal function for Markov renewal processes. Annals of Mathematical Statistics, 1969, 40, 1901-1907.
LEVY, P. Processus semi-Markoviens. Proceedings of the International Congress of Mathematicians (Amsterdam), 1954, 3, 416-426.
LOW, L. A., AND LOW, H. I. Effects of CS-US interval length upon avoidance responding. Journal of Comparative and Physiological Psychology, 1962, 55, 1059-1061.
MCGILL, W. J. Stochastic latency mechanisms. In R. D. Luce, R. R. Bush, and E. Galanter (Eds.), Handbook of Mathematical Psychology. Vol. I. New York: Wiley, 1963. Pp. 309-360.
MCGILL, W. J., AND GIBBON, J. The general-gamma distribution and reaction times. Journal of Mathematical Psychology, 1965, 2, 1-18.
NORMAN, M. F. An approach to free-responding on schedules that prescribe reinforcement probability as a function of interresponse time. Journal of Mathematical Psychology, 1966, 3, 235-268.
PYKE, R. On renewal processes related to type I and type II counter models. Annals of Mathematical Statistics, 1958, 29, 737-754.
PYKE, R. Markov renewal processes: Definitions and preliminary properties. Annals of Mathematical Statistics, 1961, 32, 1231-1242. (a)
PYKE, R. Markov renewal processes with finitely many states. Annals of Mathematical Statistics, 1961, 32, 1243-1259. (b)
SIDMAN, M. Two temporal parameters of the maintenance of avoidance behavior by the white rat. Journal of Comparative and Physiological Psychology, 1953, 46, 253-261.
SIDMAN, M. Avoidance behavior. In W. K. Honig (Ed.), Operant behavior: Areas of research and application. New York: Appleton-Century-Crofts, 1966. Pp. 448-498.
SIDMAN, M., AND BOREN, J. J. A comparison of two types of warning stimulus in an avoidance situation. Journal of Comparative and Physiological Psychology, 1957, 50, 282-287.
SMITH, W. L. Regenerative stochastic processes. Proceedings of the Royal Society of London, 1955, A232, 6-31. (Abstract of this paper in Proceedings of the International Congress of Mathematicians (Amsterdam), 1954, 2, 304-305.)
ULRICH, R. E., HOLZ, W. C., AND AZRIN, N. H. Stimulus control of avoidance behavior. Journal of the Experimental Analysis of Behavior, 1964, 7, 129-133.
VERHAVE, T. Avoidance responding as a function of simultaneous and equal changes in two temporal parameters. Journal of the Experimental Analysis of Behavior, 1959, 2, 185-190.
WERTHEIM, G. A. Some sequential aspects of IRTs emitted during Sidman-avoidance behavior in the white rat. Journal of the Experimental Analysis of Behavior, 1965, 8, 9-15.

RECEIVED: December 15, 1969