JOURNAL OF MATHEMATICAL PSYCHOLOGY 13, 127-147 (1976)

Finite Automata and S-R Models

DAVID E. KIERAS

Carnegie-Mellon University, Schenley Park, Pittsburgh, Pennsylvania 15213
Suppes (1969, Journal of Mathematical Psychology, 6) apparently proved that any finite automaton could be mimicked by the asymptotic behavior of a stimulus-sampling S-R model, broadly implying that cognitive theory could be reduced to S-R theory. Here it is shown that the finite automata used by Suppes are more restricted and less powerful than finite automata in general. Furthermore, the S-R models proposed by Suppes are limited to producing the behavior of only these restricted automata, and cannot mimic all finite automata. Hence, the formal claim that S-R models are adequate to produce all behaviors obtainable from finite automata, and the informal claim that cognitive theory reduces to S-R theory, do not follow from Suppes's (1969) result. Some alternative S-R models and their problems are also briefly discussed.
To many cognitive psychologists, the debate over the theoretical adequacy of S-R theory was settled long ago, with S-R theory the loser. However, in an influential paper, Suppes (1969a) reopened the issue by apparently showing that an S-R model of the stimulus-sampling variety could become isomorphic to any finite automaton. The ensuing debate between Suppes and Arbib (Arbib, 1969a; Suppes, 1969b) raised some important points, but failed to properly clarify the scope of Suppes' results and its application to cognitive approaches to psychological theory. The significance of Suppes (1969a) lies in the fact that the finite automaton, or abstract machine, is generally accepted as a general description for any deterministic system. If it is assumed that cognitive processes, and behavior in general, require only a finite memory capacity, and can be viewed at a satisfactory level of analysis as being deterministic, then the finite automaton can be used to represent any cognitive mechanism. Meanwhile, as presented by Suppes (1969a, p. 327), S-R theory is based only on the concepts of stimuli, responses, and conditioned connections between stimuli and responses.

1 Thanks are due to D. H. Krantz, J. G. Greeno, and G. J. Groen for helpful criticisms of various versions of the paper.
2 Portions of this paper were presented at the Mathematical Psychology Meetings, University of Michigan, Ann Arbor, August 28-29, 1974.
3 Requests for reprints should be sent to David Kieras, Department of Psychology, Carnegie-Mellon University, Pittsburgh, Pennsylvania 15213.

Copyright © 1976 by Academic Press, Inc. All rights of reproduction in any form reserved.

If it is true that any finite automaton can be mimicked by the
asymptotic behavior of an S-R model, then it follows that S-R theory is a formally adequate, though possibly clumsy, theory of the learning of cognitive structures. Hence, cognitive psychology would be reducible to S-R psychology. While it must be acknowledged that Suppes made a valuable contribution by applying some of the tools of automata theory to a key theoretical problem in psychology, his claim is incorrect; the S-R models used in Suppes (1969a) can mimic only a restricted class of finite automata. The general claim that S-R theory is formally adequate to represent the learning of any behavior holds only in the following cases: (a) The to-be-learned behavior (represented by a target automaton) can be represented by a member of a restricted class of finite automata; (b) certain very strong assumptions are made that have the effect of changing the learning paradigm; (c) the learning process is assumed to be much more complex than the S-R models of Suppes (1969a). Whether the above limitations on the generality of Suppes' (1969a) result indicate the total bankruptcy of S-R theory is a partisan matter that will not be argued here. However, an important fact is that S-R theory, in its modern "liberal" form, is flexible enough to meet a criterion of formal adequacy. The formal problem is to distinguish adequate forms of S-R theory from inadequate forms; the empirical problem is then to determine whether these forms can be applied to data in a way that investigators feel is acceptable and useful. The goal of this paper is to clarify the formal issues in order to facilitate sounder discussion of S-R theory and set the stage for empirically based choices among competing models of both the S-R and cognitive species. The paper will first introduce some basic definitions and theorems from automata theory, followed by an introduction of Suppes' (1969a) characterization of automata, which differs from the usual definition.
S-R models and their asymptotic behavior will then be described, followed by a proof of the limited generality of Suppes’ result and a section on some alternate S-R models.
FINITE AUTOMATA AND THEIR BEHAVIORAL EQUIVALENCE
The definitions in this section are conventional definitions with only minor notational changes and are available in most current textbooks on automata theory. For fuller coverage the reader should consult Shannon and McCarthy (1956), Gill (1962), Minsky (1967), Arbib (1969b), or similar sources. Many psychologists have had contact with the automata used in linguistic theory, which are specialized or restricted versions of more general forms of automata. The linguistic automaton, or “acceptor,” is given a sequence of symbols and then indicates whether or not the sequence is a valid string in some formal language. Thus, the concern is whether the machine ends up in a particular set of states after receiving a series of input symbols. In contrast, the conventional automaton has a set of unobserved
internal states that can stand in a many-to-one relationship to a large number of observed responses. The notation is defined as follows.

DEFINITION 1. M = (S, Q, R, δ, β) is a finite automaton (or machine) iff S is a set of inputs, or stimuli, R is a set of outputs, or responses, and Q is a set of internal states, where S, Q, and R are finite sets. At time, or trial, t, the state transition function

δ: S × Q → Q: (s_{i,t}, q_{j,t-1}) ↦ q_{k,t},  s_i ∈ S,  q_j, q_k ∈ Q,

assigns to each combination of present stimulus and previous state a new internal state.4 The state output function

β: Q → R: q_{j,t} ↦ r_{k,t},  q_j ∈ Q,  r_k ∈ R,

specifies the present response for the state. The reader may recognize this as the definition of the "Moore" machine. There is also the "Mealy" machine, which instead of a state output function has an output function λ: S × Q → R taking a present stimulus-previous state combination to the responses, thereby associating outputs with transitions between states, rather than with the states themselves. The two representations can each be translated into the other; the Moore form is used here only for convenience.

The natural application of the finite automaton to psychological experimentation is as follows: The experimenter defines the set S by performing certain specified manipulations of the subject's environment and the set R by observing and recording certain selected aspects of the subject's activity. Any unspecified environmental influences, unobserved response components, and strictly internal factors all enter into the definition of the state set Q. For example, in a concept identification experiment the stimulus set is the various test items, and the response set consists of the categorizing responses, typically a positive response and a negative response. The states of knowledge of the subject, or alternatively, the states of processing of the subject's cognitive mechanisms, make up the internal state set. As the task proceeds, the subject moves from one state of processing to another; the emitted response depends on both the stimulus and the internal state.

It is important to point out that the main value of using an automaton to represent behavior lies in the clear distinction between observed behavior and the unobserved events underlying behavior. For this reason, it would be pointless to include in the stimulus set and the response set events that are actually unobserved. This means that entities such as implicit responses are in fact involved with changes of state, and are not members of the response set R.

4 The notation f: X → Y indicates that the function f maps the set X to the set Y, while f: x ↦ y means that f maps x ∈ X to y ∈ Y.
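Definition 1 translates directly into code. The sketch below (Python; all names and the example encoding are illustrative, not from the paper) represents a Moore machine by its transition and output tables and performs one trial:

```python
# A Moore machine M = (S, Q, R, delta, beta) as plain Python data.
# delta maps (stimulus, state) -> state; beta maps state -> response.
# All names here are illustrative only.

def make_machine(S, Q, R, delta, beta):
    """Bundle the five components of Definition 1, checking totality."""
    for s in S:
        for q in Q:
            assert delta[(s, q)] in Q   # delta: S x Q -> Q
    for q in Q:
        assert beta[q] in R             # beta: Q -> R
    return {"S": S, "Q": Q, "R": R, "delta": delta, "beta": beta}

def step(M, q, s):
    """One trial: the new internal state and the response it emits."""
    q_new = M["delta"][(s, q)]
    return q_new, M["beta"][q_new]
```

A machine is well-formed exactly when δ is total on S × Q and β is total on Q, which `make_machine` checks.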
FIG. 1. Machine M1.
An automaton can be described either by tables specifying the functions δ and β, or more conveniently by means of diagrams, such as Fig. 1, which describes machine M1. The circles represent states and the arrows represent transitions between states. The digit at the tail of each arrow indicates which input symbol causes the specified transition. The states are labeled q_1, q_2, q_3, and the digits in the circles specify the output when the machine is in that state. Thus, if the input is a long sequence of 1s the output will settle into a repeating pattern: ...101101101.... If the input is always 0, the output will soon become always 0 as well. Other input sequences will produce more complex output patterns; note that the exact output pattern depends on which state the machine is started in. The following definition, which will be used throughout this paper, provides a general way to describe the input-output behavior of a machine:

DEFINITION 2. The behavior function of a machine M, B(M, q, s) = r iff q is the initial state of M, s is a sequence of stimuli (an element of S*, the set of all possible finite sequences of stimuli including Λ, the empty sequence with length zero), and r is the response produced by M after receiving all stimuli in s, starting in state q.

Computing B(M, q, s) requires applying the state transition function recursively, starting with q and the first element of s, and concluding by applying the state output function to the final state. Thus, for s = s_1 s_2 s_3 and initial state q_0,

B(M, q_0, s) = β(δ(s_3, δ(s_2, δ(s_1, q_0)))).
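The behavior function of Definition 2 can be computed by folding δ over the stimulus sequence. Since Fig. 1 is not reproduced here, the machine below is a hypothetical three-state reconstruction consistent with the text: repeated 1s cycle through the states, so the output settles into the period-3 pattern ...101101..., and a 0 drives the machine into the state with output 0.

```python
# B(M, q, s) of Definition 2: apply delta across the stimulus sequence s
# starting from q, then apply beta to the final state.
# delta1/beta1 encode a hypothetical machine consistent with the text's
# description of Fig. 1 (the figure itself is not reproduced).

delta1 = {(1, "q1"): "q2", (1, "q2"): "q3", (1, "q3"): "q1",
          (0, "q1"): "q2", (0, "q2"): "q2", (0, "q3"): "q2"}
beta1 = {"q1": 1, "q2": 0, "q3": 1}

def B(delta, beta, q, s):
    """Response after feeding the stimulus sequence s starting in state q."""
    for x in s:
        q = delta[(x, q)]
    return beta[q]
```

Feeding all 1s from q_1 yields outputs 0, 1, 1, 0, 1, 1, ... (the period-3 pattern, phase-shifted by the start state); feeding 0s yields 0 forever, as the text describes.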
The next two definitions further characterize single machines.

DEFINITION 3. States q, q′ ∈ Q in machine M are equivalent iff for all s ∈ S*,

B(M, q, s) = B(M, q′, s).
FIG. 2. Machine M2.

DEFINITION 4. Machine M is in reduced form iff q, q′ ∈ Q being equivalent implies
that q = q′.

Notice that q being equivalent to q′ does not imply that q = q′ or that δ(s, q) = δ(s, q′) for all s ∈ S. Rather, it means that in principle it is impossible to determine in which of the two states M was started by means of any tests on the machine's behavior, even though the two states are structurally distinct. For example, machine M2 (Fig. 2) has two equivalent states, q_1 and q_2; starting the machine in one will yield precisely the same behavior as starting in the other. M2 may be reduced easily by combining q_1 and q_2 into a single state. A machine is reduced if it has no distinct equivalent states. For example, M1 (Fig. 1) is reduced, since starting in each state will yield a response sequence which for some stimulus sequence is different from that produced by starting in any other state. The concept of equivalence can also be defined as equivalence between two machines:

DEFINITION 5. Machines M = (S, Q, R, δ, β) and M′ = (S, Q′, R, δ′, β′) are equivalent iff for every q ∈ Q there exists a q′ ∈ Q′, and vice versa, such that

B(M, q, s) = B(M′, q′, s)

for all s ∈ S*. Thus, two machines are equivalent in the case that one can find corresponding initial states that result in the two machines displaying identical behavior. It is important to note that two machines can be equivalent only if they have the same set S of stimuli and set R of responses. Furthermore, machine equivalence is an equivalence relation, meaning that (a) M is equivalent to itself; (b) if M is equivalent to M′, then
M′ is equivalent to M; (c) if M is equivalent to M′ which is equivalent to M″, then M is equivalent to M″. Equivalent machines have the same behavior; machines that have essentially the same structure are equivalent and have exactly corresponding states.

DEFINITION 6. Machines M = (S, Q, R, δ, β) and M′ = (S, Q′, R, δ′, β′) are isomorphic iff they are equivalent and differ only in the labeling of states, i.e., there exists a 1-1 onto map h: Q → Q′ such that for all q ∈ Q and s ∈ S*,

B(M, q, s) = B(M′, h(q), s).
This section concludes with two basic theorems that will be stated without proof, since they can be found in many automata theory textbooks.

THEOREM 1. For every machine M, there exists an equivalent machine in reduced form.

THEOREM 2. If machines M and M′ are both in reduced form and are equivalent, then M and M′ are isomorphic.
These two theorems together state that we can find a unique (except for state labeling) reduced equivalent machine for any given machine. Since machine equivalence is an equivalence relation, we can deal with whole classes of machines by working with only the unique reduced forms.
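Theorems 1 and 2 are usually proved via the standard partition-refinement construction, sketched below (Python; the example machine in the test is illustrative, in the spirit of M2): start with states grouped by output, then repeatedly split any group whose members lead, under some stimulus, to different groups. The blocks of the final partition are the states of the unique reduced form.

```python
# Partition refinement: compute the equivalence classes (Definition 3) of
# the states of a Moore machine. Merging each class into one state yields
# the reduced form of Theorems 1 and 2.

def equivalence_classes(S, Q, delta, beta):
    # initial partition: states with the same output are grouped together
    blocks = {}
    for q in Q:
        blocks.setdefault(beta[q], set()).add(q)
    partition = list(blocks.values())
    changed = True
    while changed:
        changed = False
        new_partition = []
        for block in partition:
            # signature of q: for each stimulus, which current block
            # delta sends q to
            def sig(q):
                return tuple(
                    next(i for i, b in enumerate(partition)
                         if delta[(s, q)] in b)
                    for s in sorted(S))
            groups = {}
            for q in block:
                groups.setdefault(sig(q), set()).add(q)
            new_partition.extend(groups.values())
            if len(groups) > 1:
                changed = True
        partition = new_partition
    return partition
```

States land in the same block exactly when no stimulus sequence can distinguish them, so a machine is reduced iff every block is a singleton.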
SUPPES' CHARACTERIZATION OF AUTOMATA
Suppes' (1969a) result is based on a particular characterization of automata and an application of them to learning situations. Caution is necessary when considering Suppes' use of automata. Starting with the linguistic automata mentioned above, Suppes explicitly states that responses are to be identified with internal states: "internal state" and "response" are to be used interchangeably (Suppes, 1969a, p. 329). In terms of Definition 1, this statement means that the state output function β is restricted to being 1-1 and onto; for every state there is a unique response, and vice versa. This restricted characterization of automata is at variance with the broader psychological interpretation of Definition 1 discussed above. Note that in an automaton restricted with the β function 1-1 and onto, the state set is actually superfluous and can be replaced with the response set; in a real sense, such automata do not need a state set. To promote clarity, Suppes' characterization of automata will be presented as a separate type of automaton, simplified by the removal of the state set:
DEFINITION 7. X = (S, R, γ) is an S-machine iff S is a finite set of stimuli, R is a finite set of responses, and at each time, or trial, t, the function

γ: S × R → R: (s_{i,t}, r_{j,t-1}) ↦ r_{k,t},  s_i ∈ S,  r_j, r_k ∈ R,

assigns to each combination of current stimulus and previous response a new response.

TABLE 1
Sample S-Machine

          r_1    r_2
   s_1    r_1    r_2
   s_2    r_2    r_1

The S-machine may be described by a table that lists every combination of present stimulus and previous response and specifies a present response. An example used by Suppes (1969a) appears in Table 1. If the previous response is r_1, s_1 produces r_1. Applying s_2 causes the machine to change "state"; that is, the response r_2 will cause future applications of s_1 to produce r_2. Applying s_2 again will flip the "state" back to the original condition by eliciting r_1. The important property of an S-machine is that it has memory in the sense that its behavior can depend on events previous to the current stimulus. Hence the S-machine is more powerful than a simple S-R map m: S → R. However, one would suspect that the S-machine is less powerful than the finite automaton, since the S-machine's memory is limited to that of its last response, whereas the memory capacity of the finite automaton depends on the number of states and is not tied to the last response. This suspicion is in fact the case, as will be shown by the following comparison of S-machines and finite automata.
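The S-machine of Definition 7 is a one-line fold over γ. The sketch below (Python; the entries of γ are reconstructed from the text's description of Table 1) encodes the toggle machine: s_1 leaves the previous response unchanged, while s_2 flips it.

```python
# The S-machine of Definition 7: memory is only the last response.
# gamma maps (stimulus, previous response) -> new response; here it is
# the toggle machine described in the text for Table 1.

gamma = {("s1", "r1"): "r1", ("s2", "r1"): "r2",
         ("s1", "r2"): "r2", ("s2", "r2"): "r1"}

def run_smachine(gamma, r0, stimuli):
    """B(X, r0, s): response after feeding the stimulus sequence."""
    r = r0
    for s in stimuli:
        r = gamma[(s, r)]
    return r
```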
S-MACHINES AND FINITE AUTOMATA COMPARED
A simple extension of Definition 2 allows us to talk about the behavior of an S-machine in a way analogous to the behavior of finite automata:

DEFINITION 8. The behavior function of an S-machine X = (S, R, γ), B(X, r_0, s), is the response produced by X after receiving all of the stimuli in the sequence s ∈ S*, when the initial response was r_0.
Using this definition, we can define equivalence between finite automata and S-machines that have the same sets S and R. Since the S-machine has responses where a finite automaton has states, the following is the natural definition:

DEFINITION 9. An S-machine X = (S, R, γ) and a finite automaton M = (S, Q, R, δ, β) are equivalent iff for all r ∈ R there exists a q ∈ Q, and vice versa, such that for all s ∈ S*,

B(X, r, s) = B(M, q, s).
For what follows, it will be necessary to know how to construct an equivalent finite automaton for any S-machine, and also how to construct an equivalent S-machine for any restricted finite automaton. The following theorem establishes this relationship:

THEOREM 3. (a) Let M = (S, Q, R, δ, β) be a finite automaton with β 1-1 and onto. The S-machine X = (S, R, γ), with γ(s, r) = β(δ(s, β⁻¹(r))), is equivalent to M. (b) Let X = (S, R, γ) be an S-machine. The automaton M = (S, Q, R, δ, β), with β 1-1 and onto, and δ(s, q) = β⁻¹(γ(s, β(q))), is equivalent to X.
Proof. (a) Let s′ = s_1 s_2 ⋯ s_n, and let r_0 and q_0 be such that r_0 = β(q_0). Then

B(X, r_0, s′) = γ(s_n, ..., γ(s_1, r_0)).

Since γ(s, r) = β(δ(s, β⁻¹(r))),

B(X, r_0, s′) = β(δ(s_n, ..., β⁻¹(β(δ(s_1, β⁻¹(β(q_0)))))))
             = β(δ(s_n, ..., δ(s_1, q_0)))
             = B(M, q_0, s′).

Thus, X is equivalent to M.

(b) Using the same definitions of s′, q_0, and r_0, the behavior of M is

B(M, q_0, s′) = β(δ(s_n, ..., δ(s_1, q_0))).

Since δ(s, q) = β⁻¹(γ(s, β(q))),

B(M, q_0, s′) = β(β⁻¹(γ(s_n, ..., β(β⁻¹(γ(s_1, β(q_0)))))))
             = γ(s_n, ..., γ(s_1, r_0))
             = B(X, r_0, s′).

Thus, M is equivalent to X. Furthermore, reducing an unreduced machine entails decreasing the number of states; since β is 1-1 onto, Q already has the minimum number of states. Hence, M is reduced.
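Theorem 3(a)'s construction can be checked mechanically. The sketch below (Python; the two-state example machine is illustrative) builds γ(s, r) = β(δ(s, β⁻¹(r))) from an automaton whose β is 1-1 and onto, and compares the two behavior functions on short stimulus sequences:

```python
# Theorem 3(a): with beta 1-1 and onto, an automaton can be turned into
# an equivalent S-machine via gamma(s, r) = beta(delta(s, beta_inv(r))).
# The encodings below are illustrative.

def automaton_to_smachine(delta, beta):
    beta_inv = {r: q for q, r in beta.items()}   # requires beta 1-1 onto
    S = {s for (s, _q) in delta}
    return {(s, r): beta[delta[(s, beta_inv[r])]]
            for s in S for r in beta_inv}

def smachine_behavior(gamma, r0, stimuli):
    r = r0
    for s in stimuli:
        r = gamma[(s, r)]
    return r

def automaton_behavior(delta, beta, q0, stimuli):
    q = q0
    for s in stimuli:
        q = delta[(s, q)]
    return beta[q]
```

Starting the S-machine in r_0 = β(q_0) reproduces the automaton's behavior exactly, as the proof shows.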
The key problem with Suppes (1969a) is that his characterization of automata, the S-machine, is not general; that is, automata with the restriction that β is 1-1 onto, and their S-machine equivalents, are less powerful than automata not so restricted. This can be shown easily:

THEOREM 4. NONEQUIVALENCE OF S-MACHINES AND FINITE AUTOMATA. If M = (S, Q, R, δ, β) is such that the reduced form of M, M′, has β′ many-to-one, then there exists no S-machine X = (S, R, γ) that is equivalent to M.
Proof. Suppose the equivalent S-machine X does exist; let M″ be its reduced equivalent constructed according to Theorem 3; M″ will be equivalent to M′ as well as M. Since M″ and M′ are equivalent and are both reduced, they must be isomorphic (Theorem 2), meaning that there exists a 1-1 onto state relabeling function h: Q′ → Q″. Then for q′ ∈ Q′, β″(h(q′)) = β′(q′). Since β′ is many-one, there exist q_1′, q_2′ ∈ Q′, q_1′ ≠ q_2′, such that β′(q_1′) = β′(q_2′). However, since h is 1-1 onto, h(q_1′) ≠ h(q_2′), and since β″ is also 1-1 and onto, we have the contradiction: β′(q_1′) = β″(h(q_1′)) ≠ β″(h(q_2′)) = β′(q_2′). Thus, M″ cannot exist, which implies that X cannot exist.

This theorem means that given a set of stimuli and a set of responses, a large class of behaviors can be obtained with unrestricted finite automata that are impossible to produce with an S-machine. These behaviors are those requiring that the reduced state set be larger than the response set. Automata with a many-one output function must not be regarded as bizarre or pathological; indeed, they are more representative of ordinary behaving systems, both animals and machines, than S-machines. For example, the ordinary computer can have many billions of states, yet its output at any point in time can be one of only a few symbols. In a psychological context, a subject in a Sternberg experiment makes only a binary response, but his internal state space is extremely complex, representing the stimulus list, the test item, and supporting a complicated memory scanning mechanism. In the experiments performed by Restle and Brown (1970) the subject makes distinct responses consisting of pressing one of six buttons and learns a sophisticated sequential pattern in which each button can be used several times in the same pattern. Only by maintaining separate state information is it possible to distinguish between the different appearances of each response.
Thus, it would appear that behaviors requiring many-one state output functions are the rule, rather than the exception; S-machines do not appear to be general enough to represent most behaviors of interest in psychology. The following sections show how the S-R models used by Suppes (1969a) are limited to producing the behavior of S-machines.
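Theorem 4 can also be seen by brute force on a small case. The machine below (illustrative, not from the paper) is a reduced three-state, one-stimulus automaton whose outputs around the cycle are 0, 0, 1, so β is many-to-one; since there are only eight S-machines over the same S and R (four γ tables times two initial responses), all of them can be enumerated and shown to fail:

```python
# A concrete instance of Theorem 4: a reduced three-state automaton over
# the single stimulus "a", with many-to-one output 0, 0, 1 around its
# cycle. No S-machine over the same S and R reproduces its behavior,
# because a two-"state" S-machine can only produce eventually period-1
# or period-2 output, never the period-3 pattern below.

delta = {("a", 0): 1, ("a", 1): 2, ("a", 2): 0}   # q0 -> q1 -> q2 -> q0
beta = {0: 0, 1: 0, 2: 1}                          # many-to-one output

def automaton_outputs(q0, n):
    q, outs = q0, []
    for _ in range(n):
        q = delta[("a", q)]
        outs.append(beta[q])
    return outs

def smachine_outputs(gamma, r0, n):
    # with a single stimulus, gamma reduces to a function R -> R
    r, outs = r0, []
    for _ in range(n):
        r = gamma[r]
        outs.append(r)
    return outs

def some_smachine_matches(q0, n=9):
    target = automaton_outputs(q0, n)
    for g0 in (0, 1):            # enumerate all gamma tables over R = {0, 1}
        for g1 in (0, 1):
            gamma = {0: g0, 1: g1}
            for r0 in (0, 1):    # and all initial responses
                if smachine_outputs(gamma, r0, n) == target:
                    return True
    return False
```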
S-R MODELS AND THEIR ASYMPTOTIC BEHAVIOR
What Suppes (1969a) actually proved was that an S-R model of the stimulus-sampling variety could become asymptotically isomorphic in its stimulus-response connections to any S-machine. Since automata exist that have no S-machine equivalent, it appears that these same automata could not be mimicked by these S-R models. If this is true, then Suppes' proof is of limited scope; we do not yet have a formalization showing under what conditions an S-R model can produce the behavior of any finite automaton. In order to show this limitation of Suppes' result, and prepare for more general discussion of S-R models, it is necessary to characterize S-R models appropriately.

The definition of S-R model to be presented here is based on the approach used by Suppes (1969a), with two main differences: (a) As presented by Suppes, an S-R model does not produce any behavior in particular; rather the asymptotic behavior is determined by the reinforcement schedule. Rather than separating the model and the target behavior, the definition used here will include both the model assumptions and a representation of the reinforcement schedule that defines the target behavior. (b) Suppes described S-R models in terms of becoming "isomorphic" to automata. Since an S-R model essentially creates a map of the stimulus-response assignments, the term "isomorphic" is appropriate only if the other system also has a maplike structure; a finite automaton is not conveniently described in such terms. Furthermore, isomorphism implies a similarity in structure; what we are really interested in is a similarity of behavior, or behavioral mimicry. Hence, S-R models will here be described in terms of becoming asymptotically equivalent to automata. The first definition in this section is a restatement of Suppes' (1969a) stimulus-sampling theory definition of S-R models:

DEFINITION 10. Z = (Σ, R, ρ) is an S-R model iff:

1. Σ is a finite set of effective stimuli; R is a finite set of observed responses; ρ is a reinforcement function, ρ: Σ → R, such that r_c = ρ(σ) is the correct response for effective stimulus σ ∈ Σ.

2. On each trial, each stimulus element σ ∈ Σ has some positive probability of being presented.

3. The single presented effective stimulus σ is sampled on each trial.

4. Each effective stimulus may be conditioned to only one response at a time.

5. There is a fixed positive probability c that effective stimulus σ will be conditioned to the correct response, r_c = ρ(σ), on each trial, if not already so conditioned.

6. Once an effective stimulus is conditioned to a response, it remains so conditioned.
7. Conditioning of unsampled (unpresented) stimuli does not change.

8. If a response is conditioned to an effective stimulus, that response is made to the stimulus with probability one.

9. If no response is conditioned to an effective stimulus, each response has some positive probability of being made.

An important point is that the stimulus set Σ in the above definition is defined as an effective stimulus set, which in a particular case may not be synonymous with the observed stimulus set S. If this provision were not made, an S-R model obviously would be limited to producing behaviors of the strict S-R map form, m: S → R; it is clear that behavior is more complex. Generally it will be assumed that the effective stimulus specifies the observed stimulus, that is, a function exists, h: Σ → S, that assigns an observed stimulus to each possible effective stimulus. Since this function might be many-one, knowledge of S may be inadequate to predict R; however, knowledge of the effective stimulus should be adequate, since the reinforcement function ρ: Σ → R defines the target behavior; the correct behavior can be defined in terms of a specification of the correct response to each possible effective stimulus. The next definition formalizes the criterion of learning in an S-R model:

DEFINITION 11. An S-R model Z = (Σ, R, ρ) learns iff for trial t, and any σ ∈ Σ,

lim_{t→∞} P(r_t = ρ(σ) | σ) = 1.

Thus, an S-R model learns in the case that its responses to effective stimuli become identical to those specified as correct by the reinforcement function. In order to compare the asymptotic behavior of S-R models to finite automata, we need a method of comparing the behavior of the two systems that takes into account the fact that an automaton's response can depend on the preceding sequence of stimuli, rather than only the single current stimulus. Further, since the effective stimulus for the model, and the observed stimulus that the automaton is given, may not be the same, the comparison has to use corresponding sequences of effective and observed stimuli:

DEFINITION 12. An S-R model Z = (Σ, R, ρ) becomes asymptotically equivalent to an automaton M = (S, Q, R, δ, β) iff:

1. There exists a function h: Σ → S, specifying the observed stimulus for each effective stimulus, such that on trial t, for any s′ ∈ S*, s′ = s_t s_{t+1} ⋯ s_{t+k}, there exists a sequence σ′ ∈ Σ*, σ′ = σ_t σ_{t+1} ⋯ σ_{t+k}, such that h(σ_i) = s_i, i ∈ {t, t + 1, ..., t + k}.

2. The asymptotic behavior of Z is that of M:

lim_{t→∞} P(r_{i,t+k} | σ′) = 1  iff  B(M, q, s′) = r_i for some q ∈ Q.
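The conditioning axioms of Definition 10 are easy to simulate. The sketch below (Python; the reinforcement function and parameter values are illustrative) implements axioms 2-9 and, with overwhelming probability, ends with every effective stimulus conditioned to its correct response, i.e., the model learns in the sense of Definition 11:

```python
# A minimal simulation of the stimulus-sampling model of Definition 10.
# The reinforcement function rho and the parameters are illustrative.
import random

def train(stimuli, responses, rho, c=0.3, trials=2000, seed=1):
    rng = random.Random(seed)
    conditioned = {}                  # sigma -> response; at most one (axiom 4)
    for _ in range(trials):
        sigma = rng.choice(stimuli)   # positive presentation probability (2, 3)
        if sigma in conditioned:
            r = conditioned[sigma]    # conditioned response, probability one (8)
        else:
            r = rng.choice(responses) # unconditioned: any response possible (9)
            # with probability c, condition sigma to the correct response
            # rho(sigma); the connection then persists (axioms 5, 6)
            if rng.random() < c:
                conditioned[sigma] = rho[sigma]
        # unsampled stimuli are untouched (axiom 7)
    return conditioned
```

After enough trials the conditioning map coincides with ρ, which is precisely the asymptotic state invoked in the theorems that follow.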
This definition means that an S-R model behaves the same as an automaton, in the limit, in the case that, given sequences of corresponding effective and observed stimuli, the model and the automaton produce the same response after receiving their stimulus sequences. Since we do not require that the model learn the original initial state of M at t = 0, an appropriate state of M may be chosen, which corresponds to the model "state," at the time of the comparison. Hence, the second condition in the above holds for some q, rather than a specific state of M.

In what follows, comparisons are also going to be made between S-R models and S-machines. Rather than define asymptotic equivalence and prove its properties separately for S-machines, it will suffice to point out that any results involving asymptotic equivalence for finite automata can be simply extended to S-machines, by appropriate use of Definition 9 and Theorem 3. Hence, results for finite automata will be applied to S-machines without further explanation.

Notice that unlike simple machine equivalence (Definition 5), asymptotic equivalence is not an equivalence relation. However, if an S-R model is asymptotically equivalent to some automaton, then it is also asymptotically equivalent to any other equivalent automaton:

THEOREM 5. Let Z = (Σ, R, ρ) be asymptotically equivalent to some finite automaton M = (S, Q, R, δ, β). Let M be equivalent to another machine M′ = (S, Q′, R, δ′, β′). Then Z is also asymptotically equivalent to M′.
Proof. For any sequence s′ ∈ S*, there is a sequence σ′, meeting the requirements of Definition 12, such that

lim_{t→∞} P(r_{i,t+k} | σ′) = 1  iff  B(M, q, s′) = r_i.

Since B(M′, q′, s′) = B(M, q, s′) for some q′ and any s′,

lim_{t→∞} P(r_{i,t+k} | σ′) = 1  iff  B(M′, q′, s′) = r_i,

and Z is asymptotically equivalent to M′.

Suppes (1969a) proved his main result in one large and rather complex theorem. Here the approach will be more piecemeal. Those familiar with probabilistic learning models will readily see that models of the form of Definition 10 work, that is, given a consistent reinforcement schedule, the model will eventually form the correct connections. Since this final behavior of the model corresponds to a reinforcement function that can be tied to the observed stimuli, it is fairly easy to show that this final behavior is that of some finite automaton; this automaton can be defined in terms of the reinforcement function and the function linking effective to observed stimuli. The following theorem provides these results:
THEOREM 6. EQUIVALENCE LEARNING IN S-R MODELS. Let Z = (Σ, R, ρ) be an S-R model, and let h: Σ → S exist. Then (a) Z learns; (b) Z becomes asymptotically equivalent to some finite automaton M = (S, Q, R, δ, β).

Proof. (a) Cf. Suppes (1969a) and Atkinson, Bower, and Crothers (1965, Chap. 8). Let C be a set of conditioning states, where C_{k,t} denotes that k elements of Σ are conditioned to their correct responses on trial t. Note that C is finite, with m = #(Σ) elements. Let C_0 be the initial conditioning state. From Definition 10,

P(C_{i,t+1} | C_{j,t}) = 0,       iff i > j + 1 or i < j,
                       = c > 0,   iff i = j + 1, j ≠ m,
                       = 1 − c,   iff i = j, j ≠ m,
                       = 1,       iff i = j = m.

It follows that

lim_{t→∞} P(C_{m,t} = C_m) = 1,

meaning that in the limit, the responses of the model are correct:

lim_{t→∞} P(r_t = ρ(σ) | σ) = 1  iff  lim_{t→∞} P(r_{i,t} | σ) = 1  iff  r_i = ρ(σ);

hence, Z learns.

(b) It must be shown that there exists an M such that for any s′ ∈ S*, with s′ = s_t ⋯ s_{t+k}, there is a σ′ = σ_t ⋯ σ_{t+k}, h(σ_i) = s_i, i ∈ {t, ..., t + k}, such that

lim_{t→∞} P(r_{i,t+k} | σ′) = 1  iff  B(M, q_0, s′) = r_i, for some q_0 ∈ Q.

First, consider that ρ(σ_t) does not depend on the prior stimuli σ_{t-1}, σ_{t-2}, ...; however, P(r_t = ρ(σ_t) | σ_t) can depend on prior stimuli, since the probability that σ is conditioned to the correct response depends on the number of prior presentations of σ (and of corresponding reinforcements of ρ(σ)). But, in the limit, the probability of a correct response does not depend on the prior stimuli. Thus,

lim_{t→∞} P(r_t = ρ(σ_t) | σ_{t-k} ⋯ σ_t) = 1  iff  lim_{t→∞} P(r_t = ρ(σ_t) | σ_t) = 1
                                            iff  lim_{t→∞} P(r_{i,t} | σ_t) = 1
                                            iff  ρ(σ_t) = r_i.

Hence, for a sequence σ_t σ_{t+1} ⋯ σ_{t+k},

lim_{t→∞} P(r_{i,t+k} | σ_t ⋯ σ_{t+k}) = 1  iff  ρ(σ_{t+k}) = r_i.

Thus, it suffices to show that an M exists such that for some q_0 and any s′,

lim_{t→∞} P(r_{i,t+k} | σ_{t+k}) = 1  iff  B(M, q_0, s′) = r_i  iff  ρ(σ_{t+k}) = r_i.

M can be constructed as follows: Let #(Q) = #(R), and β: Q → R be 1-1 onto. For each q ∈ Q and σ ∈ Σ, define δ as

δ(h(σ), q) = β⁻¹(ρ(σ)).

Then the behavior of M on sequence s′ is

B(M, q_0, s′) = β(δ(h(σ_{t+k}), ..., δ(h(σ_t), q_0)))
             = β(β⁻¹(ρ(σ_{t+k})))
             = ρ(σ_{t+k}) = r_i,

by the above definition of δ. Hence, with this definition of M,

lim_{t→∞} P(r_{i,t+k} | σ′) = 1  iff  B(M, q_0, s′) = r_i,
and Z is asymptotically equivalent to M.

Now a restatement of Suppes' (1969a) main result can be supplied. Suppes stipulated that the effective stimulus set was the set S × R, consisting of compound stimuli made up of the current stimulus and the previous response; the reinforcement schedule was defined by an S-machine. Suppes proved that the S-R model would become asymptotically isomorphic in its stimulus-response connections to the S-machine. The corresponding result, in the terms used here, is that an S-R model, with these definitions of Σ and ρ, becomes asymptotically equivalent to the S-machine:

THEOREM 7. SUPPES' RESULT. Let X = (S, R, γ) be an S-machine, and let Z = (Σ, R, ρ) be an S-R model with Σ = S × R, σ_t = (s_t, r_{t-1}), and ρ(s, r) = γ(s, r). Then (a) Z learns; (b) Z is asymptotically equivalent to X.
Proof. (a) Notice that Z meetsthe requirements of Theorem 6; the definitions of 27and p play no role. Hence, the proof that Z learns is immediate.
(b) Let h: Z-+ S be such that h(u,) = h(s, , rt-J = st . From Theorem 6 we have that Z is asymptotically equivalent to a machine M; using #(Q) = #(RI), and /I: Q + R l-1 onto, for each q EQ and u Ez, the definition of S is VW,
a) =
W,
~1, 4) =
P-YPW
= P(P(s,
r)>.
The behavior of this constructed M, starting in someq. on any sequences’ = slsz . s, is WK qo90 = ,W(s, 3 %-1 ,... 9s(s, , qoo)))).
AUTOMATA AND S-R MODELS Substituting
the above definition Yt = P(St , It--l)
of 6, and noting =
Y(St > yt-1)
=
141
that for any t, Y(St-1
Y.‘.> Y(% 9 YON,
we have for r. = /3(&,
Hence, M as constructed is equivalent to the S-machine X; by extension of Theorem 5, this implies that 2 is asymptotically equivalent to X. Theorem 7 means that if the effective stimulus includes information specifying the last response, an S-R model can learn the behavior of the class of automata equivalent to S-machines. This next theorem proves the fact alluded to earlier. Since S-machines are less powerful than finite automata, the S-R model with these definitions of .E and p cannot mimic all finite automata: OF S-MACHINE S-R MODELS. Let M be a$nite autoTHEOREM 8. LIMITATION maton;such that its reducedform, M, , has /I7 many-to-one. Then there exists no S-R model Z = (Z, R, p), with Z = S x R, at = (st , yt-J, that becomesasymptotically equivalent to M.
Proof. According to Theorem 6, Z becomes asymptotically equivalent to some automaton M'. Suppose that Z did become asymptotically equivalent to M; then by Theorem 5, M and M' would be equivalent. By Theorem 7, M' is in fact equivalent to an S-machine X, which would therefore have to be equivalent to M. However, by Theorem 4, if the reduced form of an automaton has β many-to-one, there is no S-machine that is equivalent. Hence, X cannot be equivalent to M, which means that M' cannot be equivalent to M either. Thus, Z cannot be asymptotically equivalent to M.
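The obstruction behind Theorem 8 can be made concrete with a small machine of my own construction (not an example from the paper): a unit-delay automaton with r_t = s_{t-1}, realized in the Moore-style terms used here with Q = {0,1} × {0,1}, δ(s, (a, b)) = (s, a), and β((a, b)) = b. This machine is reduced and β is many-to-one (four states, two responses), so Theorem 8 applies, and the failure of any schedule on S × R is directly exhibitable:

```python
# A minimal sketch, my construction rather than the paper's: the unit-delay
# automaton M with r_t = s_{t-1}. Its reduced form has beta many-to-one
# (four states map onto two responses), so by Theorem 8 no S-R model with
# effective stimuli (s_t, r_{t-1}) can mimic it.

def run(inputs, q0=(0, 0)):
    """Response sequence of the unit-delay automaton on `inputs`."""
    q, responses = q0, []
    for s in inputs:
        q = (s, q[0])           # delta(s, (a, b)) = (s, a)
        responses.append(q[1])  # beta((a, b)) = b, i.e., r_t = s_{t-1}
    return responses

ra = run([0, 1, 0])  # responses [0, 0, 1]
rb = run([0, 0, 0])  # responses [0, 0, 0]
# At t = 3 both histories present the same effective stimulus
# (s_3, r_2) = (0, 0), yet the correct responses differ (1 vs. 0),
# so no reinforcement schedule rho: S x R -> R can reproduce M.
assert (0, ra[1]) == (0, rb[1]) and ra[2] != rb[2]
```

Any automaton whose response depends on information not recoverable from the pair (current stimulus, previous response) yields such a counterexample.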
SOME ALTERNATIVE S-R MODELS
Suppes' (1969a) effort proved that S-R models could produce a more complex range of behaviors than simple S-R maps; however, the main claim of Suppes is incorrect: finite automata exist that cannot be mimicked by S-R models of the form ((S × R), R, ρ). This section discusses some alternative forms of S-R models, in order to clarify what is required to obtain an S-R learning model with the power to learn any automaton. To prepare for this discussion, it is necessary to be somewhat pedantic, and present a statement about what S-R theories are. For the purposes of discussion, S-R theory can be portrayed as resting on three assumptions:
ASSUMPTION 1. The units of analysis are stimuli, responses, and conditioned connections between stimuli and responses.

ASSUMPTION 2. The behavior to be learned is specified by means of reinforcement of particular responses made to particular stimuli.

ASSUMPTION 3. Conditioning is governed by a simple set of uniform laws.

The first assumption has long been broadened by liberal S-R theorists (e.g., Miller, 1959) to include unobserved events such as implicit responses and internal stimuli. The implicit responses provide an acceptable means of representing internal states; the effective stimulus is a compound of the current stimulus and an implicit response representing the previous state. The effective stimulus produces both an overt response and a new implicit response representing the new state. Such a system is clearly isomorphic to the finite automaton and would be capable of the same range of behaviors. The problem for the S-R theorist is to account for how such a set of connections can be established within the confines of the second and third assumptions.

The second assumption means that every behavior should be teachable merely by indicating the correct responses to individual stimuli in some training procedure. An immediate problem is that unobserved events are involved; the correct response can depend on an effective, rather than observed, stimulus, and the correct response contains an unobserved state-representing component. The teacher must either be able to observe the supposedly unobserved events, or else have some training procedure that will work on the basis of only the observed events.

The third assumption is the one that most definitely distinguishes the spirits of S-R and cognitive theory. S-R theory has always adhered to a standard of strict parsimony in learning principles; the fundamental nature of learning is assumed to be simple and general enough to permit capturing it in the form of a few simple and general laws. Conditioning acts in a uniform way, without regard for the internal or external status of the paired events, or the role of the conditioned connection in the overall behavior.
On the other hand, cognitive theories hold either that the processes involved in learning are general, but extremely complex, or that there are no general properties of learning; perhaps the underlying processes involved in the way the subject learns a particular behavior depend in a crucial way on the sensory modalities, the type of material, the amount and nature of past experience, the nature of the task, and so forth. Hence, the most important characteristic of a good learning theory in the S-R spirit is parsimony; to the extent that we can account for learning with a simple and uniform set of conditioning-like rules, modern S-R theory is a valuable theoretical alternative. Given the importance of the third assumption, the alternative S-R models discussed here are restricted to those that have this character of simple and uniform learning principles.
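The liberalized scheme described above can be sketched as a lookup table of conditioned connections; the particular machine and its labels below are a hypothetical illustration of mine, not taken from the paper. Each effective stimulus, a (stimulus, implicit response) pair, is connected to a compound of an overt response and a new implicit response, which is exactly a finite automaton written in S-R vocabulary:

```python
# A hypothetical illustration (the machine and names are mine): conditioned
# connections from effective stimuli (s, q') to compound responses (r, q'_new).
# The example task, assumed for illustration only: answer 'n' until a 1 has
# been seen, then answer 'y' forever.
connections = {
    ('0', 'idle'): ('n', 'idle'),
    ('1', 'idle'): ('y', 'seen'),
    ('0', 'seen'): ('y', 'seen'),
    ('1', 'seen'): ('y', 'seen'),
}

def behave(stimuli, q='idle'):
    """Overt responses produced by running the conditioned connections."""
    out = []
    for s in stimuli:
        r, q = connections[(s, q)]  # effective stimulus -> compound response
        out.append(r)
    return out

print(behave('0010'))  # -> ['n', 'n', 'y', 'y']
```

The table is just a transition/output function in disguise; the open question the text raises is how such connections could be established by reinforcement when the implicit component is unobserved.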
Complete behavior mapping. The behavior of an automaton M is completely specified by a list of values of B(M, q_0, s), for all s ∈ S*. The first alternative model to be considered is the modified S-R model with Σ = S* and ρ(σ) = B(M, q_0, σ). The problem with this model is that S*, the set of all possible finite stimulus sequences, has an infinite number of members. Hence, the set of conditioning states in the S-R model is infinite; for a particular target behavior, there will always be some effective stimulus not yet conditioned to the correct response. Clearly, this model is subject to the classic criticism that an S-R model cannot produce correct behavior to a novel stimulus.

Partial behavior mapping. There is a theorem of automata theory (see Arbib, 1969b, p. 101ff.) that an automaton can be specified in terms of its behavior by a list of B(M, q_0, s), where s ∈ S[0, m), the set of all sequences with length less than m, where m depends only on the number of states in the reduced form of M. This fact suggests that a model with Σ = S[0, m) might work, since the finite set Σ together with the corresponding values of B(M, q_0, σ) contains all of the information necessary to specify M. This type of model would work for those automata whose responses can be predicted from knowledge of the last m stimuli, but there are automata without this property. Some finite automata, which can be very simple, have the property that knowledge of a stimulus arbitrarily far back in the past may be required to predict the machine's response (cf. the "infinite memory" finite automata in Gill, 1962). Hence, storage of some finite number of past stimuli does not always suffice to determine the correct response, meaning that an S-R model based on this approach would not be general.

State-reinforcement model. This model is general, but exacts a high price for its power.
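Before taking up that model, note that the failure of the partial behavior mapping admits a standard illustration (my choice of example, not the paper's): a two-state parity automaton, which answers whether an odd number of 1s has occurred so far. For any window size m, two histories can share their last m stimuli yet demand different responses, so no table indexed by S[0, m) suffices:

```python
# A standard illustration (my example, not the paper's): the parity automaton
# has two states and a one-to-one state-to-response map, yet no bounded
# window of past stimuli determines its response.

def parity_responses(stimuli):
    """r_t = (number of 1s in s_1..s_t) mod 2."""
    q, out = 0, []
    for s in stimuli:
        q ^= s          # two-state transition
        out.append(q)   # beta is the identity on {0, 1}
    return out

m = 3                    # any window size m fails the same way
a = [1] + [0] * m        # one 1, then m zeros: parity stays odd
b = [0] * (m + 1)        # all zeros: parity stays even
assert a[-m:] == b[-m:]  # last m stimuli identical...
assert parity_responses(a)[-1] != parity_responses(b)[-1]  # ...responses differ
```

Since here r_t = r_{t-1} XOR s_t, parity is itself an S-machine; the example defeats only the bounded-stimulus-history model, not the (s_t, r_{t-1}) model of Theorem 7.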
Consider the modified S-R model Z' = (Σ, R'', ρ), in which R'' = R × R', consisting of pairs of overt responses and state-representing responses in the set R', and Σ = S × R', consisting of pairs of stimuli and implicit responses. Let M = (S, Q, R, δ, β) be a finite automaton, and let h: Q → R' be a 1-1 onto map. Define ρ as

ρ(s_t, r'_{t-1}) = (β(δ(s_t, h⁻¹(r'_{t-1}))), h(δ(s_t, h⁻¹(r'_{t-1})))) = (r_t, r'_t).
From the proofs of part (a) of Theorems 6 and 7, it is clear that this S-R model will learn to produce the correct compound response to the effective stimuli. For the sake of brevity, the complete proof that Z' becomes asymptotically equivalent to M will not be given here, but will be briefly outlined, based on the above definitions and theorems. The definition of asymptotic equivalence must be changed to take into account that the Z' responses are compound, of the form (r, r'), whereas M produces only the simple response r.
The second condition of Definition 12 becomes

lim_{t→∞} P(r''_{t+1} | s') = 1 iff B(M, q_0, s') = h_R(r''),

where h_R: R'' → R, (r, r') ↦ r, simply maps a compound response to the specified overt response. Making use of a portion of the proof of Theorem 6, it suffices to show that

lim_{t→∞} P(r''_{t+1} | s') = 1
iff ρ(σ_{t+1}) = r''
iff lim_{t→∞} P(r''_{t+1} | σ_{t+1}) = 1
iff B(M, q_0, s') = h_R(r''), for some q_0.

Since Z' learns, and from the above definition of ρ, it follows that Z' is asymptotically equivalent to M. Thus, the state-reinforcement model is an S-R model that can indeed mimic any finite automaton. Notice that the model assumes that reinforcement is applied not only to the overt responses in set R, but also to the state-representing implicit responses in set R'; psychologically, this means that the learner is getting information not only about what he is to do in response to a stimulus, but also about what he is to remember until the next stimulus. In his analysis of the learning of addition, Suppes (1969a, p. 350ff.) makes use of the state-reinforcement approach by assuming that the learner verbalizes whether he is carrying a digit to the next column; the teacher corrects this verbalization as well as the actual written responses.

It is important to emphasize that although the state-reinforcement model is able to learn any automaton, it does so by virtue of the fact that the nature of the learning problem has been changed; the unobserved events have been rendered into observed events. Rather than training the model in terms of a correct r ∈ R for each s ∈ S, we are actually training the model to produce the correct r'' ∈ R × R' for each σ ∈ S × R'. It is an extremely strong claim that the learning of an arbitrary automaton requires conditioning based on reinforcement of state-representing responses; on the face of it, this claim appears to be false. The subjects in the Restle and Brown (1970) studies, involving the learning of a complex sequence of push-button responses, may have been emitting state-indicating responses as they worked, but the training procedure never reinforced such responses. Subjects were never told whether they were in the appropriate state during the sequence. Of course, if the subject finds himself making
a wrong response based on apparently correct earlier learning, he could conclude that he made an incorrect state transition at some point in the past. Likewise, if he produces correct responses consistently, he could assume that his transitions were correct. However, this delayed and incomplete feedback would require complex processing to be useful and cannot be considered as simple reinforcement. Hence, state reinforcement does not seem to be required to learn a complex behavior. On the other hand, the existence of a simple state-reinforcement model suggests that subjects may find learning especially easy when state reinforcement is available.

Sample and trial models. The state-reinforcement model is based on reinforcements of both explicit and implicit responses. A good question is under what conditions a model in the S-R spirit could learn any automaton when provided with reinforcement defined only in terms of the observed stimuli and responses. Clearly, the conditioning process will be more complex in this case; with enough complexity, a conditioning model would amount to a cognitive model. What remains to be explored is a great variety of relatively simple learning assumptions that involve a process of sampling and trial of candidate automata. The simplest model of this sort would sample a particular finite automaton from the set of all possible finite automata (a denumerably infinite set), according to some scheme such as in order from the simplest to the most complex, and use the sampled automaton to produce the responses. Upon an error, the system resamples; eventually a correct automaton will be sampled (cf. Gold, 1967). Another example of a sampling and trial model is the "simulated evolution" model of Fogel, Owens, and Walsh (1966). An initial automaton is randomly mutated to produce a set of offspring automata which are tested in the task environment.
The one producing the best performance is the fittest, which survives to produce the next generation of automata for trial. A model more in accord with S-R theory would construct an automaton by means of conditioning individual connections between stimuli, implicit responses, and overt responses. Suppose that on each trial, the model chooses a state transition and an overt response, either at random, or according to preexisting connections. If the overt response is correct, the connections representing the transition and response production are conditioned, if not already present. If the response is incorrect, all of the established connections are removed and the model reverts to its original learning state. This assumption is clearly absurd; however, it is well known that under certain conditions, such zero-memory assumptions can work fairly well, as in the early concept-identification models (Restle, 1962). Consider the same model, only with a more realistic assumption on the effect of errors: Upon making an incorrect response, the model removes the preexisting state transition and response connections, if present, that were used on that trial. The model removes one set of connections at a time, thus allowing the piecemeal removal
of incorrect earlier learning. It is not clear that this model would always work; in order to ensure learning, the model would have to be able to arrive at a set of correct connections from any set of connections that it might acquire in the course of learning. Whether the model would always achieve a correct set, or get trapped in some loop, would depend on the nature of the training sequence in some intricate way. The preceding brief examination shows that there is much work to be done in exploring various simple rules governing the installation and removal of connections and characterizing the companion training sequences that will ensure successful learning.

CONCLUDING SUMMARY
Suppes' (1969a) result that S-R models can produce any behavior obtainable from finite automata has been shown to be limited to those finite automata that can be viewed as S-machines, that is, have a one-to-one state-to-response map. Hence, the general statement that cognitive structures can be acquired by means of conditioning operations based on the observed stimuli and responses does not follow from Suppes' result. However, it is the case that any automaton can be constructed by an extended S-R model if auxiliary state-indicating responses and corresponding reinforcements are available. Furthermore, it is possible that there exist plausible alternative learning models compatible with the S-R spirit that can learn any automaton given only reinforcements of the observed responses. Thus, the general question of the existence of simple general models for learning that are based on conditioning-like principles is still open. Since a connection principle of the conditioning or associative variety currently stands as one of the few candidates for simple and general principles underlying learning, further exploration of the formal nature of such principles would be a valuable addition to the theory of learning.

REFERENCES

ARBIB, M. A. Memory limitations of stimulus-response models. Psychological Review, 1969, 76, 507-510. (a)
ARBIB, M. A. Theories of abstract automata. Englewood Cliffs, N.J.: Prentice-Hall, 1969. (b)
ATKINSON, R. C., BOWER, G. H., & CROTHERS, E. J. An introduction to mathematical learning theory. New York: Wiley, 1965.
FOGEL, L. J., OWENS, A. J., & WALSH, M. J. Artificial intelligence through simulated evolution. New York: Wiley, 1966.
GILL, A. Introduction to the theory of finite-state machines. New York: McGraw-Hill, 1962.
GOLD, M. Language identification in the limit. Information and Control, 1967, 10, 447-474.
MILLER, N. E. Liberalization of basic S-R concepts: Extensions to conflict behavior, motivation, and social learning. In S. Koch (Ed.), Psychology: A study of a science. Vol. 2. New York: McGraw-Hill, 1959.
MINSKY, M. L. Computation: Finite and infinite machines. Englewood Cliffs, N.J.: Prentice-Hall, 1967.
RESTLE, F. The selection of strategies in cue learning. Psychological Review, 1962, 69, 329-343.
RESTLE, F., & BROWN, E. Organization of serial pattern learning. In G. H. Bower (Ed.), The psychology of learning and motivation. Vol. 4. New York: Academic Press, 1970. Pp. 249-331.
SHANNON, C. E., & MCCARTHY, J. (Eds.). Automata studies. Annals of Mathematics Studies, No. 34. Princeton, N.J.: Princeton University Press, 1956.
SUPPES, P. Stimulus-response theory of finite automata. Journal of Mathematical Psychology, 1969, 6, 327-355. (a)
SUPPES, P. Stimulus-response theories of automata and TOTE hierarchies: A reply to Arbib. Psychological Review, 1969, 76, 511-514. (b)
RECEIVED: September 17, 1973