A mathematical approach to semantic network development


Bulletin of Mathematical Biology, Vol. 47, No. 5, pp. 629-650, 1985.
Printed in Great Britain.
0092-8240/85 $3.00 + 0.00
Pergamon Press Ltd.
© 1985 Society for Mathematical Biology

A MATHEMATICAL APPROACH TO SEMANTIC NETWORK DEVELOPMENT

SAYURI MASUI
Department of Physics, University of Toronto, Toronto, Ontario, Canada

CHARLES J. LUMSDEN*
Department of Medicine, Room 7313, Medical Sciences Building, University of Toronto, Toronto, Ontario, Canada M5S 1A8

Beginning from the notion of semantic network structure, we develop a quantitative description of how much can be learned by an animal whose developmental programme is a set of co-acting epigenetic rules. In the model considered, the activity of the rules regulates the size, connectivity and innate information content of the semantic network. The network itself is associated with a behavioral repertoire. The modeling approach shows how to begin accounting for the effects of both genetic and environmental information, in a manner that quantifies the roles of specific epigenetic rules for psychological development. In previous models the Shannon-Weaver information content $I$ of a semantic network follows power laws $I \propto N^{\xi}$, with $N$ the number of interrelated concept elements in the network and $\xi$ a scaling exponent labeling a universality class of semantic networks. Our calculations provide evidence that epigenetic rules of the type considered, involving both innate and learned semantic network components, sustain a new universality class for which $\xi = 1$.

1. Introduction. As recently as the early 1950s, the idea of dealing in a constructive theoretical way with memory representations and the cognitive processes acting on them was not widely held. The models of the classical association theory of learning and memory which prevailed were still confined almost entirely to the idea of simple one-to-one connections between elements of learned information, although connections could vary in strength. Thus, the models had the logical form of an “ensemble of pairwise connections between stimuli and responses, representations of items, or homogeneous abstract elements” (Estes, 1982). This theoretical structure was not promising as a complete model for interpreting the complex organizational aspects of human memory, but was, after some extension, applicable to certain specialized cognitive tasks, such as list memorization.

* To whom correspondence should be addressed.


Integral to improved theories of the representation of knowledge in the brain is a particular class of information-encoding structures called semantic networks. In this paper we discuss a formal approach to semantic networks that is based on a mathematical treatment of processes regulating their formation. We proceed (Section 3) to develop a specific model of semantic network development, followed by a solution of the model and (Section 4) an examination of the limitations and weaknesses of our approach. Despite obvious limitations, we find that our approach offers a promising way to quantify overall properties of semantic networks, such as their storage capacity. In particular, it provides a convenient and somewhat novel means of arriving at analytical expressions for network development, a problem that is difficult to handle with existing methods. In the present model, the joint effects of nature and nurture are represented in a way that evaluates their effect on network structure.

The calculations we present suggest that innate factors, when present, may exert a strong, specific biasing action on the network designs achievable through learning. Different factors appear to create different universality classes of networks, between which overall properties such as information content bear qualitatively different relationships to underlying parameters such as network dimensionality and connectivity. We find evidence that the semantic networks formulated in the present work constitute a new universality class of knowledge structures. In order to define the basic terminology and place our results in context, background on semantic networks is briefly summarized in the following section.

2. Semantic Structures: From Chains to Networks. Tulving (1972) distinguished between episodic memory and semantic memory.
Episodic memory and semantic memory are intimately related, since much of semantic memory develops from information stored in episodic memory. There exists no sharp boundary between the two memory subsystems. While episodic memory preserves the class of information which deals with temporal and spatial features of the events constituting an individual’s experience, semantic memory embodies an individual’s general conceptual information (for example, definitions, knowledge about the world and linguistic abilities). Thus semantic memory is concerned in part with the ability to construct abstract internal representations of the external world.

The earliest models of the way knowledge is stored in semantic memory envisaged Estes’ ensemble of pairwise connections. The extended model, called the chain association model, used directional associations between successive items experienced to form the memory structure. It was thought that once these directional associations were established, seeing or hearing the name of one item on, for example, a word list, would function as the stimulus to recall the next items on the list as a response [e.g. review and commentary in Estes (1983)]. The fact that the memory of list structures did not disintegrate completely if one item was omitted during recall led to the idea that weak associations existed between non-adjacent items. Furthermore, the fact that backward recall, the elicitation of a stimulus item by giving the response item as a cue, could occur necessitated the incorporation of the idea of reverse associations. However, the greatest single factor in favour of memory structures more complex than chains was the study of free recall, which indicated that the structure of semantic memory had some type of hierarchical or netlike organization.

An influential network model of human semantic memory based on these caveats was developed by Quillian (1969) in a computer simulation program called the Teachable Language Comprehender (TLC). Quillian defined comprehension as the relating of new input information to information previously stored in memory as general world knowledge. The knowledge structure chosen was a hierarchical network of nodes representing concepts, which were meaningful components in semantic memory, interconnected by directed links labeled to indicate the relationship between the connected concepts.

Whereas the chain association model may be regarded as a linear assemblage of linked conceptual elements representing items of, say, a list, a semantic network is more readily envisaged as a multi-dimensional crosslinked assemblage of concept nodes (see Figs 1-4). We note that chemical polymers may be classified as linear, branched, or multi-dimensional, depending on their structure (Volkenstein, 1963; Odian, 1981) and a similar taxonomy can be conveniently applied to semantic networks (Fig. 5).
In Quillian’s hierarchical structure, the first link of a concept node is connected to a node representing its immediate superordinate category. Other links run to nodes representing properties or attributes that apply to the concept. Such a structure can represent the details of extensive and complex knowledge, with features of cognitive economy and inferential capacity. As a concrete example, consider the directed graph representation of a portion of semantic network shown in Fig. 1. The links are directed and labeled to indicate the exact relationship between the concepts that they connect. The links can be of various kinds: isa (= ‘is an instance of’), applies-to, has-as-part, sensory image, motor-control image, etc. Note that in the graph all nodes are labeled by words corresponding to the concept, but they represent the concept rather than the word. The semantic network includes sensory information and motor control information that provide reference to perceptual experiences and real actions, thus adding specificity to the semantic components in the memory. In this way, the circularity of defining concepts in terms of others (as found in a dictionary) is avoided.
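As an illustration of the hierarchical structure just described, the following minimal Python sketch stores a few labeled, directed links and retrieves a property by climbing isa links. The node names, the properties and the link label `has-property` are hypothetical choices made for this example; they are not read off Fig. 1, and the code is not Quillian's TLC program.

```python
from typing import List, Optional, Tuple

# Toy directed, labeled semantic network: node -> list of (label, target).
# All entries are hypothetical examples chosen for illustration only.
NETWORK = {
    "Canary": [("isa", "Bird"), ("has-property", "yellow")],
    "Ostrich": [("isa", "Bird"), ("has-property", "cannot fly")],
    "Bird": [("isa", "Animal"), ("has-as-part", "wings")],
    "Animal": [("has-property", "breathes")],
}


def superordinate(node: str) -> Optional[str]:
    """Return the immediate superordinate category reached by the isa link."""
    for label, target in NETWORK.get(node, []):
        if label == "isa":
            return target
    return None


def lookup(node: str, label: str) -> List[Tuple[str, int]]:
    """Collect (value, isa_steps) pairs for a link label, climbing isa links.

    Under a strong theory of cognitive economy a property is stored once,
    at the most general node it applies to, and is recovered by inference.
    """
    found: List[Tuple[str, int]] = []
    steps = 0
    current: Optional[str] = node
    while current is not None:
        for lab, target in NETWORK.get(current, []):
            if lab == label:
                found.append((target, steps))
        current = superordinate(current)
        steps += 1
    return found


if __name__ == "__main__":
    print(lookup("Canary", "has-property"))
    print(lookup("Canary", "has-as-part"))
```

The number of isa steps returned with each value is a crude stand-in for the retrieval distance that reaction-time experiments attempt to measure.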

Figure 1. An example of a semantic network, drawn according to a strong theory of cognitive economy.

Figure 2. An assemblage of stimulus-response pairs comprising a primitive associative memory (see Section 1).

Figure 3. The chain-association model for semantic network structure corresponding to the memorization of a list of words. Input list: ‘apple’, ‘frog’, ‘tomato’, ‘card’, ‘dog’, ‘red’, ‘life’, ‘two’, ‘king’, ‘tree’ . . . .

Figure 4. A semantic network as a cross-linked assemblage of nodes in two dimensions.




Figure 5. Types of linear, branched, and crosslinked semantic networks, based on polymer analogs (after Odian, 1981).

Quillian proposed a principle of spreading activation for these networks: a node representing a given concept is activated when a person sees, hears, reads or thinks about the concept. Activating one node activates the nodes adjacent to it, and these in turn activate other nodes. The activation decreases with both time and distance and takes place as long as the activation level remains above some threshold. In this way, the activation eventually spreads throughout whole segments of the semantic network.

Quillian’s spreading activation hypothesis was tested by Collins and Loftus (1975) and a number of other investigators, most of whom used speeded verification tasks.* The experimental evidence indicated that important modifications of the model were required. A concept of semantic distance (that is, the degree of relatedness of two concepts) needed to be incorporated into the semantic network model. ‘Ostrich’ and ‘Canary’ are both instances of ‘Bird’, but ‘Canary’ is considered by many subjects to be more prototypical of ‘Bird’, and thus is semantically closer to ‘Bird’ than ‘Ostrich’. Studies by Schaeffer and Wallace (1969, 1970) and by Wilkins (1971) showed that the degree of semantic relatedness controls the reaction time required to compare or categorize concepts. Moreover, Conrad (1972) provided data which showed that propositions and properties can be stored at more than one node in the semantic network, but that they are probably not stored at every instance of a category to which the property applies. The model of semantic memory proposed by Collins and Loftus (1975) was a revision of Quillian’s model that took these two characteristics of real human memories into account. The revised model did not assume a strict hierarchical structure as did Quillian’s original model, but a completely connected network structure in which all the concept nodes were directly or indirectly linked.
Thus Collins and Loftus adopted a weak theory of cognitive economy in preference to Quillian’s strong theory of cognitive economy. In a strong theory of cognitive economy, a concept would be linked directly to its superordinate category by means of an isa relation, and properties and propositions applying to all the members of a given class would be stored with the class concept node and not with any of the nodes representing the class members. Although a strong theory would make memory structure as simple as possible, it could also increase processing time during retrieval, since frequently used associational pathways such as ‘Dog’ to ‘Animal’ would have to go through less frequently used nodes such as ‘Canine’ and ‘Mammal’. The weak theory allows for some direct connections such as ‘Dog’ to ‘Animal’ for frequently co-elicited concepts, and thus tries to strike a balance between a simple representation of knowledge and a shorter processing time.

* In this method, the subject is asked various questions such as “Is a canary a bird?”, the correct answering of which involves the use of personal knowledge of concepts and categories in the semantic network. The reaction (or response) time gives an indication of the distance between nodes in the semantic network and of the various information-processing operations involved in the retrieval of the required information.

In terms of semantic networks, a theory of psychological development is a theory of network formation. Recent data (for reviews see, e.g., Lumsden and Wilson, 1981, 1983; Lumsden, 1983) indicate that information encoded in the genome may place constraints on the way semantic networks grow. While some of these constraints may be rather general stipulations about overall network size or network complexity (e.g., Lumsden and Wilson, 1981, pp. 333-341), others, such as linguistic and sexual preference constraints, appear to be highly specific (Shepher, 1971; Chomsky, 1980; Keil, 1981). There has been some initial exploration of constrained semantic network growth using grammar-theoretic and algorithmic simulation techniques (Mayoh, 1974; Nagl, 1976; Savitch, 1976; Wexler and Culicover, 1980; Berwick, 1983). However, few analytical results have thus far been documented and there is a pressing need for tractable models.

The study of semantic network models, both kinematic and dynamic, has diversified since the basic work of Quillian (1969), Collins and Loftus (1975), Anderson and Bower (1973) and their colleagues (for reviews see Lindsay and Norman, 1977; Wickelgren, 1979; Wilson, 1980; Medin and Smith, 1984). Concepts of node, link, semantic distance and spreading activation have, however, been widely adopted by theoreticians in this area. These ideas provide us with sufficient background to develop useful connections between semantic network theory and our modeling approach.
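The spreading activation principle summarized in this section can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions: the link table, the multiplicative `decay` per link and the `threshold` value are hypothetical, decay with time is ignored, and only decay with distance is modeled. The direct ‘Dog’ to ‘Animal’ link represents the shortcut permitted by a weak theory of cognitive economy.

```python
from collections import deque
from typing import Dict

# Hypothetical undirected links. The direct Dog-Animal shortcut coexists
# with the longer Dog-Canine-Mammal-Animal chain (weak cognitive economy).
LINKS = {
    "Dog": ["Canine", "Animal"],
    "Canine": ["Dog", "Mammal"],
    "Mammal": ["Canine", "Animal"],
    "Animal": ["Mammal", "Dog", "Bird"],
    "Bird": ["Animal", "Canary"],
    "Canary": ["Bird"],
}


def spread_activation(source: str, decay: float = 0.5,
                      threshold: float = 0.2) -> Dict[str, float]:
    """Breadth-first spread of activation from `source`.

    Each link multiplies the activation by `decay`; a node passes activation
    on only while the passed-on level stays at or above `threshold`.
    """
    activation = {source: 1.0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        passed_on = activation[node] * decay
        if passed_on < threshold:
            continue  # activation has faded below threshold: stop here
        for neighbour in LINKS.get(node, []):
            # Breadth-first order means the first visit uses a shortest path.
            if neighbour not in activation:
                activation[neighbour] = passed_on
                queue.append(neighbour)
    return activation


if __name__ == "__main__":
    for concept, level in spread_activation("Dog").items():
        print(f"{concept}: {level}")
```

With these assumed parameters, activation starting at ‘Dog’ reaches ‘Animal’ in one step through the shortcut and fades out before reaching ‘Canary’.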

3. A Two-dimensional Semantic Lattice. In this model, concepts exist within a knowledge space in which the points represent concepts that can become part of a semantic network. The distance between two points is a measure of the semantic distance between the two concepts they represent (Section 2). The space is considered to be quasi-discrete in the sense that the allowed points form a lattice (in $d$ dimensions), but the density of the points may be arbitrarily large. In the simplest case $d = 2$ and the allowed points form a two-dimensional net as in conventional semantic network representations (Figs 1 and 4). The semantic network is to consist of a total of $N$ concepts represented by points in the knowledge space, $n_0$ of which are fixed in the space by information encoded in the genome, and $N - n_0$ of which are variable because they must be learned. In this way we are able to represent joint contributions of nature and nurture to the conceptual content of the network. Its development obeys the epigenetic rule


R1: Number of innate concepts (fixed nodes of the semantic lattice) $= n_0$. Number of learned concepts (variable nodes of the semantic lattice) $= N - n_0$.

We will be interested in the development of relatively complex networks, in which $N$ and $n_0$ are $\gg 1$. The fixed points are assumed to be uniformly distributed in the knowledge space:

R2: Spacing between fixed nodes $= m$ units in both lattice directions, laid out in a square array.

Although the $N - n_0$ concepts to be learned are variable, they are subject to the epigenetic rule

R3: The spreading activation distance $l$ linking fixed nodes through learned concepts is a constant. Each fixed node is so linked to at least one other fixed node.

Rule R3 is represented in the model by means of node-link chains that lie on the points representing the concepts learned. Thus, $N - n_0$ learned concepts are contained in chains (activation pathways) of length $l$ connecting the fixed nodes. For convenience we scale the unit of spreading activation distance between adjacent concept nodes in the lattice to be equal to 1 unit (1 ‘quillian’). The number $n_L$ of learned concepts in a chain of length $l$ quillians is then $l - 1$.

There are a number of cases to be considered: $0 < l < m$, $m \le l < 2m$, $2m \le l < 3m$, and so on. For $0 \le l < m$ the chains cannot span the distance between fixed nodes and hence can be fixed only at one end. In this model such configurations are forbidden, since chains originating from fixed nodes are required to eventually connect to other fixed nodes in the innate repertoire (so $n_0 = 0$ or $1$ are forbidden, and in general $n_0 > 1$). For $m < l < 2m$, chains connect nearest and next-nearest neighbors. For $2m < l < 3m$, connections to many other neighbors become possible. Thus as the semantic distance $l$ becomes greater, the multiplicity of connections that may occur increases. The problem of evaluation also becomes more complex.
To make the model tractable with the methods at our disposal we introduce the constraint that each chain of length $l > m$ join two nearest-neighbor fixed nodes:

R4: Spreading activation pathways join nearest-neighbor fixed nodes.

The associational structure of the network is therefore one in which conceptual linkage between very different fixed nodes (say, a sensory processing scheme for mate recognition and a motor schema for vaginal contractions) is built from connections with fixed nodes of the basic repertoire possessing an intermediate degree of similarity (say, from mate recognition to courtship pattern to copulatory behavior to vaginal contractions). In its use of indirect links R4 reflects a strong theory of cognitive economy (Section 2).

At this point we need to choose the number of concept chains linked to any given fixed node. A survey of semantic networks published in the literature (e.g., Lindsay and Norman, 1977; Brainerd, 1983; Sowa, 1984) indicates that, on the average, somewhere between two and four links come into a node. To allow for complex configurations the maximum number connected to each fixed node is chosen to be equal to four:

R5: Each internal fixed node in the square array of $n_0$ fixed nodes is linked to four other fixed nodes. Each fixed node on the boundary of the array is linked to two other fixed nodes if it is a corner node and three other fixed nodes if it is an edge node. Each learned concept is linked to two other nodes, forming a segment of the activation pathway between two fixed nodes.

The structure of the semantic network generated under these conditions can be called a square concept lattice (see Fig. 6). Semantic networks with greater connectivity than allowed in this knowledge space can be treated similarly, with generalization of the methods used here to higher dimensions. We will return to this below.

Although highly simplified compared to a real semantic network (e.g., Fig. 1), the model has several useful features. A theory of cognitive economy is incorporated. Moreover, both hereditary and environmental factors are considered to influence the contents and function of the network. Certain information essential to survival and reproduction, such as fixed action patterns involved in mating, escape, orientation or prey retrieval, is essentially hardwired and is represented by fixed nodes. Activation recalls these nodes in a specific order. In addition to the hardwired content there is learnable information, which could in principle be extensive relative to the hardwired content ($n_0/N \ll$

Figure 6. A two-dimensional square concept lattice: fixed node spacing $m = 7$, number of fixed concepts $n_0 = 9$. The fixed concepts are indicated by heavy dots.
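The layout of fixed nodes in Fig. 6 can be reproduced with a short Python sketch. It places $n_0$ fixed nodes on a square array with spacing $m$ (rule R2), joins nearest neighbors (rule R4) and tallies how many fixed nodes are linked to two, three or four others (rule R5). The function names are hypothetical, and the sketch assumes that $n_0$ is a perfect square; it does not lay down the learned nodes along each chain.

```python
import math
from typing import Dict, List, Tuple

Node = Tuple[int, int]


def fixed_node_lattice(n0: int, m: int) -> Tuple[List[Node], List[Tuple[Node, Node]]]:
    """Place n0 fixed nodes on a square array with spacing m (rule R2) and
    list the nearest-neighbour pairs joined by chains (rule R4)."""
    side = math.isqrt(n0)
    if side * side != n0 or side < 2:
        raise ValueError("this sketch assumes n0 is a perfect square >= 4")
    nodes = [(i * m, j * m) for i in range(side) for j in range(side)]
    last = (side - 1) * m
    chains = []
    for (x, y) in nodes:
        if x + m <= last:  # chain to the neighbour in the +x direction
            chains.append(((x, y), (x + m, y)))
        if y + m <= last:  # chain to the neighbour in the +y direction
            chains.append(((x, y), (x, y + m)))
    return nodes, chains


def degree_counts(nodes: List[Node], chains: List[Tuple[Node, Node]]) -> Dict[int, int]:
    """Count how many fixed nodes are linked to 2, 3 or 4 other fixed nodes."""
    degree = {node: 0 for node in nodes}
    for a, b in chains:
        degree[a] += 1
        degree[b] += 1
    counts: Dict[int, int] = {}
    for d in degree.values():
        counts[d] = counts.get(d, 0) + 1
    return counts


if __name__ == "__main__":
    nodes, chains = fixed_node_lattice(n0=9, m=7)  # parameters of Fig. 6
    print(len(nodes), "fixed nodes,", len(chains), "chains")
    print(degree_counts(nodes, chains))
```

For the parameters of Fig. 6 this gives four corner nodes, four edge nodes and one internal node, in agreement with the linkage pattern stated in R5.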

The rationale for considering the effects of such a constraint is as follows. Taken singly, the fixed nodes account for no more than fragments of innate behavior. If assembled they could form a basic repertoire of adaptive behavior for the organism, modified and refined by learning. The stipulation of connectivity assures that each fixed node is a part of the basic network. The bound on semantic distance $l$ guarantees that activation will have a predetermined probability of spreading between the fixed nodes and activating the elements of basic repertoire. Allowing $l$ to be arbitrary would mean that fixed nodes could become isolated from one another as the learned information joining them grows increasingly complex. Such nodes would in a practical sense be deleted from the basic adaptive repertoire.

The network therefore models in a crude way a knowledge system in which learnable information augments innately determined elements. Learned knowledge is essential to the network because the fixed nodes are not directly linked. They require bonding through intermediate, learned nodes. Since any two nodes in the lattice can be joined by a large variety of pathways, each representing a different chain of semantic connotation, the model allows for diversity in semantic representation as well as in the actual elements of fixed and learned information.

It is true, moreover, that the detailed structure is important in understanding the properties of semantic networks, just as the complete structure and properties of the individual components of physical polymers are necessary in the theory of polymers to explain their individual properties. However, it is also the case that the common features of physical polymers (i.e. their macromolecular state and their chain structure) make possible their general theoretical consideration. Similarly, it is possible that the common organizational properties of semantic networks, namely their massive network structure of nodes and links, make it possible to consider a theory of semantic networks in a general way. The prospect is that the detailed microscopic structure and properties can be set aside through a kinetic theory of semantic networks that emphasizes their overall macroscopic properties. If feasible, such an approach would complement existing simulation methods, which emphasize the detailed node-by-node properties of semantic networks (Quillian, 1969; Mayoh, 1974; Nagl, 1976; Savitch, 1976; Berwick, 1983). This is the strategy we are following here.

Each configuration of the system of node-link chains corresponds to a possible configuration of the semantic network. The formal problem is then to determine, for $n_0$ fixed concepts and a specified activation distance $l$, the number of possible semantic network configurations, given rules R1-R5. In other words, how much information in the Shannon-Weaver sense must the learning experience provide in order to specify one particular network configuration out of all that are possible? Using standard definitions from information theory (Brillouin, 1962), the amount of information (in bits) is

$$I = \log_2 P \tag{1}$$

where $P$ is the number of possible configurations of the network. The problem of evaluating equation (1) given epigenetic rules $R_j$ has previously been considered for networks that are shaped by much less specific constraints on their development. Bremermann (1963) and Lumsden and Wilson (1981, pp. 333-341) considered the class of networks governed by the epigenetic rules

Ep1: $n_0 = 0$, $N - n_0 = N \gg 1$.

Ep2: Each node is connected to any of $K$ others by $K$ distinct relations.

finding that

$$I \simeq N\left[(N-1)\log_2\frac{N-1}{e} - (N-1-K)\log_2\frac{N-1-K}{e}\right]. \tag{2}$$

They also evaluated I in the class of knowledge structures composed of true/false propositions containing a total of N words in sentences of K words obeying the epigenetic rules


P1: Total number of words $= N$

P2: Number of words per sentence $= K$

for which it was established that

$$I \simeq N^{K}. \tag{3}$$
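To give a feeling for the magnitudes involved in equations (2) and (3), the following Python sketch evaluates both. It rests on an assumption that is not stated explicitly in the text: that equation (2) is the Stirling form of $\log_2 P$ with $P = [(N-1)!/(N-1-K)!]^N$, i.e. each of the $N$ nodes chooses an ordered set of $K$ partners among the other $N - 1$. This count reproduces equation (2) when $\ln n! \simeq n\ln(n/e)$ is used. Function names are hypothetical.

```python
import math


def info_relational_exact(N: int, K: int) -> float:
    """log2 of the assumed count P = [(N-1)!/(N-1-K)!]^N, in bits."""
    # math.lgamma(n) equals ln((n-1)!), so this is N * ln[(N-1)!/(N-1-K)!].
    ln_P = N * (math.lgamma(N) - math.lgamma(N - K))
    return ln_P / math.log(2)


def info_relational_stirling(N: int, K: int) -> float:
    """Equation (2) as written, i.e. using ln(n!) ~ n ln(n/e)."""
    a, b = N - 1, N - 1 - K
    return N * (a * math.log2(a / math.e) - b * math.log2(b / math.e))


def info_propositional(N: int, K: int) -> float:
    """Equation (3): I ~ N**K bits."""
    return float(N) ** K


if __name__ == "__main__":
    N, K = 1000, 10
    print("relational, exact count :", info_relational_exact(N, K))
    print("relational, equation (2):", info_relational_stirling(N, K))
    print("propositional, eq. (3)  :", info_propositional(N, K))
```

For $N = 1000$ and $K = 10$ the two relational values agree closely, while the propositional class of equation (3) is larger by many orders of magnitude.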

The rule sets $S_2 \equiv \{\mathrm{Ep1}, \mathrm{Ep2}\}$ and $S_3 \equiv \{\mathrm{P1}, \mathrm{P2}\}$ define universality classes of semantic networks in which $I$ varies as $N^{\xi}$, where the exponent $\xi \simeq 2$ for $S_2$ and $\xi \simeq K$ for $S_3$. The dependence of $I$ on the network parameters $N$, $l$ and $m$ for the class of knowledge structures defined by $S_1 \equiv \{\mathrm{R1}, \mathrm{R2}, \mathrm{R3}, \mathrm{R4}, \mathrm{R5}\}$ will therefore be of particular interest.

We now proceed to evaluate the total number $P$ of possible configurations of a semantic network consisting of a total of $N$ concepts with $n_0$ fixed nodes and $N - n_0$ nodes to be learned distributed among the chains. For the purposes of enumeration we can consider a typical fixed node on the concept lattice as the source of two chains that proceed outward and as the sink of two incoming chains. If every fixed node is the source of two chains, then the total number of chains is $2n_0$. But this clearly overcounts the contributions of the nodes along two of the four edges of the array. These can be the source of at most one chain each. For large $n_0$ each edge in the square array contains to a good approximation $\sqrt{n_0}$ fixed nodes, for which the estimate of two chains overcounts by one chain per node per edge. With this correction the total number of chains in the semantic network then follows at once as

$$2n_0 - 2\sqrt{n_0} = 2(n_0 - \sqrt{n_0}). \tag{4}$$

Therefore, the number of learned concepts per chain of length $l$ quillians is

$$l - 1 = \frac{N - n_0}{2(n_0 - \sqrt{n_0})}. \tag{5}$$
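Equations (4) and (5) are easy to check numerically. The Python sketch below compares the chain count $2(n_0 - \sqrt{n_0})$ with a direct count of nearest-neighbor pairs on a square array, and then evaluates the number of learned concepts per chain. The function names are hypothetical, and the direct count assumes that $n_0$ is a perfect square.

```python
import math


def chain_count_formula(n0: int) -> float:
    """Equation (4): total number of chains, 2 (n0 - sqrt(n0))."""
    return 2 * (n0 - math.sqrt(n0))


def chain_count_direct(side: int) -> int:
    """Nearest-neighbour pairs on a side x side array of fixed nodes."""
    horizontal = side * (side - 1)  # (side - 1) pairs in each of `side` rows
    vertical = side * (side - 1)    # the same again for the columns
    return horizontal + vertical


def learned_per_chain(N: int, n0: int) -> float:
    """Equation (5): n_L = l - 1 = (N - n0) / [2 (n0 - sqrt(n0))]."""
    return (N - n0) / chain_count_formula(n0)


if __name__ == "__main__":
    # Fixed-node array of Fig. 6 (n0 = 9); N = 117 is a hypothetical total
    # chosen so that each chain holds a whole number of learned concepts.
    print(chain_count_formula(9), chain_count_direct(3))
    print(learned_per_chain(117, 9))
```

For a perfect-square array the two counts coincide, so the correction described above is exact in that case. With $n_0 = 9$ and the hypothetical total $N = 117$, each of the 12 chains carries 9 learned concepts, i.e. $l = 10$ quillians.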

For each of these chains we need to determine the number of ways a chain of length $l$ can be laid down between the two fixed nodes forming its ends. These nodes lie a distance $m$ apart in the $x$ or $y$ direction in the semantic lattice space. Let us define a coordinate system in which a representative fixed node is located at the origin (Fig. 7). Two node-link chains begin at the origin and terminate at $P_1 = (m, 0)$ and $P_2 = (0, m)$, respectively. We now calculate the number of configurations of length $l$ with ends fixed at $O = (0, 0)$ and $P_2 = (0, m)$. In order to proceed it is helpful to define several quantities. Let

$S_x \equiv$ the total number of links oriented in the $x$ direction (both positive and negative)

$S_{x+} \equiv$ the number of links oriented in the positive $x$ direction

$S_{x-} \equiv$ the number of links oriented in the negative $x$ direction

and let $S_y$, $S_{y+}$ and $S_{y-}$ be similar quantities for the $y$ direction. For the $y$ direction, there is net displacement $m$. Hence

$$S_{y+} + S_{y-} = S_y, \qquad S_{y+} - S_{y-} = m, \tag{6}$$

or equivalently

$$S_{y+} = \tfrac{1}{2}(S_y + m), \qquad S_{y-} = \tfrac{1}{2}(S_y - m). \tag{7}$$

Similarly for the $x$ direction

$$S_{x+} + S_{x-} = S_x, \qquad S_{x+} - S_{x-} = 0, \tag{8}$$

since there is no net displacement in the $x$ direction in reaching $P_2$. Thus

$$S_{x+} = S_{x-} = S_x/2. \tag{9}$$

Figure 7. An x-y coordinate system for a representative fixed concept, with $O = (0, 0)$ and $P_1 = (m, 0)$ marked. For combinatorial evaluation two chains are considered outgoing to nearest-neighbor fixed concepts.


Although we can conceive of concept lattices corresponding to real semantic networks in which growth takes place in preferred directions in the knowledge space (Lumsden and Wilson, 1981), our basic two-dimensional square lattice semantic network grows at the same rate in all directions. Accordingly, equal a priori probabilities hold for a step in either of the allowed directions and we have

$$S_x = S_y = \tfrac{1}{2}l. \tag{10}$$

An expression for the number of configurations of a chain of length $l$ with ends fixed at $O$ and $P_2$ can now be written as a product of the number of ways of obtaining a net displacement $m$ in the $y$ direction and the number of ways of obtaining no net displacement in the $x$ direction:

$$\frac{S_y!}{S_{y+}!\,S_{y-}!}\cdot\frac{S_x!}{S_{x+}!\,S_{x-}!} = \frac{\left[(l/2)!\right]^2}{\left[\tfrac{1}{2}(l/2+m)\right]!\,\left[\tfrac{1}{2}(l/2-m)\right]!\,\left[(l/4)!\right]^2} \equiv A(m, l) \tag{11}$$

where we have used equations (7), (9) and (10). The number of configurations of a chain of length $l$ with ends fixed at $O$ and $P_1$ is also given by equation (11), since in this case the indices $x$ and $y$ need merely be interchanged. Thus for each case there are $A(m, l)$ possible configurations. Recalling that there are $2(n_0 - \sqrt{n_0})$ chains running among the fixed nodes, we see that the total number of configurations of the two-dimensional square lattice semantic network is

$$P = [A(m, l)]^{2(n_0 - \sqrt{n_0})} \tag{12}$$

with the information content, from equation (1), of

$$I = \log_2 P = 2(n_0 - \sqrt{n_0})\log_2 A(m, l). \tag{13}$$
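The counting behind equations (11)-(13) can be checked for small chains. The Python sketch below evaluates $A(m, l)$ from binomial coefficients, verifies each factor against a brute-force enumeration of step sequences, and then computes $I$ from equation (13). The function names are hypothetical. The sketch assumes that $l$ is a multiple of 4 and $m$ is even, so that all step counts in equations (7), (9) and (10) are integers, and that $n_0$ is a perfect square.

```python
import math
from itertools import product


def count_walks_1d(steps: int, displacement: int) -> int:
    """Brute force: number of +/-1 step sequences with a given net displacement."""
    return sum(1 for seq in product((1, -1), repeat=steps)
               if sum(seq) == displacement)


def chain_configurations(m: int, l: int) -> int:
    """A(m, l) of equation (11), with S_x = S_y = l/2 from equation (10)."""
    if l % 4 != 0 or m % 2 != 0 or not 0 <= m <= l // 2:
        raise ValueError("sketch assumes l divisible by 4, m even, 0 <= m <= l/2")
    s = l // 2                 # S_x = S_y
    s_y_plus = (s + m) // 2    # equation (7)
    s_x_plus = s // 2          # equation (9)
    return math.comb(s, s_y_plus) * math.comb(s, s_x_plus)


def information_bits(n0: int, m: int, l: int) -> float:
    """Equation (13): I = 2 (n0 - sqrt(n0)) log2 A(m, l), n0 a perfect square."""
    chains = 2 * (n0 - math.isqrt(n0))
    return chains * math.log2(chain_configurations(m, l))


if __name__ == "__main__":
    m, l = 2, 12
    brute = count_walks_1d(l // 2, m) * count_walks_1d(l // 2, 0)
    print(chain_configurations(m, l), brute)
    print(information_bits(9, m, l))
```

The brute-force enumeration grows as $2^{l/2}$ and is intended only as a check for short chains.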

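Expanding the logarithm and using the Stirling approximation $\ln(n!) \simeq n\ln(n) - n$ for a semantic network sufficiently complicated and flexible that $l \gg 1$ and $l \gg m$, we obtain after reduction

$$\ln A(m, l) \simeq l\ln 2 - \left(\frac{l}{4} + \frac{m}{2}\right)\ln\left(1 + \frac{2m}{l}\right) - \left(\frac{l}{4} - \frac{m}{2}\right)\ln\left(1 - \frac{2m}{l}\right). \tag{14}$$

Since $m/l \ll 1$, the ln part of the last two terms can be expanded in powers of $2m/l$. By retaining terms up to second order in $m/l$, we find that

$$\ln A(m, l) \simeq l\ln 2 - \frac{m^2}{l}. \tag{15}$$

Hence

$$\log_2 A(m, l) \simeq l\left[1 - \frac{1}{\ln 2}\frac{m^2}{l^2}\right] \tag{16}$$

and the amount of information is

$$I = 2(n_0 - \sqrt{n_0})\log_2 A(m, l) \simeq 2(n_0 - \sqrt{n_0})\,l\left[1 - \frac{1}{\ln 2}\frac{m^2}{l^2}\right]. \tag{17}$$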
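The quality of the approximation in equation (16) can be examined numerically. The Python sketch below compares the exact $\log_2 A(m, l)$, computed from the binomial form of equation (11), with the closed form of equation (16). The function names and the sample values of $m$ and $l$ are hypothetical; as before, $l$ is taken to be a multiple of 4 and $m$ even.

```python
import math


def log2_A_exact(m: int, l: int) -> float:
    """log2 of A(m, l) from equation (11), evaluated with exact integers."""
    s = l // 2  # S_x = S_y = l/2
    return (math.log2(math.comb(s, (s + m) // 2))
            + math.log2(math.comb(s, s // 2)))


def log2_A_approx(m: int, l: int) -> float:
    """Equation (16): l [1 - (m/l)^2 / ln 2]."""
    return l * (1.0 - (m / l) ** 2 / math.log(2))


def relative_error(m: int, l: int) -> float:
    """Relative deviation of equation (16) from the exact value."""
    exact = log2_A_exact(m, l)
    return abs(log2_A_approx(m, l) - exact) / exact


if __name__ == "__main__":
    for m, l in [(20, 400), (200, 4000)]:
        print(m, l, log2_A_exact(m, l), log2_A_approx(m, l), relative_error(m, l))
```

The residual difference comes mainly from the terms of order $\ln n$ that the simple Stirling form $\ln(n!) \simeq n\ln(n) - n$ drops, so the relative error shrinks as $l$ grows at fixed $m/l$.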

But from equation (5)

$$2(n_0 - \sqrt{n_0}) = (N - n_0)/(l - 1). \tag{18}$$

Substituting into equation (12), we find that for $l \gg 1$ and $l \gg m$

$$I \simeq (N - n_0)\left[1 - \frac{1}{\ln 2}\left(\frac{m}{l}\right)^2\right]. \tag{19}$$
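Equation (19) is simple enough to tabulate directly. The Python sketch below evaluates the slope factor $1 - [(l/m)^2\ln 2]^{-1}$ and the resulting information content for a few values of $l/m$. The function names and sample values are hypothetical, and the small values of $l/m$ are included only to show the trend, since equation (19) was derived for $l \gg m$.

```python
import math


def slope(l_over_m: float) -> float:
    """Slope factor of equation (19): 1 - [(l/m)^2 ln 2]^(-1)."""
    return 1.0 - 1.0 / (l_over_m ** 2 * math.log(2))


def information_bits(N: int, n0: int, l_over_m: float) -> float:
    """Equation (19): I ~ (N - n0) * slope(l/m), valid for l >> 1, l >> m."""
    return (N - n0) * slope(l_over_m)


if __name__ == "__main__":
    for ratio in (2, 4, 10, 100):
        print(ratio, slope(ratio), information_bits(1009, 9, ratio))
```

As $l/m$ increases the slope factor approaches 1, so that $I$ tends to $N - n_0$ bits, i.e. about one bit per learned concept.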

Thus the amount of information required to specify the semantic network increases linearly as the number of concepts to be learned and decreases similarly with the size of the innate fixed repertoire. The information measure $I$ also depends on the ratio $l/m$ of the spreading activation distance to the spacing between the fixed concepts, but in this instance the dependence is highly non-linear. As the activation distance increases relative to the spacing between innately fixed concepts, the amount of information corresponding to the particular semantic network configuration increases somewhat. As $l$ goes to infinity, $m/l$ becomes vanishingly small and the information $I$ asymptotically approaches an $(l, m)$-independent, purely linear dependence on $N - n_0$. (See Figs 8 and 9.)

The behaviour of the information function specified by equation (19) is novel by comparison to the networks assessed previously. In the well-studied classes $S_2$ and $S_3$ defined above (Bremermann, 1963; Lumsden and Wilson, 1981, pp. 333-341), $I \propto N^{\xi}$ with $\xi > 1$; for $S_3$, $\xi \gg 1$. These $I$-functions are rapidly increasing functions of $N$. In class $S_1$, $I$ increases only slowly, with $\xi = 1$. It is also the case that increasing the path length $l$ in the square concept lattice opens up many alternative routes between two fixed concept points [by equation (17) the number scales exponentially as $2^l$]. One might therefore anticipate the information storable in the semantic network to vary at least as $l^{\alpha}$ with $\alpha > 0$. Instead we find that $I$ varies as $1 - l^{\alpha}$ with $\alpha < 0$. For large values of $l$ the path length is therefore predicted to exert virtually no effect on $I$ at all. The information function is dependent only on the number of concepts to be learned.

Figure 8. The information slope function $1 - [(l/m)^2 \ln 2]^{-1}$ as a function of $m$, the fixed node spacing, and $l$, the semantic distance connecting fixed nodes through chains of learned concepts. Units are such that $m = l = 1$ corresponds to one step along the lattice.

Figure 9. The information $I$ required to specify the semantic network as a function of $N - n_0$, its content of learned concepts. The class of ‘flexible networks’ characterized by $l \gg 1$, $l \gg m$ (see text) occupy a relatively small proportion of $(N - n_0, I)$-space.

These properties of $I$ result from the epigenetic rule R2, which specifies a semantic distance of $l$ quillians between the fixed concept nodes connected by chains of spreading activation. The effect of this rule is to make $n_L$ dependent upon $n_0$ and $l$. The number of alternative activation pathways scales as $2^l$ in the square concept lattice, but the number of chains constituting independent pathways scales with an exponent of $a/l$ [equation (18)], so that the total diversity of allowed semantic configurations scales as $2^{(a/l)\,l} = 2^{a}$. If $n_L$ and $l$ were independent, $I$ would in fact scale with $l^{\alpha}$, $\alpha \simeq 1$ [equation (17)]. What is of interest is that the developmental constraint represented by this epigenetic rule is strong enough to reduce the information required to specify the network by a factor of $l$. One could describe the action of rule R2 as supplying $(l - 1)n_L$ bits of information to the developmental program.

The following considerations suggest that the relation involving $I$, $(N - n_0)$ and $l$ described by equation (19) may have a generality beyond the two-dimensional square concept lattice. Let $X$ be a knowledge space modeled by a $d$-dimensional lattice (square or otherwise) of nodes each possessing coordination number $C$. Suppose that $P_1$ and $P_2$ are two lattice points representing innately fixed nodes in the basic adaptive repertoire and that a chain of $n_L$ learned concepts joins them through a chain of length $l$ quillians. If we take the link from $P_1$ to its adjacent learned concept and the link from the $n_L$th learned concept to $P_2$ as necessarily determined in direction (since the chain must be fixed at both ends), then at each of $n_L - 1$ nodes one of $C$ possible directions along the lattice must be specified in order to describe the chain.
The total number of possible descriptions of the chain pathway as a whole is $C^{(n_L - 1)}$. If $M$ is the total number of chains, then the number of possible semantic networks is $C^{M(n_L - 1)}$. But if the chains contain $n_L$ nodes each for a spreading activation distance of $l = n_L + 1$ quillians, then $M$ is equal to $(N - n_0)/n_L$ and the total number of semantic networks varies as

$$C^{(N - n_0)(n_L - 1)/n_L}.$$

The information content of this system is

$$I \approx (N - n_0)\log_2 C, \qquad d \ge 2,\ l \gg 1 \tag{20}$$

and we recover an effective $l$-independence of $I$, combined with a linear dependence on $N - n_0$. $I$ also increases in a logarithmically unbounded manner with the network connectivity $C$.
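The counting argument leading to equation (20) can be checked numerically. The following is a minimal sketch, not the authors' calculation: it assumes the chain count $M = (N - n_0)/n_L$ and the relation $l = n_L + 1$ stated above, and the function names and the sample values of $N$, $n_0$ and $C$ are hypothetical choices made only for illustration.

```python
import math


def exact_information(N, n0, l, C):
    """Bits needed to specify all chains: log2 of C**(M * (n_L - 1))."""
    n_L = l - 1                  # learned concepts per chain, from l = n_L + 1
    M = (N - n0) / n_L           # number of chains, M = (N - n0) / n_L
    return M * (n_L - 1) * math.log2(C)


def asymptotic_information(N, n0, C):
    """Large-l limit of the same count: I ~ (N - n0) * log2(C)."""
    return (N - n0) * math.log2(C)


if __name__ == "__main__":
    # Illustrative values only (assumed, not taken from the paper).
    N, n0, C = 1100, 100, 4
    limit = asymptotic_information(N, n0, C)
    for l in (3, 11, 101, 1001):
        ratio = exact_information(N, n0, l, C) / limit
        print(f"l = {l:5d}   exact / asymptotic = {ratio:.4f}")
```

Under these assumptions the ratio of the exact count to the asymptotic form is $(n_L - 1)/n_L$, which approaches 1 as the chains lengthen; this is the sense in which $I$ becomes effectively independent of $l$.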


4. Discussion.

Our approach to a semantic network model has provided a quantitative description of how much an animal governed by epigenetic rules of the form R1-R5 can learn about its environment. Few such models have previously been available, but they are needed in evolutionary studies of psychological processes (Lumsden and Wilson, 1981, 1983). The semantic networks considered in previous studies of information content were less realistic than the model dealt with here, and were organized by more general epigenetic rules. The present study therefore appears to comprise a useful step in the study of semantic network development. Our model shows how to begin incorporating the effects of both genetic and environmental information, in a way that quantifies the roles of specific epigenetic rules which canalize mental development. In the present model this canalization leads through repertoires of fixed nodes interlinked through learned information. In the earlier models known to us (see above) the information content $I$ can be approximated by power laws in which

$$I \propto N^{\xi},$$

where the exponent $\xi$ characterizes the universality class $S$. Our calculations for the $d = 2$ square concept lattice and its extensions to $d > 2$ lattices provide evidence that epigenetic rules of the form R1-R5 create a new class in which $\xi = 1$ (Table I). In further studies it will be of interest to pursue the natural

Universality conjecture: The exponent $\xi$ for a semantic network depends only on the epigenetic rules $R_j$ governing its formation

and to test for the existence of other exponents pertinent to $l$, $m$, $C$, etc. This will allow a more complete characterization of semantic networks and their relations to learning (and genetics) than is available at present.
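One way to test for a scaling exponent of this kind is to fit the slope of $\log I$ against $\log N$. The sketch below is an illustration under stated assumptions rather than a procedure from the paper: the function name, the network sizes and the values of $n_0$ and $C$ are hypothetical, the "linear" data are generated from equation (20), and the quadratic law is included only as a contrasting synthetic case.

```python
import math


def fit_scaling_exponent(sizes, informations):
    """Least-squares slope of log I against log N, an estimate of the exponent."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(i) for i in informations]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical network sizes and parameters, chosen for illustration.
    sizes = [1000, 2000, 4000, 8000, 16000]
    n0, C = 100, 4
    linear = [(N - n0) * math.log2(C) for N in sizes]   # form of equation (20)
    quadratic = [float(N) ** 2 for N in sizes]          # contrasting synthetic law
    print("fitted exponent, equation (20) data:", fit_scaling_exponent(sizes, linear))
    print("fitted exponent, quadratic data:   ", fit_scaling_exponent(sizes, quadratic))
```

Because equation (20) is linear in $N - n_0$ rather than in $N$, the fitted slope sits slightly above 1 at finite $N$ and approaches 1 only as $N \gg n_0$, so any empirical estimate of the exponent should be taken from the large-$N$ range.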

TABLE I

Rule set    Epigenetic rules       Scaling exponents* ($I \propto N^{\xi} l^{\nu} m^{\mu}$)
                                   $\xi$     $\nu$     $\mu$
S1          R1, R2, R3, R4, R5     1         0         0
S2          EP1, EP2               2         -         -
S3          P1, P2                 K†        -         -

* $N \gg 1$. † See equation (3).


Any characterization so obtained will describe the population statistics of finished networks. In order to model the actual network it will be necessary to develop a satisfactory growth kinetics alongside the treatments of network organization. Since this treatment appears to be an important unsolved problem, we would like to conclude our discussion with a preliminary specification of the kind of growth-kinetic modeling required.

The strictly regular structure of an elementary physical polymer, as realized in the repetition of the same monomer unit, is an idealization violated most of the time in real chemical systems. Polymers formed from two or more kinds of reactant monomers are called copolymers, and the processes in which they are formed are generally referred to as multicomponent copolymerizations. Thus chemical polymers are aggregates of a large number of heterogeneous units, which differ from each other both in composition and in structure (as in the proteins and polynucleotides). In a ‘polymer of ideas’ approach, a semantic network may be considered as a multi-dimensional copolymer of conceptual nodes linked and cross-linked by associational relations.

Polymer-kinetic mechanisms affecting growth in chemical copolymers are classified as step or chain polymerizations. A basic difference between these processes lies in the length of time required for the complete growth of polymer. Step polymerizations proceed by the sequential reaction between the functional groups of reactants. Since any two of the component molecular species can react with each other throughout the course of a step polymerization, the size of the polymer increases at a relatively slow rate from monomer to dimer, trimer, tetramer, pentamer, and so on until near the end of the reaction when full-sized polymer molecules containing large numbers of monomers begin to appear. In contrast, full-sized polymer molecules are produced rapidly after the start of a chain polymerization reaction.
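The contrast between the two mechanisms can be made concrete with a toy simulation. This is a deliberately simplified sketch under assumed rules, not a chemical kinetics model and not part of the authors' analysis: in the step process any two clusters may merge, in the chain process monomers attach only to a fixed set of reactive centres, and all names and parameter values are hypothetical.

```python
import random


def step_growth(n_monomers, n_reactions, rng):
    """Step polymerization toy: any two clusters may react and merge."""
    clusters = [1] * n_monomers          # every monomer starts as its own cluster
    for _ in range(n_reactions):
        i, j = rng.sample(range(len(clusters)), 2)
        clusters[i] += clusters[j]       # merge cluster j into cluster i
        clusters[j] = clusters[-1]       # fill the vacated slot with the last cluster
        clusters.pop()
    return clusters


def chain_growth(n_initiators, n_reactions, rng):
    """Chain polymerization toy: monomers add only to propagating centres."""
    chains = [0] * n_initiators          # monomers attached to each reactive centre
    for _ in range(n_reactions):
        chains[rng.randrange(n_initiators)] += 1
    return chains


if __name__ == "__main__":
    rng = random.Random(0)               # fixed seed so runs are repeatable
    n_monomers, n_reactions, n_initiators = 10_000, 5_000, 10
    step = step_growth(n_monomers, n_reactions, rng)
    chain = chain_growth(n_initiators, n_reactions, rng)
    print("step growth:  mean size", sum(step) / len(step), "largest", max(step))
    print("chain growth: mean size", sum(chain) / len(chain), "smallest", min(chain))
```

With these illustrative numbers, 5,000 reactions leave the step process with 5,000 clusters of mean size 2, while the same number of reactions spread over 10 reactive centres gives chains of mean length 500. Large structures therefore appear early only in the chain-type process.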
Chain polymerizations require an initiator molecule from which is produced an initiator species R* with a reactive center (a free radical, cation, or anion which propagates through the polymer). Through the rapid propagation of reactive centers, large numbers of monomers are added successively. In this mechanism a monomer can react only with the propagating center, not with another monomer. Clearly, the growth of a semantic network cannot in general be considered solely as a step polymerization process, since at each point in its development the semantic network is considered to be one interconnected assemblage of elements. Thus polymerizations of the chain type need to be considered for semantic networks that develop as a fully interconnected system. Standard polymer dissociation kinetics are not applicable to a discussion of the growth of these structures for the following reason: while dissociation in chemical polymers is a strong function of the concentration of monomers in the


surrounding medium (Odian, 1981), dissociation of concepts (or forgetting) appears to depend more strongly on the internal processes of the individual, especially the network itself. People are selective about what they forget.

At this point it is helpful to introduce a distinction between terminating and living polymerizations (Odian, 1981). Various reaction mechanisms may terminate the growth of a propagating chain, but although chain-breaking occurs, generally termination of the kinetic chain does not occur because a new propagating species is formed in the process. Such polymerizations are called terminating polymerizations. There exist, however, polymerizations that take place under conditions in which there are no effective termination reactions. Non-terminated polymerizations are referred to as living polymerizations and the product as a living polymer. Growth of a living polymer stops only when the supply of monomer is depleted to a level where the rate of monomer incorporation just balances the rate of dissociation.

On the basis of these considerations we suggest that a growth kinetics for semantic networks can be considered as a non-terminating, sequential addition of concept nodes to a living network copolymer with a large and continually increasing number of reactive centres propagating throughout the network. The growth-kinetic theory of such systems is under development. The problem of $d = 1$-dimensional growth of living polymers has been examined recently (Firestone et al., 1983; Rangarajan and de Levie, 1983). Analytical descriptions of kinetics of growth in $d = 2$ or 3 dimensions, which necessarily involve a consideration of particle stoichiometry, are largely open problems. Possible semantic network applications, as we have seen, make their further development (e.g., Meakin and Stanley, 1983; Majid, Jan, Coniglio and Stanley, 1984) of increased interest.

The authors thank Anne Hansen-Johnston for her careful preparation of the manuscript. This work was supported in part by the Natural Sciences and Engineering Research Council of Canada and by the Medical Research Council of Canada. Numerical calculations were performed with the assistance of the VAX 11/780 computing system, Division of Medical Computing, University of Toronto.

LITERATURE

Anderson, J. R. and G. H. Bower. 1973. Human Associative Memory. New York: John Wiley.

Berwick, R. C. 1983. “Locality Principles and the Acquisition of Syntactic Knowledge.” Ph.D. thesis, Department of Electrical Engineering and Computer Science, MIT.

Brainerd, C. J. (Ed.) 1983. Recent Advances in Cognitive-Developmental Theory. New York: Springer-Verlag.

Bremermann, H. J. 1963. “Limits of genetic control.” IEEE Trans. Milit. Electron. MIL-7, 200-205.

Brillouin, L. 1962. Science and Information Theory. New York: Academic Press.

Chomsky, N. 1980. Rules and Representations. New York: Columbia University Press.

Collins, A. M. and E. F. Loftus. 1975. “A Spreading-activation Theory of Semantic Processing.” Psychol. Rev. 82, 407-428.

Conrad, C. 1972. “Cognitive Economy in Semantic Memory.” J. exp. Psychol. 92, 149-154.

Estes, W. K. 1982. Models of Learning, Memory, and Choice: Selected Papers. New York: Praeger.

Firestone, M. P., R. de Levie and S. K. Rangarajan. 1983. “On One-dimensional Nucleation and Growth of ‘Living’ Polymers. I. Homogeneous Nucleation.” J. theor. Biol. 104, 535-552.

Keil, F. C. 1981. “Constraints on Knowledge and Cognitive Development.” Psychol. Rev. 88, 197-227.

Lindsay, P. H. and D. A. Norman. 1977. Human Information Processing: An Introduction to Psychology, 2nd edn. New York: Academic Press.

Lumsden, C. J. 1983. “Gene-culture linkages and the developing mind.” In Recent Advances in Cognitive-developmental Theory, C. J. Brainerd (Ed.), pp. 123-166. New York: Springer-Verlag.

Lumsden, C. J. and E. O. Wilson. 1981. Genes, Mind and Culture: The Coevolutionary Process. Cambridge, Massachusetts: Harvard University Press.

Lumsden, C. J. and E. O. Wilson. 1983. Promethean Fire: Reflections on the Origin of Mind. Cambridge, Massachusetts: Harvard University Press.

Majid, I., N. Jan, A. Coniglio and H. E. Stanley. 1984. “Kinetic Growth Walk: A New Model for Linear Polymers.” Phys. Rev. Lett. 52, 1257-1260.

Mayoh, B. H. 1974. “Multidimensional Lindenmeyer Systems.” In L Systems. Lecture Notes in Computer Science, G. Rozenberg and A. Saloma (Eds), pp. 302-326. New York: Springer-Verlag.

Meakin, P. and H. E. Stanley. 1983. “Spectral Dimension for the Diffusion-limited Aggregation Model of Colloid Growth.” Phys. Rev. Lett. 51, 1457-1460.

Medin, D. L. and E. E. Smith. 1984. “Concepts and Concept Formation.” A. Rev. Psychol. 35, 113-138.

Nagl, M. 1976. “Graph Rewriting Systems and their Application in Biology.” In Mathematical Models in Medicine. Lecture Notes in Biomathematics, J. Berger et al. (Eds), Vol. 11, pp. 135-156. New York: Springer-Verlag.

Odian, G. 1981. Principles of Polymerization. New York: John Wiley.

Quillian, M. R. 1969. “The Teachable Language Comprehender: A Simulation Program and Theory of Language.” Communs Ass. comput. Mach. 12, 459-476.

Rangarajan, S. K. and R. de Levie. 1983. “On One-dimensional Nucleation and Growth of ‘Living’ Polymers. II. Growth at Constant Monomer Concentration.” J. theor. Biol. 104, 553-570.

Savitch, W. J. 1976. “Computational Complexity of Developmental Programs.” In Automata, Languages, and Development, A. Lindenmeyer and G. Rosenberg (Eds), pp. 283-291. Amsterdam: North Holland.

Schaeffer, B. and R. Wallace. 1969. “Semantic Similarity and the Comparison of Word Meanings.” J. exp. Psychol. 82, 343-346.

Schaeffer, B. and R. Wallace. 1970. “The Comparison of Word Meanings.” J. exp. Psychol. 86, 144-152.

Shepher, J. 1971. “Mate Selection among Second-generation Kibbutz Adolescents and Adults: Incest Avoidance and Negative Imprinting.” Archs sexual Behav. 1, 293-307.

Sowa, J. F. 1984. Conceptual Structures. Reading, Massachusetts: Addison-Wesley.

Tulving, E. 1972. “Episodic and Semantic Memory.” In Organization of Memory, E. Tulving and W. Donaldson (Eds). New York: Academic Press.

Volkenstein, M. V. 1963. Configuration Statistics of Polymeric Chains. New York: John Wiley.

Wexler, K. and P. K. Culicover. 1980. Formal Principles of Language Acquisition. Cambridge, Massachusetts: MIT Press.

Wickelgren, W. A. 1979. Cognitive Psychology. Englewood Cliffs, New Jersey: Prentice-Hall.

Wilkins, A. J. 1971. “Conjoint Frequency, Category Size, and Categorization Time.” J. verb. Learn. verb. Behav. 10, 382-385.

Wilson, K. V. N. 1980. From Associations to Structure: The Course of Cognition. New York: North Holland.

RECEIVED 11-15-84
REVISED 6-2-85