Mental representation of length and weight series and transitive inferences in young children

JOURNAL OF EXPERIMENTAL CHILD PSYCHOLOGY 31, 177-192 (1981)

JOSEF PERNER, GERHARD STEINER, AND CHRISTINE STAEHELIN

Forty-eight second graders (8 years of age) were trained on length or weight relationships between adjacent members of a five-term series of colored objects. Feedback was visual and of either minimal or strong salience. Differences in weight were assessed by either a balance scale or a spring scale. Results showed that more salient visual feedback reduced the learning effort for length but not for weight comparisons. After training children were tested on all possible object pairs. Children’s comparisons of items by length were very accurate in contrast to their comparisons by weight. An explanation for these findings is suggested by the data from a group of 6-year-olds who were trained on two independent pairs of a four-term series. Test results showed that subjects spontaneously encoded absolute lengths but tended to ignore information about the absolute weight of objects. It is suggested that high test accuracy depends on stimulus material for which the absolute values of the relevant dimension are encoded. The implications for taking test performance as an indicator of “transitive reasoning” ability are discussed.

The authors wish to thank A. Cosette Wilson for her help in editing the manuscript. We are also most grateful to the directors, staff, and pupils of kindergarten and elementary school in Kanton Basel-Stadt for their willing cooperation in the studies. Requests for reprints should be directed to Josef Perner, who is now at the Laboratory of Experimental Psychology, University of Sussex, Brighton, Sussex BN1 9QG, England. 0022-0965/81/020177-16$02.00/0 Copyright © 1981 by Academic Press, Inc. All rights of reproduction in any form reserved.

Trabasso, Riley, and Wilson (1975) showed how children and adults solve six-term series problems with sticks of different lengths (A, B, C, D, E, and F by decreasing length). Subjects were extensively trained on the length relationships between adjacent pairs (premises) of this series. When questioned afterward on all possible pairings of the six sticks, subjects had equal proportions of correct answers on the trained pairs and on the new (inference) pairs. Furthermore, they responded faster when the two sticks of the test pair were farther apart than when the sticks were closer together in the underlying series. Thus reaction time was longest for the originally trained premise pairs. This line of research was directed against claims by Piaget (1924) and Piaget, Inhelder, and Szeminska (1948) that young children cannot draw
inferences about transitive relations like length. Bryant and Trabasso (1971) showed that even 4-year-old children can draw transitive inferences if one assures that they remember the learned premises at the time of testing. Furthermore, the data by Trabasso et al. rejected the assumption that such syllogisms are solved by combining premises at the time of testing into an inference, e.g., “Since I know that A is longer than B, and B is longer than C, therefore, A must be longer than C.” Using such reasoning subjects would react faster to premise pairs AB and BC than to the inference pair AC, and the probability of answering the question about A and C correctly would not be higher than the product of the probabilities of answering questions about pairs AB and BC correctly. Typically this is not the case (e.g., Trabasso et al., 1975; de Boysson-Bardies & O’Regan, 1973). Thus it seems that answers to test questions are “read off” from a representation which is not based on memory for training pairs, but which allows direct access to all possible pairings of items. In all existing models this is thought to be achieved by giving no particular pairs of items preferential status. All items enjoy equal status and carry some value. Items can be compared by comparing their values. De Boysson-Bardies and O’Regan (1973) demonstrated that children’s spontaneous strategy is to classify the sticks into three categories, i.e., “long,” “inconsistent,” and “short.” Obviously such a strategy is not even good enough for memorizing the four premises of a five-term series. Thus when children are trained extensively on the premise pairs they must adopt a more adequate strategy. Trabasso et al. (1975) and Trabasso (1975) argued that subjects who reach training criterion mentally arrange items in a spatial linear array, i.e., each item is assigned a place within this array according to its length.
Any two items can then be immediately compared without having to consider the intervening items. Unfortunately, this hypothesis does not make any suggestion as to why acquisition rates vary according to the feedback employed. For instance, 4- and 5-year-olds find it much easier to retain the premises when they are shown the difference between sticks (visual feedback) than when they are only told (verbal feedback) which stick is longer and which is shorter (Trabasso, 1975). Verbal feedback tasks can be rendered even more difficult when only one comparative, i.e., either “longer” or “shorter,” is used throughout training instead of both (Riley & Trabasso, 1974). An even more important shortcoming of the spatial array hypothesis stems from the fact that the influence of different feedback is not limited to the acquisition phase. Since it is assumed that after training criterion has been reached the same spatial array representation is formed, it follows that test performance should be the same regardless of the kind of feedback given during the preceding training phase. However, closer inspection of test data after visual and verbal feedback training revealed consistent differences. The 4- and 5-year-olds in Trabasso’s (1975) study
gave correct responses about 95% of the time after visual feedback training, whereas the 4-year-olds who were trained with verbal feedback responded to only 69% of the test questions correctly and the 5-year-olds to only 86%. Six- and nine-year-olds were also correct 95% of the time after visual feedback training on a six-term series, but after verbal feedback training 6-year-olds gave correct responses only 80% and 9-year-olds only 87% of the time. For college students there was no difference between the two feedback conditions (Trabasso et al., 1975). There was, however, one exception to this pattern. Four- and five-year-olds in the study by Bryant and Trabasso (1971) gave very accurate test responses after visual and after verbal feedback training. Data from Riley and Trabasso (1974) show that test performance after verbal feedback can be further lowered when only a single comparative is used instead of both comparatives. Under these conditions not only did most of the 4½-year-old subjects fail to reach training criterion, but the highly selected sample of subjects who entered the test phase gave correct responses only 66% of the time, compared to 80% when regular verbal feedback consisting of both comparatives was given. Adams (1978) investigated the influence of salience of visual feedback in 5-year-olds. This factor had a strong influence on the speed of acquisition of the memory structure only and not on test performance. During the test phase both groups of subjects responded very accurately, with a proportion of about 95% correct responses. In summary, the reviewed data show that children at all ages retained a very high performance level during testing after visual feedback training regardless of salience, whereas young children who were trained with verbal feedback gave considerably fewer correct responses during testing, even though they had been trained to the same criterion as children under visual feedback.
This difference in test performance after visual vs verbal feedback training diminished with age. It was not apparent in college students. Furthermore, test performance after verbal feedback was further lowered when only a single comparative was used instead of both. In view of these data it becomes difficult to maintain the position held by Trabasso (1975, 1977) that regardless of feedback condition and age the eventual memory structure is a linear spatial array of objects. Furthermore, Trabasso’s proposal has the additional disadvantage that it raises the unanswered question of where 4-year-olds obtain the knowledge that a spatial array is an appropriate means for representing differences in length. In order to overcome these difficulties we offer the following alternative hypothesis: On the one hand, when visual feedback is given, seeing sticks of different length induces children to represent the stimulus material as five sticks of different length. This stands in contrast to Trabasso’s hypothesis where it is assumed that children translate information about
differences in length into left-right relationships of a spatial array, which seems to be a superfluous symbolic step. The time it takes to switch from the initial labeling strategy (de Boysson-Bardies & O’Regan, 1973) to this representation depends on the salience of length differences (Adams, 1978). Yet, once such an encoding has been established children have a simple but sound representation of reality which assures accurate test performance. On the other hand, when verbal feedback is given children do not see sticks of different length but only hear sticks being referred to as “longer” and “shorter.” Thus these children may adhere to their initial labeling strategy and not even reach the training criterion (Riley & Trabasso, 1974). Those children who do reach criterion may have tried to amend their strategy with special rules so that they were just able to cope with the four training pairs. Or, some children may build up different strengths of association for calling a stick “long” or “short” (cf. Trabasso, 1977). Similar models have been used to explain “transitive inference” responses by monkeys (McGonigle & Chalmers, 1977) and by 4-year-old children (Furth, 1977) in pseudotransitivity tasks. A variety of similar strategies is possible and is probably used by different individuals. All these strategies have in common that they are not very insightful and stable representations of the stimulus material. Hence it takes a long time to memorize the training pairs correctly and even after training criterion is reached such representations are not a good basis for answering the whole variety of test questions. The switch from the familiar four training pairs to all possible pairings may bring out the remaining uncertainty about the stimulus material anew and the overall test performance will be low. It has to be emphasized that a labeling strategy does not entail an encoding of premise pairs but provides verbal values of length for single items.
Thus a labeling strategy may enable children to pass Bryant and Trabasso’s (1971) criterion for “transitive reasoning” (cf. de Boysson-Bardies & O’Regan, 1973). Furthermore, if individual subjects use slightly different categorizations of items then group means can even produce the test response patterns emphasized by Trabasso et al. (1975). Therefore, what is of present concern is not the response pattern across test pairs but rather the overall test accuracy indicating stable or unstable representations. The above proposed hypothesis was generated to explain already reported differences in acquisition and test data from various feedback conditions. In order to put our hypothesis to a further test we attempted to find new stimulus material such that the visual display does not suggest to subjects an encoding in terms of objects that vary in their value along the relevant dimension. This was thought to be the case for objects of different weight when they are of the same size and when their difference in weight is assessed by a balance scale. No direct visual
perception of each object’s weight is possible, only for pairs of objects. If the visual experience is directly encoded then “A is heavier than B” will be represented as “A going down and B up” and “B is heavier than C” as “B going down and C up.” From this representation no answers to questions about A and C can be given. Thus in order to encode the differences in weight, strategies similar to those in the case of verbal feedback would have to be employed. The predicted difficulty with weight comparisons can be alleviated if the difference in weight is determined by a spring scale. The weight of each object now becomes directly visible and can be represented as “A extends spring for distance x, B for distance y, and C for distance z.” This representation allows answers to questions about all possible pairings of items. Thus if the proposed hypothesis is true, children seeing objects of different length or objects weighed on a spring scale can be expected to learn faster when feedback is of high salience than when it is of low salience. In the case of the balance scale learning should take long regardless of salience. Test performance of children who have to memorize length differences and of children who have to memorize weight differences assessed by spring scale is expected to be better than test performance of the balance scale group.

EXPERIMENT 1

Method

Subjects

Pilot work with feedback of low salience showed that a majority of 6-year-old kindergarteners did not reach training criterion. Thus older children, namely 59 second graders (31 boys and 28 girls), 7.1 to 9.1 years old (mean age = 8.1 years) from an elementary school in Basel participated in this study. A total of 48 children, four boys and four girls in each of the six experimental groups, completed the experiment. The remaining 11 subjects failed to reach criterion during concurrent pair training. One of them was in the Length group, four were in the Spring Scale, and six in the Balance Scale group.

Apparatus and Material

All stimuli were presented in a specially designed presentation box. This was an unpainted plywood box, 30 cm wide and 36 cm high on the front wall, which faced the subject. The size of the box increased slightly toward the open back, which faced the experimenter. A window (25 cm wide and 16 cm high) was centered in the upper half of the front wall. This window could be opened by dropping a guillotine door which switched on a chronometer (Heuer-Microsplit). The chronometer was stopped when
the subject pressed one of two buttons (2 by 1.5 cm), which were mounted 6 cm apart from each other on a small box 12 cm in front of the display box. The interior of the display box was designed to allow for the different measuring operations for the three measurement conditions. The stimulus material for the Low Salience Length group consisted of five wooden blocks, 4 cm wide and 2 cm thick. Their length varied from 11.6 cm decreasing by steps of 3 mm to 10.4 cm. The High Salience Length group was shown blocks of the same width and thickness, whose length, however, varied from 15 cm, in 2-cm steps, to 7 cm. The blocks for both groups had a hook on one end, by which they could be hung on two rings, 5 cm apart, near the upper edge of the display window. Thus the upper ends of both displayed blocks were always at the same level. Their difference in length was clearly apparent at their lower ends. The length difference could be concealed by the guillotine door opening so far that only the top 5 cm of the sticks became visible. Visual feedback was given by letting the door drop completely. The stimulus material for the Low Salience Spring Scale group consisted of five blocks of the same dimension as the blocks for the Length groups, but all of them uniformly 11 cm long. These blocks too had a hook on one end with which they could be attached to the end of the metal springs (cf. Experiment 1). The blocks were filled with different amounts of lead so that block A (heaviest) extended the spring 5.6 cm. Every next block in the series (B, C, D, and E) extended the spring for 3 mm less, i.e., 5.3, 5.0, 4.7, and 4.4 cm. The blocks used for the High Salience Spring Scale group were filled with so much lead that block A extended the springs for 9 cm, every other block for 2 cm less, i.e., 7, 5, 3, and 1 cm. The same blocks were used for the two Balance Scale groups. Instead of the two metal springs a beam balance was inserted below the ceiling of the presentation box.
The beam had two metal rings, about 6 cm apart, one at each of its ends. For the Low Salience group the beam was allowed to move only so much that the heavier object hung 3 mm lower than the lighter object. For the High Salience group the beam’s movement allowed a difference in height of 2 cm between the weighed objects. Blocks hanging on the spring scale or on the balance beam could be aligned at the same level by a retractable wooden support. When supported, the objects’ weight difference was not apparent; when the support was withdrawn, the blocks slid downward and their weight exerted its influence on the scales. For all groups the wooden blocks were color coded. They were clad in cardboard sheaves, whose side facing the subjects was colored. Two different random assignments of color to blocks A, B, C, D, and E were used: “Red, blue, green, yellow, black” and “Blue, yellow, black, red, green.” Each assignment was used for half the subjects in each experimental group.
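The stimulus dimensions described above can be tabulated in a few lines. The following is a minimal sketch in Python and is not part of the original study: the names `series_mm`, `STIMULI_MM`, `BALANCE_DIFFERENCE_MM`, `COLOR_ASSIGNMENTS`, and `adjacent_differences` are hypothetical, and all values are taken from the apparatus description, expressed in millimetres to avoid rounding issues.

```python
# Illustrative sketch only: helper and constant names are assumptions;
# the numeric values come from the apparatus description.

def series_mm(start_mm, step_mm, n_items=5):
    """Return n_items values (in mm) decreasing from start_mm by step_mm."""
    return [start_mm - i * step_mm for i in range(n_items)]

# Block length (Length groups) or spring extension (Spring Scale groups)
# for blocks A (largest value) to E (smallest value), in millimetres.
STIMULI_MM = {
    ("length", "low"): series_mm(116, 3),    # 11.6 cm down to 10.4 cm
    ("length", "high"): series_mm(150, 20),  # 15 cm down to 7 cm
    ("spring", "low"): series_mm(56, 3),     # 5.6 cm down to 4.4 cm
    ("spring", "high"): series_mm(90, 20),   # 9 cm down to 1 cm
}

# Height difference (mm) that the balance beam shows for a weighed pair.
BALANCE_DIFFERENCE_MM = {"low": 3, "high": 20}

# The two color assignments for blocks A to E, each used for half the subjects.
COLOR_ASSIGNMENTS = [
    dict(zip("ABCDE", ["red", "blue", "green", "yellow", "black"])),
    dict(zip("ABCDE", ["blue", "yellow", "black", "red", "green"])),
]

def adjacent_differences(values):
    """Differences between neighbouring items of a series."""
    return [a - b for a, b in zip(values, values[1:])]
```

For every Length and Spring Scale series, `adjacent_differences` returns four equal steps of 3 mm (low salience) or 20 mm (high salience). The balance beam shows the same fixed difference for a weighed pair, so it conveys only which block is heavier and nothing about a block's own value.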

Procedure

Preliminary training. At first children were familiarized with the general procedure using two practice objects, which were dimensioned as blocks C and D of the experiment proper. The two blocks were orange and violet. The practice was carried out by the following general training procedure: Behind the closed window a stimulus pair was set up so that the difference in weight (length) was not visible. Then a question was posed involving either the unmarked comparative, i.e., “which block is heavier (taller)?” or the marked comparative, i.e., “which block is lighter (shorter)?” At the end of the question the door was dropped and the child had to press the correct button as fast as possible. Then visual feedback was given, i.e., for the Length group the screen for hiding the difference in block size was dropped completely; for the Spring Scale group and the Balance Scale group the standing support for the two stimulus objects was withdrawn. The feedback had to be evaluated by the children themselves, who were asked to indicate whether their response was right or wrong. Incorrect evaluations were reprimanded by asking the child to pay more attention. Finally, the child had to push the door shut and the next trial ensued. The practice continued until the child had given two correct responses, at least one to questions with the unmarked and one to questions with the marked comparative. When the subject mastered the procedure the two practice blocks were packed away and five new blocks were produced. According to a random sequence they were lined up in the window of the display box so that their color but not their difference in weight or length could be seen. The child had to name the color of each block. The child was told that all these blocks were of different weight or length and that the remaining time of the experiment would be spent on learning which is lighter (shorter) and which is heavier (taller).

Separate pair training.
Children had to learn the weight or length differences between pairs of adjacent blocks, i.e., pairs AB, BC, CD, and DE, using the general training procedure described above. Each pair was trained separately until the child had five consecutive responses correct for each adjacent pair. Half the subjects in each group started with pair AB and proceeded toward pair DE. The other half followed the reverse order. On the first trial for each pair children had to guess the answer. For each pair children were equally often questioned with the unmarked as with the marked comparative, and each comparative was equally often used when the lighter (shorter) item was presented on the right side as when it was presented on the left side. Each combination of comparative and left-right presentation appeared once in each block of four trials. The order of combinations was randomized within each block, once for all subjects. When subjects had reached criterion on their last pair, training continued with all four pairs concurrently.

Concurrent pair training. For single trials the same standard procedure
was used as for separate pair training. However, the four test pairs were now presented on subsequent trials. Thus each block of four trials contained all four training pairs. Across 16 trials each training pair of blocks was trained under all four combinations of comparative and left-right presentation. The temporal order of combinations was randomized for each pair independently. Overall two different randomizations were used, each for half the subjects. Subjects were trained to a criterion of four consecutive correct trial blocks. Subjects who did not reach this criterion within 25 blocks were retrained the following day starting with separate pair training. Subjects who again failed to reach criterion in concurrent training were eliminated. Sessions over 25 blocks lasted for 30 to 45 min. After reaching criterion, subjects were transferred to the final test phase.

Testing. In this phase the same procedure was followed as in training except that no feedback was given. Each child was tested four times on each of the 10 possible pairings of the five blocks. These 10 pairs included the four pairs of adjacent blocks used in training and six new pairs (AC, AD, AE, CE, BE, and BD). The four test questions on each pair included two questions with the unmarked comparative and two with the marked comparative. For each form of question the blocks were once in the left-right and once in the right-left position. The 40 resulting test questions were randomized with the constraint that each pair occurred once in each of four blocks of 10 trials. Two randomized orders were used, each for half the subjects. At the very end each subject was given six marbles as a reward for participating in the study.

Results and Discussion

Separate Pair Training

A 3 (measurement condition) × 2 (salience) analysis of variance was carried out on the average last error trial for the four training pairs. No significant effects were observed (all F values under 1). Thus subjects in all groups had no difficulty in discerning visual differences and could memorize them for independent pairs after 1.34 trials on the average.

Concurrent Pair Training

Table 1 shows the means of last error trial blocks. The interesting result is that the High Salience Length group needed only about half as many trial blocks to their last error (13.5) as subjects in the other five groups (26.1 blocks): F(1, 42) = 6.01, p < .05. The comparative ease of acquisition for the High Salience group was to be expected from the study by Adams (1978). Still, her group of 5-year-olds who were trained on 1-cm differences made fewer errors (9.85) than our 8+-year-olds on 2-cm differences (14.9 errors).

TABLE 1
EXPERIMENT 1: MEAN LAST ERROR TRIAL BLOCK

                Measurement condition
Salience     Length     Spring     Balance
High          13.5       33.5       23.6
Low           25.0       23.5       24.3

A very interesting result is that there is no indication that salient differences were of any help to the High Salience Spring Scale group. In fact this group had to be trained for 10 blocks more than its Low Salience counterpart. A 5% single-tailed confidence interval shows that the actual population mean for High Salience Spring Scale groups could be at best 1.12 trial blocks better than for Low Salience groups. A within-subject analysis of variance of last error trials for the four training pairs showed the familiar end-anchor effect (Trabasso, 1975, 1977): F(3, 126) = 8.71, p < .001. The mean last error trials for pairs AB to DE were 12.3, 19.7, 19.7, and 13.8.
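The end-anchor effect in these four means can be made explicit by contrasting the two outer pairs with the two inner pairs. The following is a minimal sketch in Python that uses only the reported means. The variable and function names are hypothetical, and the calculation is purely descriptive, not a reanalysis of the data.

```python
# Illustrative sketch only: the values are the mean last error trials
# reported for the four training pairs; all names are assumptions.
mean_last_error_trial = {"AB": 12.3, "BC": 19.7, "CD": 19.7, "DE": 13.8}

END_ANCHOR_PAIRS = ("AB", "DE")  # pairs containing an end item (A or E)
INNER_PAIRS = ("BC", "CD")       # pairs made of inner items only

def mean_over(pairs):
    """Average the reported mean last error trial over the given pairs."""
    return sum(mean_last_error_trial[p] for p in pairs) / len(pairs)

outer_mean = round(mean_over(END_ANCHOR_PAIRS), 2)
inner_mean = round(mean_over(INNER_PAIRS), 2)
print(outer_mean, inner_mean)
```

On these figures the pairs that include an end item were mastered after fewer trials on average than the inner pairs, which is the direction of the end-anchor effect described above.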

Two 3 (measurement condition) × 2 (salience) × 10 (test pair) analyses of variance were carried out, one on (arcsine transformed) proportion of correct responses and one on average reaction time for each test pair. The first two factors are between subjects and the third factor is within subjects. The analysis for correct responses revealed two significant effects, measurement condition and test pair. The means for the measurement conditions (F(2, 42) = 6.56, p < .005) are shown in Table 2. Two orthogonal contrasts show that responses of the Length group were more accurate than responses of the Spring Scale and Balance Scale groups combined: F(1, 42) = 11.18, p < .005. The difference between Spring Scale group and Balance Scale group is not statistically reliable: F(1, 42) = 2.01, p > .10. This pattern shows that memory structures for length are more stable during testing than those for weight.

The other significant effect was test pair: F(9, 378) = 2.38, p < .05. A rank ordering of the 10 means put the four trained pairs above the six inference pairs. A contrast analysis proved this separation significant: F(1, 378) = 11.01, p < .005. The relevant means are shown in Table 2. The fact that responses to training pairs were more accurate than responses to inference pairs speaks against the view that an overall item-based representation is formed during training. However, the means in Table 2 show that there is virtually no difference between training and inference pairs in the Length group. The significance of factor test pairs rests mainly on the results from the Weight groups. Thus the means from the Length group do not discount the possibility that these children have formed an overall item-based representation, which is compatible with previous findings on length tasks (summarized in Trabasso, 1977, Table 11-1).

TABLE 2
MEAN PERCENTAGE CORRECT RESPONSES IN TEST PHASE OF EXPERIMENT 2

Pairs               Length    Spring    Balance    Total
Training pairs       96.5      90.2      84.8      90.5
Inference pairs      94.8      82.5      76.3      84.5
All pairs            95.5      85.6      79.7      86.9

The analysis of reaction times showed the interaction between measurement condition and test pairs significant: F(18, 378) = 1.99, p < .02. Closer inspection of the relevant means in Table 3 revealed that subjects in the Length and Balance Scale groups tended to react faster to pairs containing at least one end-anchor than to inner pairs, whereas subjects in the Spring Scale group did not show such a trend. A contrast corresponding to this observation was computed. It accounts for 51% of the variation that is due to the significant interaction and it is statistically reliable: F(1, 378) = 18.28, p < .001. In the reaction time analysis no other effect is significant, except the factor test pairs, which should not be interpreted because of its significant interaction with measurement condition. The pattern of reaction times in the Length and Balance Scale groups replicates the usually very strong end-anchor effect found in previous studies (Trabasso et al., 1975; Riley, 1976). The formation of end-anchors or at least their influence on reaction times seems to be inhibited in the Spring Scale group.

TABLE 3
MEAN REACTION TIMES(a) IN TEST PHASE OF EXPERIMENT 2

                             End-anchor pairs                      Inner pairs
Measurement
condition     AB   AC   AD   AE   BE   CE   DE   Total     BC   BD   CD   Total
Length        28   26   17   23   25   ?8   25   26.0      38   35   37   36.7
Spring        17   39   35   31   42   31   ??   33.6      32   28   36   32.0
Balance       19   34   39   31   29   30   32   31.0      39   36   36   37.0

(a) In tenths of seconds.

Children’s Explanations

After the test phase children were asked to explain how they knew the right answer to each test pair. Children’s explanation attempts were classified in three categories.

(a) Seriation: Twelve children mentioned a series without being able to specify how they represented it mentally. Nine children claimed that they visualized items as a staircase. It is noteworthy that six of these children were in the High Salience Length group.

(b) Remembering training pairs: Children in this category relied on their memory of training pairs. Six of them justified their responses to inference pairs by a classical transitive inference strategy, e.g., “A must be heavier than C, because A is heavier than B and B is heavier than C.” Five children tried to remember the premises by associating weight with darkness of color. This rule was usually amended by exceptions to fit the premises and it was used as an inadequate basis for judging inference pairs. Two children used a rule of alternation for training pairs, i.e., if item B was the heavier item on one training pair (e.g., BC) then it must be the lighter one the next time (e.g., AB). This rule, when supported by end-anchoring, may have enabled children to reach training criterion. For the test phase this rule was completely inadequate.

(c) Mixed strategies: Children in this category were either very unresponsive, claimed that they simply guessed the right answers, or gave incoherent reasons.

Table 4 shows the frequency of each type of explanation for the three measurement conditions: χ²(4) = 15.43, p < .01. The pattern of frequencies reveals that children who were in the Length group more often resorted to explanations based on seriation, while children in the Weight groups tended to base their explanation on memory for training pairs. The last row in Table 4 shows the mean number of errors for the three types of explanations. The means demonstrate that reliance on memory for training pairs did not provide a solid basis for responding to test pairs: F(2, 45) = 14.62, p < .001.

TABLE 4
EXPERIMENT 2: FREQUENCY OF VERBAL EXPLANATIONS AND MEAN NUMBER OF ERRORS PER EXPLANATION

                                     Explanation
Measurement condition     Seriation    Training pairs    Mixed
Length                       12              2             2
Spring                        4              6             6
Balance                       5             10             1
Number of errors             2.0            9.8           2.9

EXPERIMENT 2

The most surprising result from Experiment 1 was that children in the High Salience Spring Scale group did not perform better than children in the Balance Scale groups. This is surprising, because measurement by spring scale yields information about the absolute weight of blocks and it was reasoned that the availability of absolute information made the high performance of the Length group possible. One explanation for this finding may be that children, for some reason, fail to encode the degrees of spring extension which indicate the absolute weight of blocks. Instead, children may encode the relative difference between blocks only, as in a balance scale condition. Experiment 2 was designed to test this possibility, using a paradigm introduced by de Boysson-Bardies and O’Regan (1973).

These authors gave children verbal training on two pairs, AB and CD, formed from four sticks: A, B, C, and D from longest to shortest. They found that their subjects tended to label sticks A and C as “long” and B and D as “short.” Hence when tested on all six possible pairings children used these labels to decide that “A is longer than D” and “C is longer than B” even though no objective information for judging these pairs was available. To pairs AC (both “long”) and BD (both “short”) responses were at a chance level. If this paradigm is used with visual feedback a similar response pattern can be expected when blocks are weighed on a balance scale. However, when shown blocks of different length subjects may spontaneously encode the actual length of each block. If they do, then their test responses might be less governed by a labeling strategy and therefore will correspond for all pairs to the true differences between blocks. The interesting question then is whether a Spring Scale group will also encode the amount of spring extension for each block and thus give similar test responses as a Length group, or whether they will fail to encode absolute weight information and respond like a Balance Scale group.

Method

Subjects

The subjects were 30 children from three kindergartens in Basel. There were six boys and four girls in each of three experimental groups. Their ages ranged from 5.8 to 6.6 years, with a mean of 6.2 years.

Material and Procedure

Material and procedure were the same as for the three High Salience groups in Experiment 1, with the exception that only blocks A, B, C, and D were used and that training was restricted to pairs AB and CD.

Results and Discussion

A 3 (measurement condition) by 6 (test pairs) analysis of variance was carried out on the percentage of responses corresponding to the actual length or weight difference between items. This analysis revealed a significant two-way interaction: F(10, 135) = 2.24, p < .05. Table 5 shows the relevant means. As expected the data from the Balance Scale group

TABLE 5
MEAN PERCENTAGE “CORRECT” RESPONSES IN TEST PHASE OF EXPERIMENT 2

                                                  Test pair
                                       Trained                  New
Measurement condition                 AB      CD      AD      AC      BD      BC
Length                               80.0    87.5    95.0    95.0    87.0    67.0
Spring Scale                         80.0    90.0    87.5    61.2    80.0    28.5
Balance Scale                        90.0    88.7    91.2    65.0    36.0    27.5
Expected under labeling strategy    100.0   100.0   100.0    50.0    50.0     0.0

Note. Responses were scored as “correct” when they corresponded to the true difference between objects, even when subjects did not have sufficient information to make an adequate judgment, i.e., new pairs in the Balance Scale condition.

show a strong labeling effect, i.e., high scores on AD but low scores on BC: t(135) = 5.12, p < .001. In accordance with the labeling strategy, percentage scores for pairs AC and BD are close to 50%. The data from the Length group also reveal a labeling tendency. The difference between responses to pairs AD and BC is significant: t(135) = 2.20, p < .05. However, as the means from Table 5 and the significant two-way interaction indicate, this tendency is much weaker than for the Balance Scale group. In particular, the Balance Scale group’s means for the crucial test pairs AC, BD, and BC lie far below the means from the Length group (smallest difference, t(27) = 2.37, p < .05). There is also evidence that the Length group has taken information about absolute lengths into account. This group’s means for pairs AC and BD lie far above the chance level of 50% (for both pairs, t(27) > 5.0, p < .001).

The performance of the Spring Scale group strongly resembles that of the Balance Scale group on two of the three crucial test pairs. As can be seen from Table 5, means for pairs AC and BC are practically identical for the two groups. The Spring Scale group’s mean is significantly below the Length group’s mean for pair BC (t(27) = 3.06, p < .01) and for pair AC (t(27) = 2.69, p < .02). For pair BD, however, performance by the Spring Scale group is still below but close to the Length group’s performance and significantly above performance by the Balance Scale group: t(27) = 2.67, p < .02.

In summary, the results from Experiment 2 show that young children tend to label objects in a paired comparison. This tendency is largely offset in the case of length comparisons, when information about the absolute length of each item is available. In this case children spontaneously remember each object’s length. However, children largely fail to encode absolute information about objects’ weight even when appropriate information is available in the form of varying degrees of spring extension.

GENERAL DISCUSSION

The subject sample of the present studies found the tasks in Experiment 1 more difficult than did subjects in previous studies. There were of course differences in material and apparatus, which had to be adapted to the particular needs of comparing length with weight. It is, however, implausible that these changes could have made much of a difference. The High Salience Length group had conditions which closely approximated the ones for the Low Salience Length group in the study by Adams (1978). Yet performance of our sample was still below performance of Adams’s subjects, who were about 3 years younger. However, it is thought that the observed age gap in the overall performance level will not invalidate the interpretation of performance differences between experimental groups. The present results raise the question whether children’s test performance in the Bryant and Trabasso paradigm measures children’s ability to draw transitive inferences. Bryant and Trabasso (1971) intended their paradigm to show that children as young as 4 years old are able to draw such inferences. The work by Trabasso et al. (1975) made clear that in this paradigm transitive inferences are not made by applying an inference rule to learned premises, as was traditionally thought. Instead, knowledge that “longer than” is a transitive relation enters at the processing stage of encoding premises. Trabasso (1975, 1977) argued that length relationships between items are encoded as a linear spatial array of items. Implicit in this view is the assumption that children have some understanding of the transitive nature of length, for otherwise there would be no reason why they should choose a linear array for representing length. It is of central importance to this argument that the encoding of the array be the same regardless of the feedback given in training.
For, if support for a linear ordering had been obtained under visual feedback only but not under verbal feedback, then the interpretation of results would have been different. Children would have been thought to rely for their test responses on memory for actual lengths rather than on general knowledge about transitive relationships. As our introductory review of previous studies suggested, the persistently lower test performance after verbal than after visual feedback indicated that there might indeed be a difference in representation for series trained under different feedback conditions. Yet Trabasso (1977, p. 361 f.) argued that the lower test performance after verbal feedback is due to young children’s difficulty in holding purely symbolic (i.e., verbal) information in memory. The data from Experiment 1, however, cannot be explained by this hypothesis, because visual feedback was used in all conditions. Yet the difference in rate of acquisition and test performance for length versus weight series was comparable to the difference between visual versus verbal feedback for length series.

Our proposed alternative explanation accounts for both findings. We hypothesized that rapid learning and high test performance are possible in only those feedback conditions in which it becomes visually apparent that each item of the series possesses the crucial attribute (e.g., length, weight) to a certain degree. Stated in this way, our hypothesis is expressed very cautiously. The condition that “feedback must display the crucial attribute in degrees” is given only as a necessary condition for children’s rapid acquisition and high test performance. In this form the hypothesis is compatible with all existing data, since there have been no consistent reports of counterinstances, that is, of feedback conditions which do not meet this condition and in which acquisition was nevertheless rapid and test performance high. (A single exception was the study by Bryant and Trabasso (1971).) Ideally, however, one would also like to assert that the necessary condition be sufficient for good performance: i.e., “Whenever a feedback condition does display the crucial attribute in degrees, children will encode them appropriately and then perform well on the task.” Because of this, it came as a surprise that children in the Spring Scale groups did not perform as well as children in the Length groups. Experiment 2 provided a partial solution to this problem. It was found that young children tend not to pay attention to the absolute values in spring extension, whereas they do for differences in length of blocks. Thus, since subjects’ encoding of the visual experience does not capture the degrees in weight available in the Spring Scale condition, this task does not meet the hypothesized necessary condition. Hence performance in this task would now be expected to be inferior to performance of the Length groups. The data from Experiment 1 bore out this expectation.
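The logical structure of this argument can be set out schematically. The notation below is introduced here only for illustration and does not appear in the original analysis: $D$ stands for “the feedback displays the crucial attribute in degrees,” $E$ for “the child encodes these degrees,” and $P$ for “acquisition is rapid and test performance high.” Reading the refined necessary condition as $P \Rightarrow E$ is an interpretive assumption.

```latex
% Schematic restatement; D, E, P are illustrative symbols (assumptions).
\begin{align*}
\text{Hypothesized necessary condition:}\quad & P \Rightarrow D \\
\text{Desired sufficient condition:}\quad & D \Rightarrow P \\
\text{Spring Scale groups, Experiment 1:}\quad & D \wedge \neg P
  \quad \text{(contradicts } D \Rightarrow P \text{)} \\
\text{Necessary condition read in terms of encoding:}\quad & P \Rightarrow E \\
\text{Spring Scale group, Experiment 2:}\quad & D \wedge \neg E \\
\text{Contraposition of } P \Rightarrow E\text{:}\quad & \neg E \Rightarrow \neg P
\end{align*}
```

On this reading the Spring Scale results refute only the stronger sufficiency claim. The necessary condition survives, provided it is understood as a condition on what children actually encode rather than on what the apparatus makes available.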
The interesting question of why children fail to encode absolute values in the Spring Scale condition remains open. However, Staehelin (1980) collected data suggesting that the difficulty with the Spring Scale task lies in the fact that the weight of each object is not a permanently visible feature of the object but can only be made temporarily visible when the object is weighed. In her experiment children learned the series much faster when the objects’ weights were correlated with thickness (permanently visible, like an object’s length) than when same-sized objects were used, as in Experiment 1. Conversely, performance in the length condition deteriorated remarkably when same-sized “telescope” sticks were used which stretched to different lengths only when suspended (comparable to weighing same-sized blocks). Another question remains unanswered concerning the nature of the encoding used in those conditions where test performance was low. We suggested that, because direct encoding of the visual experience would either not be attempted or would not yield a coherent representation, children resort to a patchwork of inadequate strategies. Supportive evidence for this suggestion is available, but only from children’s introspective reports in Experiment 1.

Finally, the original question whether young children can reason transitively has to be raised again. If Trabasso, Riley, and Wilson’s claim were justified that all transitive attributes are uniformly encoded as a spatial array of items regardless of feedback condition, then such a representation would indicate explicit knowledge that the transitivity of relationships can be captured in a linear array. In our analysis, however, children’s ability to answer test questions correctly resides in the fact that “thinking of sticks of different length” simply reflects the transitive nature of length. There is no need to assume any more explicit knowledge about transitivity. Thus Piaget’s (1924) contention that young children are unable to “draw transitive inferences” has to be raised again, and no definite denial can be given to date. However, the studies following Bryant and Trabasso’s paper have set the question in a much more specific context of information processing requirements.

REFERENCES

Adams, M. J. Logical competence and transitive inference in young children. Journal of Experimental Child Psychology, 1978, 25, 477-489.

Bryant, P. E., & Trabasso, T. Transitive inference and memory in young children. Nature (London), 1971, 232, 456-458.

de Boysson-Bardies, B., & O’Regan, K. What children do in spite of adults’ hypotheses. Nature (London), 1973, 246, 531-534.

Furth, H. G. The operative and figurative aspects of knowledge in Piaget’s theory. In B. A. Geber (Ed.), Piaget and knowing: Studies in genetic epistemology. London: Routledge & Kegan Paul, 1977.

McGonigle, B. O., & Chalmers, M. Are monkeys logical? Nature (London), 1977, 267, 694-696.

Piaget, J., Inhelder, B., & Szeminska, A. La géométrie spontanée de l’enfant. Paris: Presses Universitaires de France, 1948.

Riley, C. A. The representation of comparative relations and the transitive inference task. Journal of Experimental Child Psychology, 1976, 22, 1-22.

Riley, C. A., & Trabasso, T. Comparatives, logical structures, and encoding in a transitive inference task. Journal of Experimental Child Psychology, 1974, 17, 187-203.

Staehelin, C. Transitive Schlussfolgerungen und das Problem der Repräsentation bei Kindern im Alter von 7 bis 8 Jahren. Unpublished thesis (“Lizentiatsarbeit”), University of Basel, 1980.

Trabasso, T. Representation, memory, and reasoning: How do we make transitive inferences? In A. D. Pick (Ed.), Minnesota symposia on child psychology. Minnesota: Univ. of Minnesota Press, 1975. Vol. 9.

Trabasso, T. The role of memory as a system in making inferences. In R. V. Kail, Jr., & J. W. Hagen (Eds.), Perspectives on the development of memory and cognition. Hillsdale, N.J.: Erlbaum, 1977.

Trabasso, T., Riley, C. A., & Wilson, E. G. The representation of linear order and spatial strategies in reasoning: A developmental study. In R. J. Falmagne (Ed.), Reasoning: Representation and process in children and adults. Hillsdale, N.J.: Erlbaum, 1975.

RECEIVED: January 4, 1980; REVISED: September 25, 1980.