Complexity of geometric inductive reasoning tasks




Intelligence 30 (2001) 41 – 70

Complexity of geometric inductive reasoning tasks
Contribution to the understanding of fluid intelligence

Ricardo Primi*

Centro de Ciências Humanas e Sociais, University of São Francisco, Rua Alexandre Rodrigues Barbosa, 45, CEP 13251-900 Itatiba, São Paulo, Brazil

Received 26 January 2000; received in revised form 30 May 2000; accepted 8 August 2000

Abstract

Studies of the complexity of geometric inductive matrix items used to measure fluid intelligence (Gf) indicate that such complexity may be related to (a) an increase in the number of figures, (b) an increase in the number of rules relating these figures, (c) the complexity of these rules, and (d) the perceptual complexity of the stimulus. One limitation of these studies is that complex items present all of these characteristics simultaneously. Thus, no information regarding their relative importance is available, nor is it clear whether all these factors have a significant effect on complexity. In the present study, two matrix tests were created by orthogonally manipulating these four sources of complexity, and the results show that perceptual organization has the strongest effect, followed by the increase in the amount of information (figures and rules). These results suggest that Gf is most strongly associated with that part of the central executive component of working memory that is related to the controlled attention processing and selective encoding. © 2001 Elsevier Science Inc. All rights reserved.

Keywords: Cognitive processes; Inductive and deductive reasoning; Fluid intelligence; Item content; Item response theory; Intelligence testing; Item analysis

* Tel.: +55-193-295-0130. E-mail address: [email protected] (R. Primi).
0160-2896/01/$ – see front matter © 2001 Elsevier Science Inc. All rights reserved. PII: S0160-2896(01)00067-8

1. Introduction

Initially defined by Cattell (1941) and further elaborated by Horn and Cattell (1966), fluid intelligence (Gf) is one of the broad factors of intelligence (Carroll, 1993a, 1993b, 1997; Horn & Noll, 1997). It is a mental activity that ‘‘involves making meaning out of confusion;


R. Primi / Intelligence 30 (2001) 41–70

developing new insights; going beyond the given to perceive that which is not immediately obvious; forming (largely nonverbal) constructs which facilitate the handling of complex problems involving many mutually dependent variables’’ (Raven, Raven, & Court, 1998, p. G4). In the last four decades, the basic component processes that comprise this complex mental activity have been under study by various cognitive psychologists (Bethell-Fox, Lohman, & Snow, 1984; Carpenter, Just, & Shell, 1990; Embretson, 1995, 1998; Evans, 1968; Goldman & Pellegrino, 1984; Gonzales Labra, 1990; Gonzales Labra & Ballesteros Jimenez, 1993; Green & Kluever, 1992; Hornke & Habon, 1986; Hunt, 1974; Mulholland, Pellegrino, & Glaser, 1980; Primi, 1995; Primi & Rosado, 1995; Primi, Rosado, & Almeida, 1995; Rumelhart & Abrahamson, 1973; Sternberg, 1977, 1978, 1980, 1984, 1986, 1997; Sternberg & Gardner, 1983). This research has tried to identify the cognitive processes people use to solve geometric analogy tasks which, according to Marshalek, Lohman, and Snow (1983), are the prototype tasks to assess Gf. Basically, these studies (a) identify the basic component processes and the strategies that organize them in a complex chain, (b) investigate the correlations between component and traditional psychometric measures, (c) discover complexity factors underlying the tasks, and (d) simulate problem-solving behavior using artificial intelligence.

Fig. 1 displays four examples of the geometric 3 × 3 matrix problems used in the present study. Each of these problems consists of an organized set of geometric figures obeying either two or four rules; the subject must discover them so that he or she can generalize from them to decide which of the eight options is the most appropriate to fit into the blank space. The basic components of the problem-solving behavior involved in such problems can be organized into three stages.
The first stage is associated with the creation of a mental representation of the attributes of the problem and the rules relating these attributes. In the literature, these two aspects have received various labels, including encoding and inference (Sternberg, 1977), perceptual and conceptual analysis (Carpenter et al., 1990), pattern comparison and decomposition, and transformational analysis and rule generation (Mulholland et al., 1980). The second stage is associated with the recognition of the parallels between these rules and a new, but analogous, situation. This component has been variously denominated as mapping (Sternberg, 1977), perceptual and generalized conceptual analysis (Carpenter et al., 1990), and rule comparison (Mulholland et al., 1980). The third stage is associated with the application of the rules to create an appropriate representation to fill the blank, and the selection of an answer from the options provided. The terms used to denominate this process of representation generation are application, comparison and response (Sternberg, 1977), and response generation and selection (Carpenter et al., 1990). Recently, research has suggested that Gf is associated with working memory capacity (Duncan, Emslie, & Williams, 1996; Embretson, 1995, 1998; Engle, Tuholski, Laughlin, & Conway, 1999; Hunt, 1996, 1999; Jurden, 1995; Kyllonen & Christal, 1990; Prabhakaran, Smith, Desmond, Glover, & Gabrieli, 1997). According to Baddeley and Hitch (1994), working memory capacity can be decomposed into memory buffers responsible for storing speech-based information and visuospatial information (phonological loop and visuospatial sketchpad) as well as a central executive component responsible for the coordination of the basic components and attentional control. Engle et al. (1999) use the term short-term memory to denote memory buffers and suggest that they are related to the amount of information that can be maintained active at any one


Fig. 1. Examples of experimental test items. The first and second characters represent the number of elements and number of rules, respectively; the third and fourth constitute a code representing the type of transformation (SI = simple, SP = spatial, CX = complex, CO = conceptual); and the fifth indicates the type of organization (H: harmonic, N: nonharmonic).

time; the central executive component involves the ability to maintain a representation active in the face of interference and distraction by controlling the focus of attention (controlled attention). These authors use structural equation modeling to compare the correlations of memory tasks with Gf and conclude that the central executive component drives the relationships between memory tasks and Gf. Hence, Gf is especially related to the central executive component or controlled attention.


In the past two decades, a new approach for test construction has been developed that integrates the procedures of cognitive psychology with those of psychometric methods (Embretson, 1983, 1985a, 1985b, 1994, 1995, 1996, 1998; Frederiksen, Mislevy, & Bejar, 1993; Whitely, 1980a, 1980b, 1980c). Embretson (1983, 1994, 1998) has proposed a two-part distinction for construct validation: construct representation, which involves the identification of cognitive components underlying task performance, and nomothetic span, which concerns the specification of the network of test score correlations with other constructs. Embretson argues that traditional methods of construct validation involve only the latter, which gives meaning to test scores by linking them with other measures (nomothetic span), whereas new advances in cognitive psychology suggest that the meaning of measures can also be established by a direct understanding of the processes, strategies, and knowledge involved in problem-solving behavior for individual items (construct representation).

An important aspect of construct representation is the determination of item complexity. This involves developing a theory by proposing a cognitive model for solving items, identifying the basic capacities involved in item performance and the item characteristics that place demands on these capacities, and finally producing items that vary in these characteristics. The theory is tested by comparing expected item complexity with empirical data. Although such studies are concerned with the explanation of variability of item complexity, as items are summed to produce test scores and as item characteristics are understood to exert differential demands on basic capacities, what is eventually being explained is the test score variability in reference to individual differences in basic cognitive capacities. In fact, in Item Response Theory, item complexity and ability are found on the same scale.
Carroll (1993a, 1993b) proposed a similar procedure, although he called it behavioral scaling.

While cognitive studies have made important findings about the nature of Gf, only a few studies use the procedure of construct representation to link item complexity with the findings of cognitive psychology. As will be discussed later, the few studies in the literature leave some important questions unanswered. In the light of new research on Gf, such as that involving the search for a link between Gf and brain mechanisms (Crinella & Yu, 1999; Duncan et al., 1996; Prabhakaran et al., 1997) and that involving a possible explanation for the increase in Gf test scores in the past few decades (Flynn, 1985, 1998; Neisser, 1998), an understanding of item complexity will be important to be able to identify exactly which aspect of the construct of Gf is being measured by existing tests.

The present study was designed to investigate the source of the complexity of geometric matrix items using a controlled experiment to provide a solid basis for inferences about the relative importance of these components. The following section summarizes the research already available concerning the factors involved in the complexity of geometric matrix items and links these factors to Gf capacities. The limitations of these previous studies will then be discussed and the goals of this empirical study presented.
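The remark in the Introduction that, in Item Response Theory, item complexity and ability are found on the same scale can be made concrete with a small sketch. It assumes the one-parameter logistic (Rasch) model, which this section does not itself specify; the function name and the numeric values are hypothetical and chosen only for illustration.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the Rasch (1PL) model.

    Both arguments are expressed in the same unit (logits), which is
    what "on the same scale" means here. Assumption: the Rasch model is
    used purely as an illustration, not as the model of the source.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical abilities compared against one hypothetical item difficulty.
item_difficulty = 0.0
for ability in (-1.0, 0.0, 1.0):
    p = rasch_probability(ability, item_difficulty)
    print(f"ability={ability:+.1f}  P(correct)={p:.3f}")
```

Under this assumed model, a person whose ability equals the item difficulty has a 50% chance of answering correctly, so an item's position on the scale can be read directly as the ability level it demands.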

2. Complexity factors and essential capacities of Gf

Complexity factors are those features of a task that define its complexity. These features are intrinsically related to cognitive capacities that individuals must possess in order to deal


with task demands and solve the problem that is proposed by a particular task. Each complexity factor constitutes a demand for one or more of the essential cognitive capacities that comprise Gf. In the literature, four main factors influencing item complexity are considered: (a) number of elements, (b) number of transformations or rules, (c) type of rules, and (d) perceptual organization. In this section, each one of these variables will be considered, and a link with the Gf capacities will be proposed.

2.1. Amount of information: number of elements and rules

The number of elements refers to the number of geometric figures or attributes in an existing matrix problem, while the number of rules refers to the number of relationships existing among the different elements or attributes. The role of these variables in the complexity of geometric inductive reasoning tasks has been investigated mainly by Mulholland et al. (1980), and also by Sternberg (1977) and Sternberg and Gardner (1983). Mulholland et al. (1980) created four-term geometric analogies in true–false format (A is to B as C is to D) by systematically manipulating the number of elements (1, 2, and 3) and the number of transformations (0, 1, 2, and 3). They observed that when the numbers of elements and transformations increase simultaneously, the processing time increases beyond what a simple additive function would predict. In their words, ‘‘as both elements and transformations increase, then a solution may require substantial external memory that is unavailable, thus creating a need for alternative processing strategies that are time consuming with respect to item solution’’ (p. 265). Based on these data, Mulholland et al. proposed the idea of a memory management process for solving complex items. They also found that the most difficult items were those involving multiple transformations of single elements.
In such cases, a person has to perform serial operations on stored representations, as well as storing the results and the order of the operations performed. These two variables are both associated with the amount of information that must be processed in the working memory and are consequently the basic sources of working memory load. An increased amount of information requires a larger memory buffer to hold all the bits of procedural and/or declarative information (short-term memory) involved, while simultaneously requiring a more efficacious organization of goals and encoded elements (central executive component).

According to Baddeley and Hitch (1994), the working memory is a system that simultaneously combines storage and processing. The inclusion of processing in working memory breaks with the traditional concept of storage only, which was implied in the concept of short-term memory or short-term apprehension-retention (Horn, 1986, 1991; McGrew, Werder, & Woodcock, 1991). Woodcock (1990, p. 247), commenting on the digit span subtest of the WISC, affirmed that ‘‘A numbers reversed task appears to require both short-term memory and fluid reasoning. A numbers forward task, however, is a purer measure of short-term memory than numbers reversed.’’ This affirmation has led to the conclusion that tasks that require only storage are not as closely related to Gf as are tasks that require both storage and processing.

Various measures of working memory have been created including the ABC Numerical Assignment Test (Kyllonen, 1994), which requires individuals to supply the number assigned


to C, in problems such as the following: A = C + 3, C = B/3, and B = 9. Using such tasks, Kyllonen and Christal (1990) found correlations around .88 with reasoning tasks, leading to the conclusion that reasoning seems to be little more than working memory. Both the reversed numbers task and the ABC Numerical Assignment Test, as well as geometric inductive matrix problems, require information to be encoded, retained, and manipulated or transformed in the working memory to reach a solution. For instance, in the solving of geometric matrix problems, various aspects requiring integrated storage and processing are involved. First, the problems include more than one element or attribute, and all of these attributes must be stored while visual perceptual processing is controlled. A similar kind of dual processing is required in mapping, when information in the working memory is used as a guide for perception. Second, when stored attributes prove to be irrelevant, they must be discarded, which requires a transformation of the information stored in working memory. Moreover, to create a solution, it is necessary to apply a rule stored in working memory to produce a new representation and then to store that result in working memory until the various options can be processed. Salthouse (1994, p. 536) has distinguished three components of working memory: ‘‘(a) storage capacity, reflecting the ability to preserve relevant information; (b) processing efficiency, representing the ability to perform required processing operations rapidly; and (c) coordination effectiveness, corresponding to the ability to monitor and coordinate simultaneous activities.’’ These components can be organized into two general capacities: structural (storage capacity) and operational (processing efficiency and coordination effectiveness). 
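The ABC Numerical Assignment Test example quoted above (A = C + 3, C = B/3, and B = 9) illustrates why such tasks combine storage with processing: each value has to be computed, held, and reused. A minimal sketch follows; the dictionary representation and the `resolve` helper are hypothetical illustrations, not part of the test itself.

```python
from typing import Callable, Dict, List, Optional

# The three assignments from the example, written as functions of the
# values that have already been resolved (illustrative representation).
equations: Dict[str, Callable[[Dict[str, float]], float]] = {
    "A": lambda v: v["C"] + 3,
    "C": lambda v: v["B"] / 3,
    "B": lambda v: 9,
}
# Which variables each assignment needs before it can be evaluated.
dependencies: Dict[str, List[str]] = {"A": ["C"], "C": ["B"], "B": []}


def resolve(name: str, values: Optional[Dict[str, float]] = None) -> Dict[str, float]:
    """Resolve `name`, first resolving and storing whatever it depends on."""
    values = {} if values is None else values
    for dep in dependencies[name]:
        if dep not in values:
            resolve(dep, values)
    values[name] = equations[name](values)
    return values


values = resolve("A")
print(values["C"])  # the number assigned to C
print(values)       # every intermediate value that had to be held
```

Even in this three-line problem, B must be retained while C is computed, which is the kind of simultaneous storage and transformation the text attributes to working memory tasks.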
Salthouse, Babcock, and Shaw (1991) and Salthouse, Legg, Palmon, and Mitchell (1990) have created numerical, verbal, and visual measures of working memory by manipulating variables associated with structural and operational capacities. Although the main concern of these studies was the investigation of decline in working memory capacity due to age, they show that such measures show moderate correlations with Gf tests like Raven’s Progressive Matrices. Several studies (Duncan et al., 1996; Embretson, 1995; Engle et al., 1999; Kyllonen, 1994; Kyllonen & Christal, 1990; Mulholland et al., 1980; Woodcock, 1990) have presented evidence that the most important facet of working memory associated with Gf is the central executive component (Salthouse’s, 1994, operational capacity). In simpler tasks where the memory load is within the average maximum number of bits of information that can be maintained active, it may be that only the short-term memory buffer is involved, but in complex tasks, such as most types of inductive geometric matrix items, these limits are overstepped, and a strategy for dealing with memory overload must be implemented. Such a strategy will result in the implementation of a complex mental activity by the central executive component of working memory, which is responsible for the assembly of numerous elementary comparison processing loops (Klauer, 1990; Marshalek et al., 1983; Snow, Kyllonen, & Marshalek, 1984). The importance of the existence of these loops in the solution process for Raven’s Advanced Progressive Matrices was identified by Carpenter et al. (1990) by analyzing the participants’ eye movements while they were solving the problems. They noted that the analytical decomposition of the problem into smaller subproblems, with the incremental organization of the solution to each subproblem into a global strategy, was the most


noticeable facet of the solution process. This process requires the ability ‘‘to successfully generate and manage their problem-solving goals in working memory. . . . The process of spawning subgoals from goals, and then tracking the ensuring successful and unsuccessful pursuits of the subgoals on the path to satisfying higher level goals’’ (Carpenter et al., 1990, p. 428). This global process was called goal management. One can consider the organization of goals in a hierarchical structure to be a strategy for dealing with the limited capacity of working memory because, as Carpenter et al. (p. 428) have observed, ‘‘goal management enables the problem solver to construct a stable intermediate form of knowledge about his or her progress’’ making it possible to keep a representation active in face of interference, as Engle et al. (1999) have argued. Kyllonen and Christal (1990, p. 428) also recognize the importance of a reasoning strategy in working memory when they propose that ‘‘an additional important determinant of working memory capacity is the degree to which buffer storage can be managed through a kind of reasoning process.’’ Another important reasoning strategy to cope with working memory load that has been found to be associated with Gf is adaptive flexibility. Bethell-Fox et al. (1984) have demonstrated that analogical problems can be solved by two strategies: constructive matching and response elimination. Constructive matching refers to the creation of a mental representation of an answer to a problem and its comparison with existing options, while response elimination refers to loops involved in the creation of partial solutions, usually based on a single attribute, and the elimination of incorrect options. These authors found that constructive matching was used more frequently than response elimination for simple items and by subjects with high ability; and they concluded that working memory demands caused a strategy shift. 
Hence, one important aspect of Gf that may also be related to the central executive component of working memory is the flexibility to alternate from one strategy to another in order to handle increased working memory loads when problems become more complex.

In summary, an increase in the amount of information (number of elements and rules) will constitute an increased demand on the structural and functional working memory components, that is, the size of the working memory buffers and the implementation of strategies to organize and optimize information in the available space, respectively. This distinction is similar to that made in the working memory studies of Just and Carpenter (1992) and Salthouse et al. (1990, 1991). As the amount of information in complex tests almost always surpasses the structural capacity of working memory, the most important resource is the central executive component, since this is responsible for the organization of the flow of information to cope with memory overload.

2.2. The nature of relationships: type of rules

Type of rules is another source of item complexity considered in the literature; this refers to the nature or content of the relationships or transformations applied to elements or attributes (see Fig. 2). In their study of Raven’s Advanced Progressive Matrices, Carpenter et al. (1990) classified problem rules using an adaptation of the Jacobs and Vandeventer (1972) taxonomy, which defines 12 categories, based on an analysis of 201 intelligence tests. In Fig. 2, in the first three columns, three different taxonomies are presented. The first column presents the


Fig. 2. Examples of values for type of relationship and the link to previous research.

taxonomy used in the present study, while the second presents that of Carpenter et al., and the third that of Jacobs and Vandeventer. The final column displays examples of each type of rule identified in these studies. In this column, each row with three geometric figures exemplifies one particular type of transformation.


The taxonomy of Carpenter et al. (1990) posits four main types of rules: (a) quantitative pairwise progressions involving the increment or decrement of an attribute between adjacent elements; (b) figure addition and subtraction, involving the production of an element by the addition or subtraction of the other two elements; (c) distribution of three values, in which elements are instances or values of a conceptual attribute; and (d) distribution of two values, in which element subparts appear in only two of the three elements in the row. Those who are familiar with the Carpenter et al. taxonomy will note that this system includes an additional category, constant in a row, which refers to cases where no attribute change between elements is involved. However, since this type of rule was not used in the present study, it was not included in Fig. 2. The Jacobs and Vandeventer (1972) taxonomy, on the other hand, expands the quantitative pairwise progression and uses different names for the other three.

In the present study, these rules were reorganized (column 1 of Fig. 2). The first two levels separate the quantitative pairwise rules involving increment or decrement transformations (simple) from those involving spatial transformations (spatial). The third level (complex) includes figure addition and subtraction, distribution of three values, distribution of two values, and attribute addition. The fourth level (conceptual) is actually a subset of the third involving an expansion of the distribution of three values to simplify computer manipulation (see Materials and Procedure).

The effect of type of rule on the complexity of geometric analogies was demonstrated by Whitely and Schneider (1981), and on the complexity of matrix items by Carpenter et al. (1990), Embretson (1998), Green and Kluever (1992), and Hornke and Habon (1986). The Carpenter et al.
analysis divided these rules according to their level of abstraction, from the easiest to the hardest: constant in a row, quantitative pairwise progression, figure addition and subtraction, distribution of three values, and distribution of two values. In items involving pairwise progression rules, inference only requires basic perceptual comparison of two elements in order to induce a rule that can be generalized to the other elements. In items involving other types of rules, inference requires the simultaneous consideration of all elements in order to induce a rule. At the same time, this rule must be based on conceptual similarity rather than perceptual similarity. For instance, when an individual has to discover that there are three shapes when the three figures in a row of a matrix are different, they must be considered as instances of a single concept because, on a concrete level, they are perceptually different.

Carpenter et al. (1990) have found evidence that individuals employ serial rule induction and consider simpler rules before more complex ones. This would imply that, for items composed of abstract rules (e.g., distribution of two values), individuals would first try to solve the problem by searching for simple relationships (perceptual similarities) before considering more complex ones (conceptual similarities). It is interesting to note that, in such a serial induction model, the amount of information (number of rules) is always correlated with the type of rule; even when an item involves only the single complex rule ‘‘distribution of two values,’’ for example, individuals will have to try at least four rules before arriving at the correct one (constant in a row, quantitative pairwise progression, figure addition and subtraction, and distribution of three values). Embretson (1998) thus proposes that different types of rules, which she called levels of relationship, will impose differential


demands on the working memory, hence the type of rule will impose the same demand on Gf capacities that an increase in the amount of information will. In addition to the fact that type of rules is linked to information load, it is also linked to abstraction. Abstraction makes it possible to construct representations based on analytically decomposed fragments of perception thus allowing the reorganization of natural groupings formed by the concrete characteristics of a stimulus. In terms of cognitive processing, abstraction may be a result of the process of selective attention, associated with the central executive component of working memory. Hence, selective attention plays a necessary role in the creation of abstract representations; the two should be considered as aspects of a single unitary concept. Selective attention has been divided into three distinct categories depending on the direction of the flow of information (Sternberg, 1986). When information must be activated in long-term memory for transferal to working memory, the process is called selective comparison; when it moves from stimulus to working memory, the process is called selective encoding; and when it is selected from the working memory itself, the process is called selective combination. This distinction is important, because different types of analogical tasks are classified according to the type of selective attention required. Geometric analogies require largely selective encoding, while verbal analogies require selective comparison and deductive tasks, selective combination. Depending on the type of rule involved in such geometric problems the demand for selective encoding will be greater because of the presence of various irrelevant attributes that must be ignored in the inference process. For example, when figures with the same color are grouped, the difference in shape must be ignored. 
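The serial rule induction account described earlier in this section, in which simpler rules are considered before more complex ones, can be sketched as a walk through an ordered list. The ordering is the easiest-to-hardest sequence attributed to Carpenter et al. (1990) in the text; the function name is hypothetical, and the sketch is only a reading of the verbal description, not the authors' simulation model.

```python
from typing import List

# Easiest-to-hardest ordering reported in the text for Carpenter et al. (1990).
RULE_ORDER: List[str] = [
    "constant in a row",
    "quantitative pairwise progression",
    "figure addition and subtraction",
    "distribution of three values",
    "distribution of two values",
]


def rules_tried_before(target_rule: str) -> List[str]:
    """Rules a strictly serial solver would test before reaching `target_rule`."""
    return RULE_ORDER[: RULE_ORDER.index(target_rule)]


earlier = rules_tried_before("distribution of two values")
print(len(earlier))
for rule in earlier:
    print("tried and rejected:", rule)
```

For an item governed only by distribution of two values, the sketch lists the four simpler rules that would be tested first, which is why type of rule and amount of information are confounded in such a model.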
Furthermore, complex items in which elements are grouped by conceptual instead of perceptual similarity require the continued activation of concepts while attention is focused on selected parts of stimuli. This process may generate an additional demand for the controlled attention of the executive component of working memory.

2.3. Perceptual organization

The final variable, perceptual organization, has been the focus of the least study. It involves the gestalt principle of perceptual grouping of visual perceptions, such as grouping by proximity, similarity, common region, and continuity (Mack, Tang, Tuma, & Rock, 1992; Palmer, 1992; Rock & Palmer, 1990). These principles can either increase or decrease the complexity of the problem, as was demonstrated by Primi (1995) and Primi and Rosado (1995). Perceptual organization is related to ambiguity, contradiction among perceptual and conceptual groupings, and the number of misleading cues. As can be seen in the sample items in Fig. 1, the two items in the first column are relatively less complex than those in the second column. The term harmony has been used here to refer to the visual esthetics of such combinations, in analogy to harmony in music, where the sound resulting from the simultaneous playing of certain musical notes is more pleasing to the ear than that resulting from other combinations, that is, harmony here is used to refer to the esthetics resulting from congruency in the arrangement of parts (Webster’s New Collegiate


Dictionary, 1981). Visually harmonious items display perceptual and conceptual combinations that represent congruent relationships between elements, whereas nonharmonic items tend to portray competitive or conflicting combinations between visual and conceptual aspects that must be dealt with in reaching a solution.

Fig. 3 helps to explain how item features can be manipulated to create these two levels of perceptual complexity. The top row reproduces the first row of item 22SIH, a harmonic item, which can be seen in full form in Fig. 1. The second, third, and fourth rows show the successive manipulations that were made on the elements in the top row, to produce item 22SIN, which corresponds to the nonharmonic version of item 22SIH. This last row is thus the same as the first row of item 22SIN. The first row of Fig. 3 lists the rules that are involved in item 22SIH, that is, shape and shading transformations (two pairwise quantitative rules, see Fig. 2). The second row shows the result of the transformation of the second element into a form similar to that found in the first element (circles). This manipulation increases the likelihood of perceptual groupings based on similarity of form. The third row shows the result of the second phase of

Fig. 3. Example of transformations applied to harmonic items to create nonharmonic items.


manipulation, which added color to the first element. This manipulation was also designed to increase the likelihood of perceptual groupings based on similarity of ‘‘color.’’ The fourth row shows the result of the third phase of manipulation, which interrupts the alignment of the elements. This transformation interrupts the natural perceptual continuity of the elements, which makes it difficult to identify which elements should be grouped. Similar manipulations were used on the second and third rows of item 22SIH to produce item 22SIN. It is important to note that, in terms of number of elements, number of rules and type of rules, these two items are identical. The difference is due to perceptual complexity. In previous studies, it has been found that such transformations have a remarkable impact on the complexity of nonverbal inductive reasoning tasks (Primi, 1995; Primi & Rosado, 1995). In their study of Raven’s Advanced Progressive Matrices, Carpenter et al. (1990) observed the existence of certain misleading cues that increase the complexity of the process of finding a correspondence among elements or of grouping elements that are governed by the same rule. This complexity was found in items composed of multiple rules, probably because the involvement of several rules also implies the presence of several superposed elements that form perceptually complex figures. In perceptually complex items, the likelihood of the formation of irrelevant groups of elements based on perceptual features is increased. Hence, such items impose demands on selective encoding and abstraction, because certain perceptual groupings must be ignored and others based on more abstract attributes considered. They also impose demands on goal management associated with the central executive component, as the irrelevant groupings will have to be discarded, and it will be necessary to operate on the stored representations or in fragments of the perceptual field that have proved to be irrelevant. 
Moreover, they may impose demands on the visual memory component of working memory (visual scratch pad) because a discontinued group of elements implies the need to remember their positions in visual space to orient the subsequent selective encoding process through the indication of the remaining elements that need to be considered.

2.4. The differential effect of complexity factors

In summary, there is some evidence that all complexity factors of geometric matrix inductive tasks have an effect on the central executive component of working memory. Hence, in general, individual differences in Gf are closely related to the capacities associated with this component of working memory. Although all complexity factors affect the same component, the manner in which they do so appears to be different. It may be possible to identify two groups of complexity factors, one involving the number of elements and number of rules and the other perceptual organization and type of rules. The first group encompasses variables correlated with the process of handling simultaneous bits of information in short-term memory (goal management), while the second is composed of variables correlated with the simultaneous control of visual processing of selective encoding (abstraction) and the management of information in short-term memory. The second group may also be associated with goal management when irrelevant information is present.

R. Primi / Intelligence 30 (2001) 41–70

No evidence for a differential impact from these two groups of variables is found in the literature. The Carpenter et al. (1990) analysis of Raven’s Advanced Progressive Matrices, for example, shows that item complexity is basically correlated with the number and type of rules, but these occur simultaneously in complex items. In comparison with easy items, the complex items involved more rules (usually three or four), more complex rules (distribution of two values, e.g., Fig. 2) and misleading cues complicating the process of finding correspondences. Hence, this collinearity precludes the possibility of identifying unique contributions to item difficulty. Two studies in the literature were designed to create matrix items with geometric figures controlling item features to gain insights into their effects on item complexity. The first was a study by Hornke and Habon (1986), who developed an item bank with 616 matrix items involving two elements and two rules each, that is, with no variation in the amount of information but varying in relation to the other three features (type of rule, direction of relationships, and perceptual organization). The first variable included eight levels: identity, addition, subtraction, intersection, unique addition, seriation, variation of open gestalts, and variations of closed gestalts. Since these authors created their taxonomy from the work of Jacobs and Vandeventer (1972) and Ward and Fitzpatrick (1973), all but one of the relationships is accounted for in the examples in Fig. 2. Intersection is the only new transformation, and it can be described as the inverse of distribution of two values or unique addition. Unique addition items can be solved by superimposing two elements, examining the part that does not intersect, and composing the third element with these parts. In intersection, the process is the same, except that the third element is composed of the parts that do intersect in the first two elements. 
The final two types of rules, variation of closed and open gestalts, correspond to distribution of three values, except for closedness (squares, circles, etc.) or openness (arrows, lines, etc.) of the elements. The second variable was the direction of possible relationships by row, by column, or by row and column, and the third was the arrangement of the elements such that (a) elements could clearly be perceptually separated components, (b) they could be integrated, that is, they consisted of two attributes of a unitary element or geometric figure, such as shape and color, and (c) they could be embedded to appear perceptually as a unitary element although separate rules are involved in the variation of the subparts. Based on the discussion of complexity factors in this paper, this variable could be considered to involve perceptual complexity. The proportion of variance in item difficulty accounted for by these variables was .40. A visual inspection of the Hornke and Habon (1986) results shows that items composed of intersection, unique addition, and with embedded elements were the most complex. The type of rule had a significant effect on complexity, but the perceptual organization of the elements had a stronger effect. The items whose elements were perceptually separate were easier than those whose elements formed a whole figure requiring an analytic process to dissociate the elements. Certain questions can be raised about this study and the low predictability achieved. First, in the taxonomy of rule types, various categories for similar types of rules were created (e.g., intersection with unique addition, addition with subtraction, variations of closed gestalts with variation of open gestalts). Second, the processing model postulated a distinct cognitive operation associated with each category in their taxonomy, but since some of these categories

were very similar, they probably involved similar cognitive operations; the potential of the taxonomy for explaining item complexity is therefore reduced, especially since it may not have exhausted all the variance in the complexity of cognitive operations involved in Gf. Third, and most important, the authors did not vary the amount of information involved, although this constitutes an important source of difficulty. The most relevant aspect of this study, however, was the operationalization of perceptual complexity and the demonstration of its effect on item difficulty. The second and more recent study was conducted by Embretson (1995, 1998), who based her items on the structure identified by Carpenter et al. (1990) for the Raven’s Advanced Progressive Matrix items; this structure involves variations in the number and type of rules, as discussed above. Basically, Embretson (1998) coded item structure by using two variables, the first combining number and type of rules and the second indicating the perceptual complexity of the stimulus. For the first variable, items were scored by a weighted sum of the rules involved (1 = identity; 2 = quantitative pairwise progressions; 3 = figure addition/subtraction; 4 = distribution of three values; 5 = distribution of two values). The second variable consisted of three dummy codes indicating whether the elements in an item were overlaid (object overlay), whether they were combined appearing as a unitary object (object fusion), and whether corresponding elements were perceptually altered (object distortion). Embretson observed that although this last variable did have a significant effect on item complexity, the first variable had a correlation of .71 with item complexity. Based on these results she concluded that working memory was the most important source of item complexity in Gf tests. Unfortunately, the collinearity of the independent variables makes this conclusion questionable. 
It is well known that in multiple regression analysis it is difficult to identify which of the independent variables is the most important in predicting the dependent variable when they are correlated (Howell, 1997). In fact, the correlation between the number of rules and the presence of the distribution of two values (the most complex rule) in the 36 items of the Advanced Progressive Matrices (the structure used in the Embretson study) is .64 (N = 36, P < .001; Primi & Castilho, 1996). As was suggested earlier, the number and types of rules may have a differential impact on the central executive component of working memory, with the former more closely associated with goal management and the latter with abstraction and selective encoding. The conclusions of the study give importance to the first component although the second could also have been supported. With the data available, it is not possible to know if items are more difficult because they require a more efficient management of goals in short-term memory, because they require a difficult controlled visual process of selective encoding, or because they require a combination of the two capacities. In fact, in another study using the latent response model by Maris (1995), Embretson (1995) estimated separate ability scores for working memory and general control processing. In this earlier study, the meaning attributed to working memory differs from that adopted in the present study as it was linked to the number and type of rules exerting demands on goal management. On the other hand, general control processing was linked to representational variability or ambiguity in the meaning of the item stimuli (perceptual complexity) exerting demands on strategy planning, monitoring, and evaluation. The results of this earlier study

thus suggest a more important role for general control processing in the prediction of individual differences in Gf. In summary, all these studies suggest the importance of the role of variables associated with the amount of information (goal management), rule complexity, and perceptual organization (selective encoding and abstraction), but they differ in the importance attributed to the specific sources of item complexity. The Hornke and Habon (1986) study emphasizes the importance of perceptual organization in association with abstraction and selective encoding, while the Embretson (1995, 1998) studies fail to show conclusive results because either the role of goal management or that of abstraction capacity (or both) in the prediction of item complexity could have been supported. It thus seems that the role of these variables is not yet fully understood. One problem is that these two studies present limitations precluding the precise identification of the effect of each source of difficulty. The former fails to control for the amount of information, while the latter fails to control for the collinearity of the structural variables. The present study was thus designed to investigate the effect of each of the four potential sources of item difficulty discussed earlier. Considering that experimental control of these variables is necessary in order to investigate the unique contribution of each, the main goal of the present study was to create an item structure that represented an orthogonal combination of these four main sources of item complexity so that their impact on item difficulty and reaction time could be identified. All four variables were expected to play a role in item complexity; moreover, the variables associated with individual differences in goal management and selective encoding and abstraction capacities were expected to furnish further insights into the relative contribution of each to item complexity, and consequently, to an explanation of Gf.
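The collinearity argument can be illustrated with a minimal sketch, assuming two standardized predictors and the textbook formulas for standardized regression weights. The zero-order correlations with item difficulty (.60 and .50) are hypothetical values chosen for illustration; only the predictor intercorrelation of .64 is taken from the text. The function name is likewise an assumption.

```python
def standardized_betas(r1y, r2y, r12):
    """Standardized weights and R^2 for two predictors, from correlations alone."""
    denom = 1.0 - r12 ** 2
    b1 = (r1y - r2y * r12) / denom
    b2 = (r2y - r1y * r12) / denom
    r_squared = b1 * r1y + b2 * r2y
    return b1, b2, r_squared


# Hypothetical zero-order correlations with item difficulty (not from the paper).
R1Y, R2Y = 0.60, 0.50

# Orthogonal predictors: each weight equals its zero-order correlation,
# so R^2 splits cleanly into r1y^2 + r2y^2.
orthogonal = standardized_betas(R1Y, R2Y, 0.0)

# Correlated predictors (r12 = .64, the value cited in the text): the same
# zero-order correlations now yield different weights, and the shared
# variance cannot be attributed uniquely to either predictor.
collinear = standardized_betas(R1Y, R2Y, 0.64)

for label, (b1, b2, r2) in (("orthogonal", orthogonal), ("collinear", collinear)):
    print(f"{label}: beta1={b1:.3f} beta2={b2:.3f} R2={r2:.3f}")
```

Under these assumptions, the orthogonal case is the only one in which each predictor's contribution can be read directly from its own weight, which is the motivation for the orthogonal item structure adopted in the present study.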

3. Method

3.1. Participants

The participants were 313 undergraduate students, approximately 68% females, with ages ranging from 17 to 52; about 75% were between 17 and 22 (mean = 21.9, S.D. = 5.8).

3.2. Materials and procedure

Sixty-four items were developed and divided into two sets with 32 items each (Forms A and B). These forms were structurally identical and were applied to two independent subgroups of subjects (Form A: n = 122 and Form B: n = 191). Each item consisted of a 3 × 3 matrix with an empty cell and eight response alternatives. In the present paper, each geometric figure within a cell is considered to be a term, with the elements being subparts of a term. The cell matrix is referred to using the letters A to I, with the alternatives labeled I1 to I8. Four examples of the items used have already been presented in Fig. 1.

The item structure is defined by four variables: number of elements, number of rules, type of rule, and perceptual organization. Each item is formed by either two or four elements, randomly selected from a pool of 59 geometric figures; either two or four rules govern the relationships between these elements. This leads to four possible combinations of number of elements and number of rules: two elements involving two rules, two elements involving four rules, four elements involving two rules, and four elements involving four rules. Except for the two elements four rules combination, in which two transformations were applied to each element, it was possible to apply a single rule to each element. Four types of rules were employed: simple (SI), spatial (SP), complex (CX), and conceptual (CO). The first column of Fig. 2 shows each of these types, called levels, while the final column shows examples of each level. Considering the Carpenter et al. (1990) and Jacobs and Vandeventer (1972) taxonomies, more than one type of rule was involved for each level. The first level includes five types of quantitative pairwise progressions, that is, increase or decrease of an attribute between adjacent elements: size, shading, number series, shape, and added element. The second level also includes three types of quantitative pairwise progressions, but these involve spatial transformations such as movement on a plane, flipping over, and reversal. Level 3a includes the four more complex transformations: figure addition and subtraction, distribution of three values, distribution of two values, and attribute addition (in which an element is composed by combination of two attributes from the other two elements). Level 3b is a subset of Level 3a, involving only distribution of three values of an attribute. The attributes involved were shading, inclination, color, size, outline, and shape. 
Since such transformations are suitable for automatic item generation, specific items involving these rules were created to investigate their psychometric potential for future studies concerning computerized online item generation. Combining these three variables, that is, number of elements (two or four), number of rules (two or four), and type of rule or levels (simple, spatial, complex, or conceptual), led to the creation of a basic 16-item set. Since more than one possible option was available for each rule type, specific rules were randomly selected. The original 16-item set was further transformed to create two levels of perceptual organization: harmonic and nonharmonic. The two items in the first column of Fig. 1 are examples of harmonic items (22SIH, 42COH), and the two items in the second column are their corresponding nonharmonic transformed versions (22SIN, 42CON). The transformation employed, exemplified for item 22SIH (Fig. 3), was similar to that used in previous studies, in which elements are arranged according to gestalt principles of perceptual grouping so that specific perceptual correspondences among elements are produced (Primi, 1995; Primi & Rosado, 1995). For the nonharmonic items, irrelevant perceptual correspondences were created by manipulating the principles of similarity and continuation (Palmer, 1992; Rock & Palmer, 1990). Manipulating attributes of noncorresponding elements such as color and shape leads to a perceptual tendency to group them according to similarity. These groupings, based on the perceptual process, do not conform to any meaningful rule in the problem (e.g., the group composed by all black elements in item 22SIN of Fig. 1) and constitute misleading cues. Simultaneously, altering the relative positions of corresponding elements across a row made it possible to increase the complexity involved in forming relevant

groups due to the interruption of natural perceptual continuity, which would have facilitated their grouping. This particular transformation may increase the demands on the visual component of the working memory, because the formation of relevant groups of elements involves the storage of complex, nonsystematic visual patterns of element positions in visual space. For the harmonic items, these same principles were used to create a perceptual tendency to group elements according to the appropriate conceptual rule, that is, corresponding element attributes such as shape and color were the same and different from other noncorresponding elements; and, corresponding elements were always aligned in space forming groups by their good continuity (cf. items 42COH with 42CON in Fig. 1). The seven distractors were created in such a way as to have: (a) two alternatives with only one incorrect transformation, (b) two alternatives with two incorrect transformations, (c) two alternatives with more than two incorrect transformations, and (d) one alternative that was a copy of the term presented in cell H. The distractors and the correct alternative were randomly assigned to the eight positions I1 to I8. The 32 items in each form were arranged according to difficulty. It was assumed that (a) a nonharmonic version would be more complex than a harmonic one, (b) rule type would decrease in complexity from CO–CX to SP to SI, (c) items with more information (involving more elements and rules) would be more complex than items with less information, and (d) the effect of perceptual organization would be relatively greater than the effect of the type of rule. The basic 16-item sets were presented once in the harmonic version and a second time in a nonharmonic version. 
If items were always presented in this order, however, subjects would be better prepared to answer the nonharmonic items and learning could interfere with the effect exerted by perceptual organization, thus influencing the results. The item order was thus controlled, with half of the subjects receiving the items in the harmonic–nonharmonic order and the other half in the nonharmonic–harmonic order, with the subjects being randomly selected for each of the two groups. Each item was drawn using Corel Draw 4.0 and exported to a 640 × 480 pixel bitmap file format. Computer software was developed using Microsoft Visual Basic 4.0 Professional (Microsoft, 1995a) to manage item presentation and store the responses in Access database files (Microsoft, 1995b). In the application sessions, the subjects answered simultaneously in computer laboratories with 20 work stations (PC-486). Each session was conducted by a trained experimenter who was a psychology undergraduate student participating as part of the work required to obtain course credits. The duration of each testing session was about 60 min. In a typical session, students first received instructions by means of a hypertext presentation, which included the following: (a) a brief definition of inductive reasoning, (b) an example of the task format, (c) explanations about how to interact with the computer, and (d) general guidance concerning the number of items and the existence of only one correct alternative. After these instructions, the subjects completed three practice items before starting the experimental test. Initially, only the first eight cells (ABC, DEF, GH) were presented. To see the response alternatives, the subjects had to click a button called “Present the Alternatives.” To choose an

alternative, they had to click on it. Clicking on an alternative moved it to the blank space (the ninth cell I). The subjects could, however, change their minds by clicking on the alternative now occupying this space, which returned the alternative to its original location. This “unselecting” of a response was called a correction response (CR). Subjects could also eliminate alternatives by clicking on them with the right-hand mouse button, making the alternative disappear, although the alternative eliminated could be recovered by clicking on the space again with the right-hand mouse button. This reaction was called an elimination response (ER). After terminating the item, subjects had to click on a button called “next item” to proceed to the next item, but this button was only enabled after the subject had chosen some alternative for the previous item, which effectively prevented the skipping of any item. The right-hand corner of the screen contained an indication of the number of items answered and those remaining. Basically all responses were recorded, including the reaction time elapsed from the time of item presentation to the time of the response, precise to the level of milliseconds. These measurements were obtained by using the timeGetTime function included in the library file MMSYSTEM.DLL.

3.3. Design

The main goal of the present experiment was to investigate the importance of each structural variable on complexity. Each item is a cell containing a specific combination of four independent variables: (a) number of elements (two levels), (b) number of rules (two levels), (c) type of rule (four levels), and (d) type of organization (two levels). The dependent variables for each of the combinations were reaction time (RT), accuracy (P), and elimination response (ER). Since each subject answered under all conditions, the design is one of repeated measures. 
Moreover, the application of two parallel forms to the independent groups functioned as a cross validation study.
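
The orthogonal item structure described above can be enumerated with a short sketch. The item codes follow the pattern of the examples given in the text (22SIH, 42CON); reading the first digit as the number of elements and the second as the number of rules is an assumption, as are the variable and function names.

```python
from itertools import product

N_ELEMENTS = (2, 4)
N_RULES = (2, 4)
RULE_TYPES = ("SI", "SP", "CX", "CO")  # simple, spatial, complex, conceptual
ORGANIZATION = ("H", "N")              # harmonic, nonharmonic


def item_code(n_elements, n_rules, rule_type, organization):
    """Build a code such as '22SIH' (digit order is an assumption)."""
    return f"{n_elements}{n_rules}{rule_type}{organization}"


# Basic set: every combination of number of elements, number of rules, rule type.
basic_set = list(product(N_ELEMENTS, N_RULES, RULE_TYPES))

# Full form: each basic item in a harmonic and a nonharmonic version.
form = [item_code(*cell, org) for cell in basic_set for org in ORGANIZATION]

print(len(basic_set), "basic items;", len(form), "items per form")
```

Because every level of each variable is crossed with every level of the others, each variable's levels occur equally often, which is what allows the effects to be separated in the analyses that follow.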

4. Results

4.1. Psychometric properties

Table 1 shows the descriptive statistics of total and item scores for Forms A and B separately. It also shows the distribution of item difficulty and point biserial correlations between item scores and total scores. As expected, item difficulty varied from easy (.89) to complex (.05), with the total set of 64 items forming a representative sample of a broad spectrum of item complexity. The correlations between item and total scores varied from moderate to high. Out of the total of 64 items, 55 presented item–total point biserial correlations higher than .30. These indices indicate that the items formed a coherent group along a homogeneous scale. Internal consistency coefficients were high (.84 and .85); the two forms, which were structurally similar, showed equivalent psychometric properties, although Form B was slightly easier than Form A.
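
The classical indices reported in this section (difficulty P, point biserial correlation, and K-R 20) can be sketched as follows. This is a minimal illustration on a made-up response matrix, not the study data; it assumes the population variance of total scores in the K-R 20 formula and an item–total correlation in which the total includes the item itself. All names are illustrative.

```python
from statistics import mean, pvariance


def item_difficulty(responses):
    """Proportion correct (P) per item; rows are examinees, columns are items."""
    n_items = len(responses[0])
    return [mean(row[i] for row in responses) for i in range(n_items)]


def kr20(responses):
    """Kuder-Richardson 20, using the population variance of total scores."""
    k = len(responses[0])
    p = item_difficulty(responses)
    totals = [sum(row) for row in responses]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    return (k / (k - 1)) * (1 - sum_pq / pvariance(totals))


def point_biserial(responses, item):
    """Pearson correlation between a 0/1 item and the total score.

    Undefined (division by zero) if the item or the totals have no variance.
    """
    x = [row[item] for row in responses]
    y = [sum(row) for row in responses]
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


# Made-up data: 5 examinees by 4 items (1 = correct, 0 = incorrect).
toy_responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

print(item_difficulty(toy_responses))
print(round(kr20(toy_responses), 3))
print(round(point_biserial(toy_responses, 0), 3))
```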

Table 1
Descriptive statistics for total and item scores in test Forms A and B

Descriptive statistics for total scores
             Form A    Form B
Mean         17.45     18.94
S.D.          6.04      5.90
Min              2         1
Max             30        30
K-R 20         .84       .85

Descriptive statistics for items
             Form A            Form B
             P       rpb       P       rpb
Mean         .54     .41       .59     .41
S.D.         .19     .10       .21     .10
Min          .05     .20       .06     .11
Max          .89     .61       .85     .62

Frequency distributions (a), given as frequency (%)
Intervals: < .10, .10–.20, .21–.30, .31–.40, .41–.50, .51–.60, .61–.70, .71–.80, >.81
1 (3.1), 4 (12.5), 12 (37.5), 11 (34.4), 4 (12.5), 1 (3.1)
1 (3.1), 1 (3.1), 2 (6.3), 1 (3.1), 4 (12.5), 4 (12.3), 8 (25.0), 6 (18.8), 5 (15.6)
1 (3.1), 5 (15.6), 6 (18.8), 7 (21.9), 5 (15.6), 3 (9.4), 4 (12.5)
1 (3.1), 4 (12.5), 6 (18.8), 15 (46.9), 5 (15.6), 1 (3.1)

(a) Difficulty index for P and point biserial correlations between item scores and total scores for rpb.

4.2. Item complexity analysis

The complexity analysis focused on the Rasch difficulty index as the dependent variable; this was estimated by the unconditional maximum-likelihood method proposed by Wright and Stone (1979) and performed using RASCAL software (Assessment Systems Corporation, 1996). About 85.9% of the items fitted the one-parameter model. Because of the variability in the item–total correlations, the two-parameter model could have been used to provide an even better fit. However, since the correlation between the difficulty indices obtained for the two models was very high (.98), the simpler one-parameter model was used. Difficulty indices were analyzed using the general linear model approach. Independent variable effects were represented by a set of orthogonal linear contrasts (Cohen, 1968; Howell, 1997), which represented the effects of the structural variables (number of elements, number of rules, type of rule, and perceptual organization) and the test form. The goal of this analysis was to predict the item difficulty on the basis of these independent variables using a stepwise multiple regression technique. All structural variables were expected to make significant contributions, whereas the test form was not expected to make a significant contribution. Table 2 presents the results of the stepwise multiple regression displaying the nonstandardized regression coefficients (B), the standard error (S.E.), the standardized regression

coefficient (β), and the squared multiple correlation (R²). In the first analysis, including all the 64 items, only perceptual organization contributed significantly to the prediction of item difficulty. The proportion of variance in item difficulty accounted for by the perceptual manipulations intended to create irrelevant correspondences was .408, which was statistically significant, F(1,62) = 43.58, P < .0001. Although perceptual organization had a large effect, more than half of the variance in item difficulty remained unexplained. A detailed look at the item matrix data revealed two possible sources of interaction: (a) interaction between number of elements and rules with type of rules and (b) number of transformations with number of elements. The effect of the number of elements and number of rules seemed to be stronger for items involving simple, complex, or conceptual rules than for those involving spatial ones. Also, items in which more than one transformation occurred in a single element seemed to be more difficult. Based on these observations, a second analysis was performed in which the spatial items were excluded and the contrasts for number of elements and number of rules were replaced by a new contrast representing the sum of these effects plus an effect representing the number of rules applied to a single element. This new contrast was termed “amount of information.” The second analysis (Table 2) shows that, in Step 1, with perceptual organization included in the equation, R² = .534, Finc(1,45) = 51.60, P < .0001. Then, Step 2, with amount of information included in the equation, yielded R² = .642, Finc(1,44) = 13.29, P < .001. This second analysis, which basically excluded the spatial items, revealed a greater effect for perceptual organization, as well as a significant increase of approximately 11% in the predictability of item complexity attributable to the amount of information. 
None of the other variables, including the test form, made a significant contribution.

4.3. Reaction time analysis

The RTs of the 313 subjects answering the 32 items (10,016 observations) varied from 4.62 to 1031.21 s (mean = 79.48, S.D. = 64.86). The distribution of these RTs was positively skewed, so prior to the analysis, the RT was transformed by the natural logarithmic function.

Table 2
Summary of results of regression analysis predicting item difficulty from structural variables

Structural variables                               B      S.E.   β
First analysis (a)
  Perceptual organization                          .675   .103   .639***
Second analysis (b)
  Step 1: R² = .534
    Perceptual organization                        .794   .111   .731***
  Step 2: R² = .642, ΔR² = .108***
    Perceptual organization                        .785   .098   .723***
    Number of elements + number of rules
      + number of rules on same element            .216   .059   .329***

(a) 64 items, R² = .408.
(b) 48 items (items with spatial rules excluded).
*** P < .001.
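
The incremental F statistics reported for the second analysis in Table 2 can be approximately recovered from the R² values with the standard formula for a change in R². This is a sketch only: the error degrees of freedom (45 and 44) are taken as reported in the text, the function name is illustrative, and small discrepancies are expected because the published R² values are rounded.

```python
def f_increment(r2_full, r2_reduced, n_added, df_error):
    """F test for the increase in R^2 when n_added predictors enter the model."""
    return ((r2_full - r2_reduced) / n_added) / ((1.0 - r2_full) / df_error)


# Step 1: perceptual organization enters (reported Finc(1,45) = 51.60).
step1 = f_increment(0.534, 0.0, 1, 45)

# Step 2: amount of information enters (reported Finc(1,44) = 13.29).
step2 = f_increment(0.642, 0.534, 1, 44)

print(f"Step 1: F = {step1:.2f}")
print(f"Step 2: F = {step2:.2f}")
```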

A 3  2  2 ANOVA was performed, with the logarithm of the RT as the dependent variable, and the amount of information (three levels), the type of rule (four levels), and the perceptual organization (two levels), as independent variables. The levels of the amount of information included: (a) Level 1, for items with two rules and two elements, (b) Level 2, for items with four elements and two rules, and (c) Level 3, for items with four rules (either two or four elements). Table 3 shows the results of the ANOVA. All the main effects and interactions were statistically significant, but their magnitude varied considerably. The RT depended primarily on the individual subject, that is, on a general between-subject facet representing individual differences in the mean RT for answering all 32 items. The proportion of the total RT variance explained by the between-subject source was .367. The next most important effect was due to perceptual organization. Harmonic items required an average of 59.71 s, whereas nonharmonic items required an average of 91.65 s. The proportion of the variance in total reaction time accounted for by this variable was .169. Another important effect was due to the interaction between type of rule and perceptual organization. Harmonic items with simple transformations required an average of 47.12 s, while conceptual transformations required 52.49 s, spatial transformations required 68.33 s, and complex ones an average of 70.90 s. But for nonharmonic items, these differences were not as pronounced as for their harmonic counterparts. The proportion of the total RT variance accounted for by the interaction between perceptual organization with type of rule was .035. Fig. 4 shows the mean RT for each combination of independent variables. It seems that increasing the amount of informaTable 3 Results from 3  2  2 (Amount of information  Type of rule  Type of perceptual organization) repeated measures ANOVA for RT Source of variance

S.S.

df a

Between subjects Persons

1138.01

312.00

Within subjects Type of perceptual organization 330.19 (280.93) Type of rule 90.42 (214.64) Amount of information 22.63 (85.89) Type of rule  Type 69.09 (202.98) of perceptual organization Amount of information  Type 23.18 (75.99) of perceptual organization Type of rule  Amount of information 14.81 (278.78) Type of perceptual organization  Type 6.29 (262.48) of rule  Amount of information Total (within subjects) 1958.29

1 (312) 2.73 (852.73) 1.98 (616.70) 2.66 (831.33)

M.S. 3.65

F

h2 .367

330.19 (0.90) 366.71*** .169 33.09 (0.25) 131.44*** .046 11.45 (0.14) 82.22*** .011 25.93 (0.24) 106.19*** .035

1.91 (597.35)

12.10 (0.13)

95.16*** .012

5.06 (1578.3) 5.35 (1669.35)

2.93 (0.18) 1.18 (0.16)

16.57*** .007 7.47*** .003

Values within parentheses correspond to the sum of squares (S.S.), degrees of freedom (df), and the mean squares (S.Q.) of the error source, respectively. a The degrees of freedom were corrected using the Greenhouse – Geisser formula to compensate for compound symmetry violation (Howell, 1997). *** P < .001.

62

R. Primi / Intelligence 30 (2001) 41–70

Fig. 4. Mean RT for each item classified according to structural definition, amount of information, type of rule, and perceptual organization.

tion and rule complexity exerts a more systematic effect for harmonic items than for nonharmonic ones.

4.4. The analysis of elimination

The computerized application of the test made it possible to identify all the eliminations made while the subjects were taking the experimental test. Table 4 shows the descriptive statistics for the number of eliminations for each of the 32 items answered by each of the 313 subjects. There was great variability in the use of this resource. Some students did not use it at all, while others used it frequently. Those who did use the resource also revealed great variability. The internal consistency coefficient was high (.97).

Table 4
Descriptive statistics related to elimination of responses, calculated for each item (by item) and for each student (by student; average of responses to all 32 items)

| | By item | By student |
|---|---|---|
| Number of observations | | |
| Total | 10,016 | 313 |
| Total number of items or students where at least one elimination occurred | 2318 (23.1%) | 132 (42.2%) |
| Descriptive statistics | | |
| Mean | 4.87 | 82.47 |
| S.D. | 1.60 | 55.23 |
| Min | 1 | 1 |
| Max | 7 | 224 |

The descriptive statistics were calculated for all observations where at least one elimination occurred.

Fig. 5. Scatterplot of subjects classified according to ability and number of eliminations for 32 items.

The correlation between the number of eliminations and ability was r = .51, N = 313, P < .001, indicating that the subjects who used this resource more frequently tended to have higher scores. Fig. 5 shows a scatterplot classifying each student in a two-dimensional space defined by the coordinates of ability (WIT scale, Wright & Stone, 1979) and the total number of eliminations made for the 32 items. The subjects located to the right (i.e., those making greater use of the strategy of elimination) tended to have higher scores, whereas subjects who used eliminations less frequently showed greater variation in total score. This suggests that high ability was not necessarily associated with the elimination of responses, but that individuals who used this resource frequently tended to have higher scores, possibly because this strategy served to reduce the information overload.
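The significance of the correlation reported above can be checked with a short calculation. The sketch below assumes the standard t test for a Pearson correlation (the text does not say which test was used); the helper name `t_from_r` is hypothetical.

```python
import math


def t_from_r(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


r, n = 0.51, 313            # values reported in the text
t_stat = t_from_r(r, n)
shared_variance = r * r     # proportion of variance shared by the two measures

print(f"t({n - 2}) = {t_stat:.2f}, r^2 = {shared_variance:.3f}")
# prints: t(311) = 10.46, r^2 = 0.260
```

A t value of about 10.5 on 311 degrees of freedom lies far beyond the two-tailed critical value for P < .001 (roughly 3.3), which is consistent with the reported significance level. The squared correlation indicates that the two measures share about 26% of their variance, which fits the observation that frequent elimination is associated with, but not necessary for, high ability.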

5. Discussion

Geometric inductive matrix items such as those found in Raven’s Advanced Progressive Matrices constitute markers in the assessment of Gf. Cognitive psychological studies have pointed out that item complexity is associated with (a) an increase in the number of figures, (b) an increase in the number of rules relating these figures, (c) the complexity of these rules, and (d) the perceptual complexity of the stimulus. One limitation of these studies, however, is that complex items present all of these characteristics simultaneously. Thus, no information regarding their relative importance is furnished, nor is it clear whether all these factors actually have a significant effect on complexity. Since each feature may relate to a different aspect of the information processing of Gf, the variables were combined orthogonally in the present study so that their effects could be investigated more precisely.

Classical psychometric properties indicate that the two experimental tests developed here constituted good measures of Gf. Moreover, they included problems from a wide range of the complexity continuum. The results obtained here support the systematic use of cognitive psychology in test development, as proposed by Embretson (1994, 1998), since this produced a sound psychometric measure while simultaneously providing an enhanced understanding of the cognitive processes associated with item performance.

The major contribution of this study is the identification of the most important sources of difficulty in test items that contribute to the construct representation of Gf. Two variables contributed significantly to an increase in item complexity: perceptual organization and the amount of information, a variable created by combining number of elements, number of rules, and number of rules applied to a given element. The study suggests that the most important effects are due to perceptual organization, which explains 53.4% of the variance in item complexity. Contrary to Embretson (1995, 1998), the results obtained here do not emphasize a predominant role for the goal management component of the central executive, but rather show that abstraction (associated with selective encoding) is a major aspect affecting item complexity.
As was discussed earlier, the variable used by Embretson (1998) to predict item complexity confounded type of rule (associated with abstraction capacity) and number of rules (associated with goal management). She interpreted the effect of this composite variable as emphasizing the overload of information associated with the number of rules. However, because this variable was correlated with type of rule, another interpretation of her results is possible, one that emphasizes the need for a more abstract inference due to rule complexity. Based on the results of Carpenter et al. (1990), Embretson (1998) postulated that rules would be inferred serially, from simple to complex. This conception led her to suggest that rule type would have the same cognitive impact as the number of rules, that is, efficacious goal management would be required to cope with working memory overload. In the present study, however, type of rule seemed to be related to the need for abstract processing. Although type of rule did not have a significant effect, perceptual organization, which was postulated to produce similar cognitive demands, was found to have the strongest effect on complexity. Thus, the results of the present experiment strongly support the importance of the process of abstraction in Gf.

One limitation of the study, however, concerns the variable “type of rule.” If abstraction is the most important ability involved in item solving, and considering that perceptual organization and type of rule are item features that operationalize item demands for this ability, why did only perceptual organization have a significant effect? One possible explanation is that great variability might have existed among the rules comprising each general type in the taxonomy developed for the present study. This interpretation is supported by the fact that the average reaction time for items involving conceptual rules (Level 3b, in Fig. 2) was similar to that observed for simple items. Therefore, this specific rule, which was considered to be as complex as the other rules comprising Level 3a (addition of attributes, distribution of two, addition of elements), turned out to be much simpler than expected. Further studies are needed to distinguish among the individual rules within each type and to combine these individual rules with other variables to clarify their effects.

This limitation, however, does not diminish the importance of the effect of perceptual organization. Perceptual complexity was more homogeneously defined than the levels of type of rule, and this could have been responsible for the large effect of this variable. Due to the factorial combination of the independent variables, the main effect found for perceptual organization can be generalized to items with varying amounts of information and to items with different types of rules, except for those involving spatial pairwise progressions. Thus, the effect of perceptual complexity can theoretically be generalized to a wider universe of matrix items.

Increasing the complexity of perceptual organization complicates the encoding of the attributes of a problem, thus making the creation of a stable mental representation more difficult. Certain types of stimulus organization induce the formation of irrelevant groups of elements or attributes. Solving such items therefore requires more controlled attentional processing, in the form of selective encoding, so that the flow of information to working memory allows a focus on abstract relationships while the concrete attributes that appear simultaneously in the field of perception are ignored. An analytical approach based on the control of attention might help to reduce the overload of information in working memory, since limiting consideration to relevant attributes reduces the load caused by irrelevant information.
At the same time, such an approach might help a subject to consider one attribute at a time, thus preventing overload and confusion in working memory when various attributes must be considered for a given item. The relationship between a systematic approach and ability is shown in the use of the strategy of elimination of alternatives. This strategy may be interpreted as a physical analog of the attention control process, since the elimination of alternatives may be based on the selection of a relevant attribute and the ignoring of irrelevant information in the visual field, thus reducing the amount of information that must be considered. Moreover, the RT analysis presents evidence that more complex items (those that are more perceptually ambiguous) generally overload the processing system, consequently requiring more processing time. This additional time can be associated with the extra time needed for processing the irrelevant information.

The importance of selective encoding associated with the capacity for abstraction also suggests that visual processing and the corresponding visual scratch pad of working memory may be important components of Gf. This interpretation is consistent with factor analytic studies that have shown the broad Gf factor to be associated with the broad visual processing factor (see Carroll, 1993a, 1993b, for a complete review). It is also important for studies trying to explain the rise observed in intelligence test scores, particularly on Gf tests, since one of the hypotheses being considered is that this increase may be associated with increased exposure to visual stimulation in recent years (Flynn, 1998). Moreover, such encoding is also cited in relation to age-related loss in working memory, which is linked to difficulties in the encoding and retention of relevant information, since operational capacities appear to be unchanged in older adults (Salthouse et al., 1990, 1991).

A second source of complexity, with less impact but contributing a significant 10.8% to the explanation of the variance in item complexity, is the amount of information that must be encoded and processed in order to solve a problem. The difficulty arises essentially from pressure on working memory capacity, that is, the difficulty involved in processing several items of information simultaneously. Various studies have stressed the role of working memory in the cognitive interpretation of Gf (Carpenter et al., 1990; Duncan et al., 1996; Embretson, 1995, 1998; Kyllonen & Christal, 1990; Mulholland et al., 1980). The present study supports this interpretation and suggests that Gf is strongly related to a specific aspect of the central executive component of working memory. The most complex tasks of Gf tests require the capacity to control selective encoding in visual processing simultaneously with the management of information in short-term memory to prevent loss of information due to overload. These results are also in agreement with Engle et al. (1999), who showed that the general control process was responsible for the high correlations between working memory and Gf tasks. In their words:

“the critical factor common to measures of working memory capacity and higher level cognitive tasks is the ability to maintain a representation as active in face of interference from automatically activated representations competing for selection for action and in the face of distractions that would otherwise draw attention away from the currently needed representation” (p. 312).

In summary, the present study offers evidence that a very important aspect of Gf is the abstraction capacity associated with the process of selective encoding. It also corroborates past findings that the general control process of goal management, which organizes a hierarchical flow of information to working memory to compensate for natural limitations in dealing simultaneously with numerous bits of information, is another important aspect of Gf. Perhaps the most important contribution of this study is the identification of item features that produce specific demands for each of these capacities, as well as the provision of a method for altering these features operationally so that more carefully controlled tests of Gf can be produced.

Acknowledgments

This paper is based on the author’s doctoral dissertation, submitted to the University of São Paulo (Institute of Psychology) under the supervision of Adail Victorino Castilho. The research was financed by the Brazilian National Research Council (CNPq). The author acknowledges the contributions of Gerardo Andanez Prieto, Ronald K. Hambleton, Leandro S. Almeida, and Linda Gentiy El-Dash for their helpful comments on the draft of the manuscript, as well as Claudinéia Ap. Ferreira de Godoi Veiga, Romilda Simões de Queiroz, Roseli Filizatti, Tristana Cezaretto, Erika S. de Souza Barboza, Cristiane Jardim Girioli, Rosângela Scrich, Kelly Fiorelli Ferro, and José Maurício Haas Bueno for their valuable assistance in the collection of the data. The author is especially grateful to Robert J. Sternberg, who contributed invaluable guidance during the author’s stay at Yale University during the fall semester of 1997.

References

Assessment Systems. (1996). User’s manual for the MicroCat Testing System. St. Paul: ASC.

Baddeley, A. D., & Hitch, G. J. (1994). Developments in the concept of working memory. Neuropsychology, 8 (4), 485–493.

Bethell-Fox, C. E., Lohman, D. F., & Snow, R. E. (1984). Adaptive reasoning: componential and eye movement analysis of geometric analogy performance. Intelligence, 8, 205–238.

Carpenter, P. A., Just, M. A., & Shell, P. (1990). What one intelligence test measures: a theoretical account of the processing in the Raven Progressive Matrices test. Psychological Review, 97 (3), 404–431.

Carroll, J. B. (1993a). Human cognitive abilities: a survey of factor-analytic studies. New York: Cambridge Univ. Press.

Carroll, J. B. (1993b). Test theory and the behavioral scaling of test performance. In: N. Frederiksen, R. J. Mislevy, & I. I. Bejar (Eds.), Test theory for a new generation of tests (pp. 297–322). Hillsdale, NJ: Lawrence Erlbaum Associates.

Carroll, J. B. (1997). The three-stratum theory of cognitive abilities. In: D. P. Flanagan, J. L. Genshaft, & P. L. Harrison (Eds.), Contemporary intellectual assessment: theories, tests, and issues (pp. 122–130). New York: Guilford Press.

Cattell, R. B. (1941). Some theoretical issues in adult intelligence testing. Psychological Bulletin, 31, 161–179.

Cohen, J. (1968). Multiple regression as a general data-analytic system. Psychological Bulletin, 70 (6), 426–443.

Crinella, F. M., & Yu, J. (1999). Brain mechanisms and intelligence. Psychometric g and executive function. Intelligence, 27 (4), 299–327.

Duncan, J., Emslie, H., & Williams, P. (1996). Intelligence and the frontal lobe: the organization of goal-directed behavior. Cognitive Psychology, 30, 257–303.

Embretson, S. (1983). Construct validity: construct representation versus nomothetic span. Psychological Bulletin, 93 (1), 179–197.

Embretson, S. (1985a). Studying intelligence with test theory models. In: D. K. Detterman (Ed.), Current topics in human intelligence (vol. 1, pp. 3–17). Norwood, NJ: Ablex.

Embretson, S. (Ed.) (1985b). Test design: developments in psychology and psychometrics. Orlando: Academic Press.

Embretson, S. (1994). Applications of cognitive design systems to test development. In: C. R. Reynolds (Ed.), Cognitive assessment: a multidisciplinary perspective (pp. 107–135). New York: Plenum.

Embretson, S. (1995). The role of working memory capacity and general control process in intelligence. Intelligence, 20, 169–189.

Embretson, S. (1996). The new rules of measurement. Psychological Assessment, 8 (4), 341–349.

Embretson, S. (1998). A cognitive design system approach to generating valid tests: application to abstract reasoning. Psychological Methods, 3 (3), 380–396.

Engle, R. W., Tuholski, S. W., Laughlin, J. E., & Conway, A. R. A. (1999). Working memory, short-term memory, and general fluid intelligence: a latent-variable approach. Journal of Experimental Psychology, General, 128 (3), 309–331.

Evans, T. G. (1968). Program for the solution of a class of geometric-analogy intelligent-test questions. In: M. Minsky (Ed.), Semantic information processing (pp. 271–353). Cambridge, MA: MIT Press.

Flynn, J. R. (1985). Massive IQ gains in 14 nations: what IQ tests really measure. Psychological Bulletin, 101 (2), 171–191.

Flynn, J. R. (1998). IQ gains over time: toward finding the causes. In: U. Neisser (Ed.), The rising curve (pp. 25–66). Washington, DC: American Psychological Association.

Frederiksen, N., Mislevy, R. J., & Bejar, I. I. (1993). Test theory for a new generation of tests. Hillsdale, NJ: Lawrence Erlbaum Associates.

Goldman, S. R., & Pellegrino, J. W. (1984). Deductions about induction: analyses of developmental and individual differences. In: R. J. Sternberg (Ed.), Advances in the psychology of human intelligence (vol. 2, pp. 149–197). Hillsdale, NJ: Lawrence Erlbaum Associates.

Gonzales Labra, M. J. (1990). El nivel de abstracción en las analogias geométricas (The level of abstraction of geometric analogies). Revista de Psicologia General y Aplicada, 43 (1), 23–32.

Gonzales Labra, M. J., & Ballesteros Jimenez, S. (1993). Análisis componencial de las analogías geométricas (Componential analysis of geometric analogies). Revista de Psicologia General y Aplicada, 46 (2), 139–147.

Green, K. E., & Kluever, R. C. (1992). Components of item difficulty of Raven’s matrices. Journal of General Psychology, 119 (2), 189–199.

Horn, J. L. (1986). Theory of fluid and crystallized intelligence. In: R. J. Sternberg (Ed.), Advances in the psychology of human intelligence (vol. 3, pp. 443–451). Hillsdale, NJ: Lawrence Erlbaum Associates.

Horn, J. L. (1991). Measurement of intellectual capabilities: a review of theory. In: K. S. McGrew, J. K. Werder, & R. W. Woodcock (Eds.), WJ-R technical manual (pp. 197–245). Allen, TX: DLM.

Horn, J. L., & Cattell, R. B. (1966). Refinement and test of the theory of fluid and crystallized general intelligences. Journal of Educational Psychology, 57 (5), 253–270.

Horn, J. L., & Noll, J. (1997). Human cognitive capabilities: Gf–Gc theory. In: D. P. Flanagan, J. L. Genshaft, & P. L. Harrison (Eds.), Contemporary intellectual assessment: theories, tests, and issues (pp. 53–91). New York: Guilford Press.

Hornke, L. F., & Habon, M. W. (1986). Rule-based item bank construction and evaluation within the linear logistic framework. Applied Psychological Measurement, 10 (4), 369–380.

Howell, D. C. (1997). Statistical methods for psychology. Boston: Duxbury Press.

Hunt, E. (1974). Quote the Raven? Nevermore! In: L. W. Gregg (Ed.), Knowledge and cognition (pp. 129–158). Potomac, MD: Lawrence Erlbaum Associates.

Hunt, E. (1996). Intelligence for the 21st century. Paper presented at the European Society for Cognitive Psychology and Spanish Society for the Study of Individual Differences, Madrid, Spain.

Hunt, E. (1999). Intelligence and human resources: past, present and future. In: P. L. Ackerman, P. C. Kyllonen, & R. D. Roberts (Eds.), Learning and individual differences: process, trait and content determinants (pp. 3–28). Washington, DC: American Psychological Association.

Jacobs, P. I., & Vandeventer, M. (1972). Evaluating the teaching of intelligence. Educational and Psychological Measurement, 32, 235–248.

Jurden, F. H. (1995). Individual differences in working memory and complex cognition. Journal of Educational Psychology, 87 (1), 93–102.

Just, M. A., & Carpenter, P. A. (1992). A capacity theory of comprehension: individual differences in working memory. Psychological Review, 99 (1), 122–149.

Klauer, K. J. (1990). A process theory of inductive reasoning tested by teaching of domain-specific thinking strategies. European Journal of Psychology of Education, 5 (2), 191–206.

Kyllonen, P. C. (1994). CAM: a theoretical framework for cognitive abilities measurement. In: D. K. Detterman (Ed.), Current topics in human intelligence: theories of intelligence (vol. 4, pp. 307–360). Norwood, NJ: Ablex.

Kyllonen, P. C., & Christal, R. (1990). Reasoning ability is (little more than) working memory capacity?! Intelligence, 14, 389–434.

Mack, A., Tang, B., Tuma, S., & Rock, I. (1992). Perceptual organization and attention. Cognitive Psychology, 24, 475–501.

Maris, E. (1995). Psychometric latent response models. Psychometrika, 60 (4), 523–547.

Marshalek, B., Lohman, D. F., & Snow, R. E. (1983). The complexity continuum in the radex and hierarchical models of intelligence. Intelligence, 7, 107–127.

McGrew, K. S., Werder, J. K., & Woodcock, R. W. (1991). WJ-R technical manual. Allen, TX: DLM.

Microsoft. (1995a). Microsoft Visual Basic version 4.0: programmer’s guide. Redmond, WA.

Microsoft. (1995b). Guide to data access objects. Redmond, WA.

Mulholland, T. M., Pellegrino, J. W., & Glaser, R. (1980). Components of geometric analogy solution. Cognitive Psychology, 12, 252–284.

Neisser, U. (1998). Introduction: rising test scores and what they mean. In: U. Neisser (Ed.), The rising curve (pp. 3–22). Washington, DC: American Psychological Association.

Palmer, S. E. (1992). Common region: a new principle of perceptual grouping. Cognitive Psychology, 24, 346–447.

Prabhakaran, V., Smith, J. A. L., Desmond, J. E., Glover, G. H., & Gabrieli, J. D. E. (1997). Neural substrates of fluid reasoning: an fMRI study of neocortical activation during performance of the Raven’s Progressive Matrices test. Cognitive Psychology, 33, 43–63.

Primi, R. (1995). Inteligência, processamento de informação e teoria da gestalt: um estudo experimental (Intelligence, information processing and gestalt theory: an experimental study). Unpublished Master’s thesis, Catholic University of Campinas, Campinas.

Primi, R., & Castilho, A. V. (1996). Processos cognitivos e dificuldade dos itens do teste Raven: um estudo baseado na IRT (Cognitive processes and complexity of Raven test items: a study based on Item Response Theory). In: Encontro de Técnicas do Exame Psicológico: Ensino, Pesquisa e Aplicações, 2. São Paulo. Programa e Resumos, p. 8. São Paulo: IP-USP.

Primi, R., & Rosado, E. M. S. (1995). Os princípios de organização perceptual e a atividade inteligente: um estudo sobre testes de inteligência (Principles of perceptual organization and intelligent mental activity: a study about intelligence tests). Estudos de Psicologia, 11 (2), 3–12.

Primi, R., Rosado, E. M. S., & Almeida, L. S. (1995). Resolução de tarefas de raciocínio analógico: contributos da teoria da gestalt à compreensão dos problemas subjacentes (Resolution of analogy reasoning tasks: gestalt contribution to the comprehension of basic cognitive components). In: L. S. Almeida, & I. S. Ribeiro (Eds.), Avaliação Psicológica: Formas e Contextos (vol. 3, pp. 559–562). Braga: APPORT (Associação dos Psicólogos Portugueses).

Raven, J., Raven, J. C., & Court, J. H. (1998). Manual for Raven’s progressive matrices and vocabulary scales: section 1. general overview. Oxford: Oxford Psychologists Press.

Rock, I., & Palmer, S. (1990). The legacy of Gestalt psychology. Scientific American, 263, 48–61 (December).

Rumelhart, D. E., & Abrahamson, A. A. (1973). A model for analogical reasoning. Cognitive Psychology, 5, 1–28.

Salthouse, T. A. (1994). The aging of working memory. Neuropsychology, 8 (4), 535–543.

Salthouse, T. A., Babcock, R. L., & Shaw, R. J. (1991). Effects of adult age on structural and operational capacities in working memory. Psychology and Aging, 6 (1), 118–127.

Salthouse, T. A., Legg, S., Palmon, R., & Mitchell, D. (1990). Memory factors in age-related differences in simple reasoning. Psychology and Aging, 5 (1), 9–15.

Snow, R. E., Kyllonen, P. C., & Marshalek, B. (1984). The topography of learning and ability correlations. In: R. J. Sternberg (Ed.), Advances in the psychology of human intelligence (vol. 2, pp. 47–103). Hillsdale, NJ: Lawrence Erlbaum Associates.

Sternberg, R. J. (1977). A component process in analogical reasoning. Psychological Review, 84 (4), 353–378.

Sternberg, R. J. (1978). Isolating the components of intelligence. Intelligence, 2, 117–128.

Sternberg, R. J. (1980). Sketch of a componential subtheory of human intelligence. Behavioral and Brain Sciences, 3, 573–613.

Sternberg, R. J. (1984). Toward a triarchic theory of human intelligence. Behavioral and Brain Sciences, 7, 269–315.

Sternberg, R. J. (1986). Toward a unified theory of human reasoning. Intelligence, 10, 281–314.

Sternberg, R. J. (1997). The triarchic theory of intelligence. In: D. P. Flanagan, J. L. Genshaft, & P. L. Harrison (Eds.), Contemporary intellectual assessment: theories, tests, and issues (pp. 92–104). New York: Guilford Press.

Sternberg, R. J., & Gardner, M. K. (1983). Unities in inductive reasoning. Journal of Experimental Psychology, General, 112, 80–116.

Ward, J., & Fitzpatrick, T. F. (1973). Characteristics of matrices items. Perceptual and Motor Skills, 36, 987–993.

Webster’s new collegiate dictionary. (1981). Springfield, MA: Merriam-Webster.

Whitely, S. E. (1980a). Modeling aptitude test validity from cognitive components. Journal of Educational Psychology, 72 (6), 750–769.

Whitely, S. E. (1980b). Multicomponent latent trait models for ability tests. Psychometrika, 45 (4), 479–494.

Whitely, S. E. (1980c). Latent trait models in study of intelligence. Intelligence, 4, 97–132.

Whitely, S. E., & Schneider, L. M. (1981). Information structure for geometric analogies: a test theory approach. Applied Psychological Measurement, 5 (3), 383–397.

Woodcock, R. W. (1990). Theoretical foundations of the WJ-R measures of cognitive ability. Journal of Psychoeducational Assessment, 8, 231–258.

Wright, B. D., & Stone, M. H. (1979). Best test design. Chicago: MESA.