CONTEMPORARY EDUCATIONAL PSYCHOLOGY 13, 382-397 (1988)
Training and Changes in Fluid and Crystallized Intelligence
LAZAR STANKOV AND KUEI CHEN
The University of Sydney
In this study, the same battery of tests was administered three times: at the beginning of the experiment, at the end of training, and 1 year after the end of training. The control group was taught in the traditional way. The experimental group was given extensive training in creative problem solving over a 3-year period. The invariant factor structure model for multimode data (R. P. McDonald, 1984, in Research methods of multimode data analysis, New York: Praeger Scientific) was fitted, with LISREL, to both groups simultaneously. At the last testing session, the experimental group performed better than the control group on a majority of variables. Fluid intellectual abilities were somewhat more affected by training in creative problem solving than were crystallized abilities. © 1988 Academic Press, Inc.
This paper is based on the work of a former Professor of Educational Psychology at the University of Belgrade, the late Dr. Radivoy Kvashchev. Our own roles in the production of this paper are as follows. L. Stankov established the authenticity of the raw data, planned the overall strategy of the analyses, and wrote the final report. K. Chen carried out statistical analyses with COSAN and translated McDonald’s (1984) model into LISREL. We are grateful to the Institute of Psychology, Belgrade, for the permission to reanalyze Kvashchev’s data. Correspondence should be addressed to Lazar Stankov, Department of Psychology, The University of Sydney, Sydney, NSW 2006, Australia.
0361-476X/88 $3.00 Copyright © 1988 by Academic Press, Inc. All rights of reproduction in any form reserved.
The theory of fluid (Gf) and crystallized (Gc) intelligence is perhaps the best known theory of individual differences in higher order human cognitive abilities today. Recent reviews of evidence supportive of this theory are found in the writings of Horn (1985, 1986). Structural aspects of this theory are revealed in hierarchical factor analyses of representative samples of cognitive tests administered to representative samples of subjects. Although under these conditions of research several other equally important broad factors always emerge (e.g., broad visualization and broad auditory function, broad factors associated with short-term and long-term memory, etc.), the theory is best known by its main intellective constituents, Gf and Gc. The Gf/Gc distinction is supported by neurological studies showing divergent effects for brain damage and different ability correlates associated with cerebral lateralization (see, for example, Stankov, 1980, 1983) and by genetic work which shows that there are independent components for heritability of Gf and Gc. It is also supported by the developmental evidence which shows that Gf and Gc have different patterns of change
over the lifespan. Performance on both Gf and Gc increases until the age of about 20 and then Gf abilities show a decline, but Gc abilities remain either at the same level or improve slightly (see Horn, 1985, 1986; Stankov, 1988). The most important feature distinguishing between the Gf and Gc clusters and other broad abilities of this theory is the nature of cognitive processes which are thought to underlie them. Both Gf and Gc involve processes of reasoning, concept formation and attainment, problem solving, eduction of relations and correlates, etc. In short, they both call for processes typical of intelligence tests. Broad abilities of perceptual systems, of memory, and of speediness are usually not seen as proper measures of intelligence. The most important feature distinguishing between fluid and crystallized intelligence is the nature of (not the sheer presence of) the learning that contributes to the formation of the two abilities. Crystallized intelligence reflects largely acculturational learning, lessons organized in the culture to convert intellectual capacities into a form of intelligence deemed useful within a particular society. These learning experiences are accompanied by motivational systems of reward and punishment which help to enhance or exclude some classes of behavior from one’s repertoire of behavior. Gc reflects individual differences in these effects. Fluid intelligence, on the other hand, reflects largely idiosyncratic and casual learning which takes place outside organized educational and acculturational systems of influences. Many events involve incidental learning of various concepts and thinking skills related to understanding time, space, causality, etc. These events take place outside the school system. They provide the basis for Gf. 
Early writings of Cattell (see, for example, his 1971 book) were interpreted as showing that Gf is mostly genetically determined, whereas Gc reflects the “investment” of this genetic potential in the endeavors emphasized by our culture. Because of that, it was assumed that Gf is less amenable to changes based on learning than Gc. Recent work by Horn, however, has led to the characterization of Gf and Gc as in the preceding paragraphs. Given the evidence for the Gf and Gc clusters of abilities, it is pertinent to ask questions regarding the possibility of changing some aspects of intellectual capacities by means available in our educational institutions. Any study that shows differential effects on one cluster of abilities (say Gf) as opposed to the other abilities (say Gc) provides further evidence for the need to distinguish between the two. Although substantive issues regarding training effects are of primary concern in this paper, these issues will be addressed with the aid of some recently developed data-analytic methods. The present paper can also be seen as an attempt to study the feasibility of some new multivariate techniques in experiments involving a parallel groups design. Some univariate analyses of these data have been reported elsewhere (Stankov, 1986).
The effects of training. Studies designed with the specific aim of assessing the impact of educational intervention programs on separate Gf and Gc abilities are almost nonexistent. Studies of training effects on intelligence test performance have been reviewed by Caruso, Taylor, and Detterman (1982). They focused on a global measure of IQ rather than Gf and Gc. The subjects in these studies were either preschoolers, primary school children, mentally retarded children, or adults. Between 40 and 60% of the studies reported negative findings and most of the studies which obtained positive results showed a rather modest improvement of a few IQ points. More encouraging results have been obtained with the mentally retarded, in particular by teaching self-monitoring skills for classes of intelligence test items (see, for example, Belmont, Butterfield, & Ferretti, 1982). By and large, there is little information regarding the longevity of the effects found in these studies, and in many cases, the results cannot be generalized beyond a specific task. At the other end of the age scale, Willis, Blieszner, and Baltes (1981) report some success in training retirement-age people (60 and over) to improve their performance on two Gf abilities (inductive reasoning and cognition of figural relations).
The nature of training in this study. Although various studies reviewed by Caruso et al. (1982) employed many different procedures of training, it would be wrong to conclude that no training whatsoever could be effective. The present study contained several conditions which were not present in most other studies. In particular, the training was long (3 years) and extensive (three to four sessions per week), and the size of the samples of subjects (five school classes in both experimental and control groups) was adequate to show effects.
These conditions are infrequently found in the literature. Most important is the nature of the training exercises themselves. These exercises were designed to teach creative problem solving. Creative problem solving was defined in terms of the prevailing views about the nature and principles of creative thinking and numerous sets of exercises based on these principles were developed. Practice exercises were incorporated into the teaching of school material for various school subjects. Students were requested to write their answers in individual notebooks. Different proposed suggestions were discussed by the whole class and their merits and demerits were evaluated. A short list of five principles involved in some exercises is illustrative:
1. In one series of exercises, students were asked to design an experiment which could provide an answer to a particular problem. For example, after describing the anatomical organization of the brain, particularly its division into the left and right hemispheres, and the presence of a link (corpus callosum) between them, students were asked to design experiments to explore the roles of these two hemispheres in animals, in humans, in the nature of split personality, etc.
2. In another set of exercises, students were asked to think of problems which it may be possible to address with a given set of elements. For example, if you are given the following information: a. You have a miniature TV camera; b. you have a way of activating skin receptors; c. there is a one-to-one relationship between the skin receptors and TV image. Can you think of a device which could be constructed if all these three statements were true? What are the additional assumptions you have to make in order to construct such a device? What are the possible applications of such a device?
3. Students were taught how to overcome habitual approaches to problem solving, how to resist the influence of others on their thinking, how to develop new ways of problem solving.
4. They were taught, with appropriate examples, that it is necessary to explore and list all possible implications and search for hidden meanings in the data at hand. It was emphasized that different features of the objects may be relevant in different situations.
5. Students were taught to solve problems using the means-ends approach.
Stankov (1986) gives further examples of the training exercises. Given the nature of the exercises, if the training is to be effective at all, it can be expected that both fluid and crystallized abilities will be affected. As Horn (e.g., 1975) argued, fluid intelligence would be affected because the training in the experimental group aims to develop what may be called “general cognitive schemas” for approaching problem situations. Moreover, these schemas are not now a part of the well-defined concepts or procedures of accumulated knowledge of the culture. Gc would be affected because the exercises take place within the school environment and they rely upon the school syllabus.
Also, given the basic assumptions of the “investment theory” (Cattell, 1971), the development of fluid intelligence would facilitate the development of crystallized abilities. The relative sizes of these effects, however, cannot be predicted: this would depend on the interaction of treatment and subject factors.
Multivariate context of change. It has become generally known during the past 15 years that, within the multivariate context, various parameters of the presumed behavioral model may change if one tests the same group of subjects with the same battery of tests on different occasions. These changes may take place with respect to the variables’ means, variances, and covariances. Additionally, if it is assumed that some particular factor structure underlies the interrelationship among the variables, various aspects of this structure may be assumed to change. With recent computational developments, these various assumptions may be subjected to statistical tests. The main advantage of multivariate over univariate procedures rests on the possibility of testing, in a single run, all the aspects of interest of the model being fitted to the data at hand. This is impossible to achieve with univariate procedures and, quite often, one has to make assumptions which could be unwarranted. For example, in Stankov’s (1986) study it was concluded that fluid intelligence was affected more than crystallized intelligence. This statement, at the very least, was contingent on the assumption that the factorial composition of the test battery does not change over the occasions of testing and that the same factor structure exists in both experimental and control groups. Without multivariate evidence of the kind we employ in the present paper, one has to consider Stankov’s (1986) conclusion as a conjecture rather than a statement of fact.
Since the present paper is not intended to cover methodological aspects of the measurement of change in great detail, we shall not discuss these issues here. Rather, we shall build upon a recently developed model by McDonald (1984), and test whether various combinations of parameters for this model provide a satisfactory fit to our data. The model assumes an invariant factor structure, i.e., that the same factors exist on every occasion of testing. It also assumes that each variable has a baseline mean which allows for the arbitrariness of the origin of the measures. Furthermore, the model assumes correlations among the “specific parts” of a variable. McDonald’s (1984) model was developed for the case where the same sample of subjects is tested repeatedly with the same set of instruments, not for the case where two groups (i.e., experimental and control groups as is the case here) exist. It assumes that the major change takes place among the factor scores.
In particular, it provides estimates of factor score means on different occasions and allows for the estimates of factor score intercovariances on each occasion. In this study, the COSAN (McDonald, 1978, 1980) program is used in order to test whether the model outlined above can be fitted within the experimental and control groups separately. After that, the model is translated into LISREL (Jöreskog & Sörbom, 1981) and fitted simultaneously within both groups. This latter procedure allows us to constrain the parameters in one group to be equal to the corresponding parameters of the other group so that comparisons can be made with respect to parameters that are left free. This paper illustrates the method. To summarize, if there are to be any effects, both fluid and crystallized intelligence will be influenced by the training in creative problem solving. The main emphasis will be on the changes in factor scores and, in particular, changes in factor score means. Our aim is to discover
whether performance on Gf and Gc improves at all. The procedure will also allow us to gain some idea of the relative sizes of these effects, i.e., whether it is Gf or Gc that is influenced more by training.
METHOD
Subjects
Participants in this study were all members of a generation that enrolled as freshmen in two small town high schools in Yugoslavia. They were 15 years of age at the commencement of their high school training. The main difference between the schools is in terms of the teaching staff. Every year these two particular high schools start five new first-year classes of approximately 30 students each. Classes themselves are formed on the basis of primary school marks with the objective of having about the same spread of ability in every class. From a total of 10 classes, a random choice of 5 classes formed the Experimental group for this study. The remaining 5 were declared the Control group. At the beginning, the Experimental group had 171 subjects and the Control group 156. Since the experiment lasted 4 school years, some attrition of the subject samples took place. As a result, a complete record is available for 149 members of the Experimental group and 147 members of the Control group. All analyses reported here are based on those subjects who had complete records available. In both groups, about half of the samples were males. In Yugoslavia, people are classified into three main socioeconomic status groups. The breakdown of the total sample in terms of their parents’ classification in these groups is as follows: (a) professionals and white-collar workers (N = 98); (b) blue-collar workers (N = 121); (c) farmers (N = 108). It is apparent that representation of the three groups in this sample does not differ very much and is typical of the total population of high school students in the country.
Procedure
The design of the experiment is relatively simple, involving two parallel groups who were asked to complete a battery of paper-and-pencil tests four times. For 3 school years, the Experimental group was given exercises in creative problem solving, while the Control group continued their activities in a typical way. Both groups were tested shortly after their enrollment in the first grade and prior to being exposed to any exercises. This will be referred to as “initial testing.” They were given the same test battery at the end of the third grade, i.e., at the end of the experiment. This will be called “final testing.” In order to find out if the effects of training were long-lasting, two retests were given. The retests took place at the beginning and at the end of the fourth year, i.e., during the very last year of high school. The focus of this paper will be on three occasions of testing: initial, final, and 2nd retest. This choice was dictated by the need to keep the size of the matrices within the limits manageable by the available computer facilities. These occasions will make it possible to assess both the immediate effects of the experiment and also the longest lasting effects available in this study.
Test Battery
Altogether 37 different cognitive tests were given to all of the subjects. Again, in order to keep the present analyses within a manageable size, the analyses in this paper are based on a subset of 11 variables.¹ Brief descriptions of the 11 variables used in this study are as follows (numbers in parentheses correspond to numbers in Stankov’s 1986 report):
1. (1) Essential Features test. In each item of this test subjects have to indicate what the essential features of an object are. Example: RIVER (2): salt, fish, gold, water, ship, bridge, channel. The number in parentheses indicates how many features from the list should be marked. Correct answer: water and channel.
2. (11) Word Classification “S” test. Underline the word that does not belong to a series of five words.
3. (12) Proverbs test. Each item in this test consists of four proverbs and twice as many possible interpretations of these proverbs. Subjects have to match these.
4. (13) Verbal Analogies test.
5. (14) Disarranged Sentences test. Carry out the operation indicated by a sentence with words permuted (e.g., underline a particular word, etc.).
6. (23) Figure Classification test. A subtest of Cattell’s Culture Fair test of intelligence.
7. (24) Projections in Water test. A subtest of Cattell’s Culture Fair test of intelligence.
8. (25) Figure Series test. A subtest of Cattell’s Culture Fair test of intelligence.
9-11. (26-28) Three different versions of Cattell’s Matrices test.
The above 11 tests are chosen as clear markers for crystallized intelligence (variables 1 to 5) and fluid intelligence (variables 6 to 11).
Statistical Analyses
Three computer packages were used in the analyses of the data for this study. These are: SPSS, Version 9 (Hull & Nie, 1981), COSAN (McDonald, 1978, 1980), and LISREL V (Jöreskog & Sörbom, 1981).
RESULTS
Preliminary Analyses
Univariate Analyses. Arithmetic means for both groups and for all three occasions of testing are presented in Table 1. In order to give some idea of the general trend present in the data, we display the results of univariate ANOVA analyses for each variable. The results of two-way ANOVAs with Occasions (Initial, Final, and 2nd Retest) and Groups (Control and Experimental) factors are presented in Table 2. The “Occasions” effect is stronger than the other effects for all tests. The experiment lasted 4 years and the same battery of tests was given four times. Strong effects are undoubtedly due to both maturational and practice influences. Tests for the abilities of crystallized intelligence, in general, have larger F values than tests for the abilities of fluid intelligence. During this period of life, and under the usual high school studying conditions, performance on tests of crystallized intelligence shows a more pronounced improvement than performance on fluid intelligence tasks. Significant Occasions by Groups interactions are due to the fact that, for the majority of the variables in this study, the Experimental group is slightly inferior to the Control group at the initial testing and superior at the end of the experiment. This kind of interaction effect, in conjunction with a significant main effect for the Groups factor, indicates that the treatment, i.e., training in creative problem solving, was effective. Both Gf and Gc markers have about the same number of significant F values and the magnitudes of these values are about the same. Notice, however, that if we remove Tests 2 and 6, the interaction effects are more pronounced with the Gf markers. The remaining sections of this paper address these and related questions from the multivariate perspective.

¹ We plan to carry out further analyses similar to those reported here with another subset of variables used in Kvashchev’s experiment. It should be mentioned also that, due to the results of preliminary analyses, we had to discard 2 variables and carry out the main analyses of this paper with 9 rather than 11 variables.

TABLE 1
ARITHMETIC MEANS FOR EXPERIMENTAL AND CONTROL GROUPS ON ALL THREE OCCASIONS OF TESTINGᵃ

                                      Control group                Experimental group
Tests                          Initial   Final   2nd Retest    Initial   Final   2nd Retest
1. Essential Features           30.76    34.70     37.49        28.41    36.24     39.42
2. Word Classification “S”      14.29    17.77     18.59        15.46    18.05     20.82
3. Proverbs                      9.50    13.39     16.22         9.46    14.19     17.81
4. Verbal Analogies             19.51    23.01     24.13        20.06    23.70     26.11
5. Disarranged Sentences        10.42    13.52     16.84        10.11    14.27     17.38
6. Figure Classification         6.62     8.61      9.37         6.40     9.38     10.89
7. Projections in Water          1.22     7.69      8.12         7.00     8.00      8.43
8. Figure Series                 9.95    11.40     12.31         9.42    12.04     13.38
9. Matrices I                    8.25     8.95      9.34         7.85     9.18      9.56
10. Matrices II                  9.23     9.85      9.99         8.24    10.07     10.50
11. Matrices III                 5.46     6.80      7.51         5.46     8.81      8.90

ᵃ Variables 2 and 6 are excluded from the main analyses of this paper.

TABLE 2
UNIVARIATE F VALUES FOR THE OCCASIONS (INITIAL, FINAL, 2ND RETEST) BY GROUPS (CONTROL, EXPERIMENTAL) DESIGNS

                                  Groupsᵇ          Occasions        Gp. by Occas.
Testᵃ                           (df = 1,294)     (df = 2,588)      (df = 2,588)
Crystallized intelligence markers
1. Essential Features               1.05           341.38**          17.23**
2. Word Classification “S”         26.02**         421.58**          15.71**
3. Proverbs                         3.96*          633.75**           7.75**
4. Verbal Analogies                 5.15*          306.85**           1.59
5. Disarranged Sentences            0.84           439.10**           3.09*
Fluid intelligence markers
6. Figure Classification            5.92*          406.81**          11.07**
7. Projections in Water             1.38            83.44**           5.75+
8. Figure Series                    2.55           244.25**          13.64**
9. Matrices I                       0.01           138.96**           8.35**
10. Matrices II                     0.64           146.52**          36.22**
11. Matrices III                   17.45**         164.61**          11.51**

ᵃ Variables 2 and 6 are excluded from the main analyses of this paper.
ᵇ Significant F values are marked by the asterisks (one asterisk indicates significance at the .05 level; two asterisks indicate significance at the .01 level).

Fitting McDonald’s invariant factor structure model with COSAN. In order to obtain initial estimates for use in LISREL analyses, McDonald’s (1984) model was fitted separately within the control and experimental groups. The COSAN package was employed for these preliminary analyses. Our first attempt was to fit one (“general”) factor on every occasion. The obtained χ² values with 500 degrees of freedom were 981.56 for the Control group and 901.83 for the Experimental group. These χ² values indicate that a one-factor model does not fit. Analyses involving two factors, corresponding to the hypothesized structure for the 11 tests, produced χ² values of 842.23 in the Control group and 792.64 in the Experimental group with 494 degrees of freedom in each group. Neither the one-factor nor the two-factor solutions provide a satisfactory fit (the associated probabilities are smaller than .05), but the two-factor solutions are significantly better than the one-factor solutions.
Preliminary LISREL analyses. Analyses to simultaneously fit a two-factor solution in both groups on the initial occasion (using LISREL) produced an acceptable fit. A three-factor solution does not significantly improve this fit. This evidence thus indicates a two-factor model for joint fitting in two samples. The coefficients generated by COSAN were the starting values for the LISREL analyses.
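The comparison of the one-factor and two-factor COSAN solutions can be checked with a few lines of arithmetic. The sketch below is illustrative only: it takes the χ² values and degrees of freedom reported in the text, forms the usual χ² difference for nested models, and compares it with the tabled 5% critical value for 6 degrees of freedom (12.592). The function names are hypothetical, and the large-df normal approximation z = √(2χ²) − √(2df − 1) (Fisher's approximation) is an assumed choice; the paper does not state which transformation was used.

```python
import math

# Chi-square values and degrees of freedom reported for the COSAN fits.
FITS = {
    "Control": {"one_factor": (981.56, 500), "two_factor": (842.23, 494)},
    "Experimental": {"one_factor": (901.83, 500), "two_factor": (792.64, 494)},
}

# Tabled upper 5% critical value of chi-square with 6 degrees of freedom.
CRIT_05_DF6 = 12.592


def chi2_to_z(chi2, df):
    """Fisher's large-df normal approximation (an assumed choice of formula)."""
    return math.sqrt(2.0 * chi2) - math.sqrt(2.0 * df - 1.0)


def nested_difference(restricted, general):
    """Chi-square difference and df difference for two nested models."""
    (chi2_r, df_r), (chi2_g, df_g) = restricted, general
    return chi2_r - chi2_g, df_r - df_g


for group, fits in FITS.items():
    d_chi2, d_df = nested_difference(fits["one_factor"], fits["two_factor"])
    z_two = chi2_to_z(*fits["two_factor"])
    print(
        f"{group}: delta chi2 = {d_chi2:.2f} on {d_df} df "
        f"(exceeds {CRIT_05_DF6}: {d_chi2 > CRIT_05_DF6}); "
        f"two-factor z = {z_two:.2f}"
    )
```

In both groups the χ² difference is far above the critical value, which is the sense in which the two-factor solutions are "significantly better" even though neither model fits by the strict χ² criterion.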
The first LISREL run involving McDonald’s (1984) model produced an outcome in which the “total coefficient of determination for Y-variables” was negative in the control group. This appeared to be caused by the presence of two variables, Tests 2 and 6, which had negative squared multiple correlations with the other tests of the battery. These two tests were therefore removed from the battery. The “total coefficient of determination for Y-variables” then was found to be positive. The main analyses were based on the nine surviving variables on three occasions. The first four variables are the presumed markers for Gc and the last five variables are markers for Gf.
Main Analyses
Starting matrices for these analyses consisted of the raw product-moment matrices for the Experimental and Control groups.² The results appear in Table 3. We shall now describe the meaning of the different matrices in this table. A complete computer output for this solution can be obtained by writing to the senior author.
Invariant factor pattern. The top part of Table 3 contains factor pattern matrices for the experimental and control groups. In both groups the same two factors (i.e., fluid and crystallized intelligence) are fitted for all three occasions. In this case factorial invariance exists between the occasions of testing. On the other hand, in the present analyses invariance across groups refers to the pattern of nonzero values, not to the requirement that the elements be exactly the same in both groups (see Horn, McArdle, & Mason, 1983). This latter case is a somewhat relaxed definition of invariance which provides for a better model fit. As mentioned earlier, the starting matrices contain elements which are not correlation coefficients. This means that factor pattern coefficients should be evaluated with respect to the variables’ standard deviations. They can be rescaled to provide values familiar to typical users of factor analysis, i.e., regression coefficients which rarely exceed 1.00. Rather than rescaling the obtained values, we reproduce them in the form generated by the computer. As a general guide for evaluating these coefficients, we can say that factor pattern values, in fact, would be .20 and above if starting values were correlation coefficients. The solution corresponds to the hypothesized factor pattern for the data at hand. A separate right-hand column, entitled the “Baseline test means,” represents the vector of arithmetic means estimated by the model for each variable. These means are produced by the program and they provide a fix on the origin of the scale of measurement.
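The construction of the starting matrices, which the paper describes in its Footnote 2, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' program: the function name and the toy input values are hypothetical, and the augmenting vector is assumed to be added as both the last row and the last column so that the result is square and symmetric (33 by 33 becoming 34 by 34 in the paper's case).

```python
def augmented_moment_matrix(means_by_occasion, cov):
    """Build an augmented raw-moment matrix from occasion means and covariances.

    means_by_occasion: list of per-occasion lists of test means (same tests,
        same order on every occasion).
    cov: square covariance matrix (nested lists) for all tests on all
        occasions, ordered occasion by occasion.
    """
    first = means_by_occasion[0]
    # Gain-score means: each occasion's means minus the first occasion's means,
    # so the block for the first occasion is all zeros.
    gains = [m - f for occasion in means_by_occasion for m, f in zip(occasion, first)]
    n = len(gains)
    if len(cov) != n or any(len(row) != n for row in cov):
        raise ValueError("covariance matrix does not match the number of means")
    # Add the outer product of the gain-score vector to the covariance matrix.
    moment = [[cov[i][j] + gains[i] * gains[j] for j in range(n)] for i in range(n)]
    # Border the matrix with the gain-score means and a final element of 1.0
    # (assumed here to be applied symmetrically, as a row and a column).
    augmented = [row + [gains[i]] for i, row in enumerate(moment)]
    augmented.append(gains + [1.0])
    return augmented


# Hypothetical toy input: two tests on two occasions, identity covariance.
toy_means = [[10.0, 5.0], [12.0, 6.0]]
toy_cov = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
for row in augmented_moment_matrix(toy_means, toy_cov):
    print(row)
```

With 11 tests on three occasions the same steps would yield the 34 by 34 matrix mentioned in the footnote; the toy example produces a 5 by 5 matrix.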
The obtained arithmetic mean for a given variable on a particular occasion (i.e., the “gain scores’ mean,” see Footnote 2) differs from its latent test mean due to the changes in factor scores’ means between the occasions. The solution in Table 3 contains a constraint needed in order to allow for comparison of factor score means between the two groups. In this solution it was stipulated that the latent test means be exactly the same in both experimental and control groups. The solution in this table produced a χ² value of 1128.52 with 697 degrees of freedom. This χ² corresponds to a z value of 13.03. Separate analyses (not to be reported here), without the above constraint, were carried out. This solution had the same number of degrees of freedom (df = 697) but its z value was smaller: 10.19.³ In order to be able to evaluate the data reported here, the reader may want to know the following. First, with the given number of degrees of freedom, the acceptable solution would have to produce a χ² value of less than 800. As pointed out by Horn and McArdle (1980), this close correspondence between the number of degrees of freedom and the χ² value is very hard to achieve with real-life data. Second, in an extensive series of studies with Wechsler’s scales carried out by McArdle and Horn (see Horn, 1985), the preferred solution had a z value of about 10 which differs rather little from what we have here. It can be claimed that the reported solution is about as good as could realistically be obtained.

² The 34 by 34 input matrices (prior to the exclusion of Tests 2 and 6) were generated in the following way. First, arithmetic means on each occasion were calculated and the first occasion means were subtracted from the means for the other two occasions. The obtained “gain scores’ means” were arranged into a 33 by 1 vector (for the first occasion, 11 zeros were entered). This vector was then postmultiplied by its transpose and the resulting 33 by 33 matrix was added to the 33 by 33 covariance matrix (11 variables on three occasions). This matrix was then augmented by a 1 by 34 vector containing 33 gain scores’ means as the first 33 elements and 1.00 as the 34th element.
³ Formulas for transforming χ² values with more than 30 degrees of freedom into the z values are provided in some statistical textbooks (see Kirk, 1978).

TABLE 3
INVARIANT FACTOR STRUCTURE MODEL FOR PARALLEL GROUPS: LISREL SOLUTIONSᵃ

                                 Control Group       Experimental Group     Baseline
                                  Gc       Gf           Gc       Gf         test mean
a. Invariant factor pattern
Testᵇ
1. Essential Features            1.299    0.0          3.265    0.0          6.501
3. Proverbs                      0.986    0.0          1.975    0.0          5.210
4. Verbal Analogies              0.921    0.0          1.320    0.0          5.113
5. Disarranged Sentences         0.937    0.0          1.710    0.0          4.582
7. Projections in Water          0.0      0.358        0.0      0.344        0.584
8. Figure Series                 0.0      1.027        0.0      0.936        1.638
9. Matrices I                    0.0      0.418        0.0      0.413        0.763
10. Matrices II                  0.0      0.382        0.0      0.632        0.600
11. Matrices III                 0.0      0.884        0.0      0.849        1.424
b. Factor scores’ means
Occasions
Initial                         -5.200   -1.604       -2.529   -1.484
Final                           -1.181    -.148        -.104    1.402
2nd Retest                       1.561     .695        1.647    2.500
c. Factor scores’ variancesᶜ
Occasions
Initial                          7.516    3.288        2.595    3.871
Final                            6.210    2.398        2.352     .979
2nd Retest                       5.178    2.211         .814     .276

ᵃ This LISREL solution is not complete: missing are “specific” tests’ variances and covariances among the same tests on different occasions.
ᵇ Variables 2 and 6 are not included in this analysis.
ᶜ This model assumes that there are no covariations between factor scores. A model which assumes such covariations proved to be equally satisfactory.

Summary statistics regarding factor scores. The lower part of Table 3 contains factor score means and variances as estimated by LISREL. These are of interest to us in this paper since our aim is to find out if there is an increase in factor score means across occasions. A successful training program should produce an increase in the overall level of performance. We also want to find out if the improvement in the Experimental group is greater than the improvement in the Control group for Gf or Gc factors. A positive outcome would provide further support for the distinction between fluid and crystallized intelligence. There are various approaches which could be used for these purposes, but, since rules for hypothesis testing with structural modelling procedures do not exist, we decided not to employ statistical tests in this case.⁴ However, it can be seen in Table 3 that all factor score means increase over the occasions of testing. In this context, the effect of training was to spread the between-occasions variance on Gf and to narrow the between-occasions variance on Gc. That is consistent with the claim that the school curriculum on which the exercises were based leads to a certain uniformity in educationally based components of intelligence. Training in creative problem solving, on the other hand, leads to a development of idiosyncratic and incidental-learning skills which are embodied in our idea of fluid abilities. Comparison of the “Initial” and the “2nd Retest” occasions only shows that the greatest increase is on the crystallized intelligence factor and the smallest increase is on the fluid intelligence factor for the control group. Similar conclusions would follow from comparing the “Initial” and “Final” occasions. These differences are provided in Table 4.
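The differences reported in Table 4 follow directly from the Table 3 estimates, and the arithmetic can be sketched as below. The block is an illustration only: the assignment of each value to a group and factor is read from the reconstructed Table 3 (an assumption, although it agrees with the differences printed in Table 4), and the function name is hypothetical. Each difference d is the “2nd Retest” factor score mean minus the “Initial” mean, s is the square root of the “Initial” variance, and d/s is the standardized difference.

```python
import math

# (Initial mean, 2nd Retest mean, Initial variance), transcribed from Table 3.
# Keys are (group, factor); the group/factor assignment is an assumption.
ESTIMATES = {
    ("Control", "Gc"): (-5.200, 1.561, 7.516),
    ("Control", "Gf"): (-1.604, 0.695, 3.288),
    ("Experimental", "Gc"): (-2.529, 1.647, 2.595),
    ("Experimental", "Gf"): (-1.484, 2.500, 3.871),
}


def gain_summary(initial_mean, retest_mean, initial_variance):
    """Return the raw difference d, the initial SD s, and the ratio d/s."""
    d = retest_mean - initial_mean
    s = math.sqrt(initial_variance)
    return d, s, d / s


for (group, factor), values in ESTIMATES.items():
    d, s, ratio = gain_summary(*values)
    print(f"{group:12s} {factor}: d = {d:.3f}, s = {s:.3f}, d/s = {ratio:.3f}")
```

The recomputed values agree with the first three rows of Table 4 to within rounding of the standard deviations.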
To evaluate the results, we need to consider the factor score variances as well. The variances differ not only with respect to factors and groups, but also with respect to occasions of testing. In particular, the variances on the last occasion in the Experimental group are considerably smaller than all the other variances considered here. This was also observed by Kvaschev (1980) and Stankov (1986).

4 Since the main problem here derives from the inequality of factor scores’ variances and differences in factor scores’ means on the initial occasion, it would be possible to constrain them to be equal across the groups and/or factors. Once this is accomplished, it may be possible to constrain factor scores’ means in the Experimental group to be equal to factor scores’ means in the Control group and observe changes in the χ² value under these different conditions. It is certain, however, that all these constraints would provide an extremely poor fit of the model and, in the absence of clear-cut rules for carrying out these types of analyses, we feel that it is preferable to employ a more impressionistic approach at this stage.

TABLE 4
CHANGES IN PERFORMANCE ON FLUID AND CRYSTALLIZED INTELLIGENCE FACTORS

                                                             Control          Experimental
                                                            Gc       Gf       Gc       Gf
Differences between the “Initial” and “2nd Retest”
  factor scores’ means (d)                                 6.761    2.299    4.176    3.984
Standard deviations from the “Initial” occasion (s)        2.741    1.813    1.611    1.967
Standardized differences (d/s)                             2.467    1.268    2.592    2.025
Standard errors of the factor scores’ means on the
  “Initial” occasion (I)                                    .382     .233     .178     .211
Standard errors of the factor scores’ means on the
  “2nd Retest” occasion (R)                                 .239     .155     .104     .173
z tests (d divided by the sum of I and R)                 10.890    5.925   14.812   10.375

Under these conditions, one may use either a
pooled variance or the variance on the “Initial” occasion as a standard in order to evaluate the differences between the factor score means. Since the main conclusions do not differ, we present the results with respect to the “Initial” occasion only, a rather cautious approach. The second row of Table 4 contains standard deviations for the factor scores on the “Initial” occasion and the third row displays the ratios of the “Initial vs 2nd Retest” differences to these standard deviations. It can be seen from Table 4 that the ordering of the standardized differences is not the same as the ordering of the simple differences. It is apparent that, on the crystallized intelligence factor, the Experimental group has not achieved much more than the Control group (2.592 vs 2.467). With fluid intelligence, the situation seems different: the improvement of the Experimental group is greater than that of the Control group (2.025 vs 1.268).

An alternative procedure involves the use of standard errors produced by the LISREL program for parameters of the model presented in Table 3. The lower part of Table 4 contains standard errors associated with the “Initial” and the “2nd Retest” factor score means. After making an assumption that these two occasions are uncorrelated, we can simply add the two standard errors and divide the elements of the first row of Table 4 by the resulting sum. This produces a (conservative) estimate of the z test for comparing factor scores’ means on these two occasions. The resulting z tests are shown in the very last row of Table 4. It is apparent that, relative to the Control group, the Experimental group’s increase on the Gf factor is greater (5.925 vs 10.375) than its increase on the Gc factor (10.890 vs 14.812). Thus, it appears that the training in creative problem solving has affected fluid intelligence somewhat more than crystallized intelligence. This conclusion is in agreement with the claims made by Stankov (1986) on the basis of (univariate) analyses different from the ones presented in this paper. It is also in agreement with the trend present in Table 2 (i.e., fluid intelligence tests show a stronger Group by Occasions interaction).

The other finding of Table 2 which is supported by the data of Table 4 is the greater “Occasions” effect for crystallized intelligence. Thus, within each group, the standardized difference is greater for crystallized intelligence than it is for fluid intelligence (2.592 vs 2.025 for the Experimental and 2.467 vs 1.268 for the Control group). This means that crystallized intelligence changed more during the 4-year period, but, as we have seen above, the training itself affects fluid intelligence somewhat more.

To complete the model, we should also report the covariances between the factors on each occasion, the specific variances for each variable, and indeed, the covariances between the specific parts of the same test on different occasions. These were also estimated as free parameters in LISREL. They are not presented here in order to save space.

DISCUSSION

The findings of the present study are of importance both to substantive theories about the structure of human cognitive abilities, and to those interested in recent data-analytic developments. Theories of intelligence have to acknowledge the fact that intensive and prolonged training in creative problem solving can improve performance on intelligence tests. This improvement is due to exercises designed to encourage the development of general cognitive schemas rather than practice in working through the intelligence test items. In other words, far-transfer rather than near-transfer was emphasized in the training exercises. The achieved improvement was statistically significant as evidenced by the univariate analysis of variance.
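The standardized differences and z tests of Table 4, on which the comparison of Gf and Gc gains rests, can be checked with a few lines of arithmetic. The following is a minimal Python sketch and not part of the original LISREL analysis: the dictionary layout and function names are illustrative assumptions, and the assignment of values to factors follows the figures quoted in the text (e.g., d/s = 2.467 for the Control group on Gc).

```python
# Values transcribed from Table 4. Keys are (group, factor); the pairing of
# values with factors is an assumption based on the figures quoted in the text.
TABLE4 = {
    ("control", "Gc"): {"d": 6.761, "s": 2.741, "se_initial": 0.382, "se_retest": 0.239},
    ("control", "Gf"): {"d": 2.299, "s": 1.813, "se_initial": 0.233, "se_retest": 0.155},
    ("experimental", "Gc"): {"d": 4.176, "s": 1.611, "se_initial": 0.178, "se_retest": 0.104},
    ("experimental", "Gf"): {"d": 3.984, "s": 1.967, "se_initial": 0.211, "se_retest": 0.173},
}


def standardized_difference(d, s):
    """Mean difference expressed in 'Initial'-occasion standard deviations (d/s)."""
    return d / s


def conservative_z(d, se_initial, se_retest):
    """Mean difference divided by the SUM of the two standard errors (I + R)."""
    return d / (se_initial + se_retest)


def summarize(table):
    """Return {(group, factor): (d/s, z)} rounded to three decimals."""
    out = {}
    for key, v in table.items():
        ds = standardized_difference(v["d"], v["s"])
        z = conservative_z(v["d"], v["se_initial"], v["se_retest"])
        out[key] = (round(ds, 3), round(z, 3))
    return out


if __name__ == "__main__":
    for (group, factor), (ds, z) in summarize(TABLE4).items():
        print(f"{group:12s} {factor}: d/s = {ds:.3f}, z = {z:.3f}")
```

Adding the two standard errors gives a denominator that is never smaller than the root-sum-of-squares form, sqrt(I² + R²), which would apply to two independent estimates; this is one way to see why the resulting z values can be regarded as conservative.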
Multivariate analysis did not allow for a test of significance but the results, again, suggest a noteworthy improvement in performance. This substantive conclusion needs to be carefully qualified because of the findings reported by Stankov (1986). He obtained z scores for 28 individual tests of Kvashchev’s (1980) study, using, as we do in this paper, the “Initial” standard deviations as denominators in the z scores formula. After transforming these scores into their typical IQ scale equivalents (i.e., standard deviation = 15), the improvement for the various tests ranged between 5 and 8 IQ points. This is not a massive improvement considering the amount of training involved but, overall, it does encourage an optimistic view for the role of educational intervention.

Given the nature of training in Kvaschev’s experiment, it is perhaps necessary to emphasize that our results point to the importance of divergent thinking, of flexibility, and of the need to approach cognitive tasks in a number of different, unusual, and creative ways. In this sense, the results are opposed to suggestions often associated with the “back to the three R’s” movement, which seem to discourage that way of thinking.

Another substantive theoretical issue that is addressed with the present data involves a comparison of the relative effects of training on processes tapped by fluid and crystallized intelligence. Since crystallized intelligence has been associated with the outcomes of formal education and acculturation, the present result may appear surprising. We emphasize, however, that this result is still open to question. There are two reasons for our caution. First, the difference in improvement between fluid and crystallized intelligence factors is very small indeed. Second, when we rescaled our raw test scores into “percentage correct” and carried out analyses similar to those in the present paper, the outcome was slightly in favor of the crystallized intelligence factor. These two facts suggest that a relatively small change in the sampling of subjects and variables may alter this particular conclusion of the present paper. It is necessary to mention that our finding of greater improvement in fluid intelligence does not contradict basic tenets of the theory of fluid and crystallized intelligence since, as emphasized in the introduction, fluid intelligence can be seen as an expression of incidental and casual learning.

Our results have methodological significance as well. It is obviously feasible to use the procedures developed in the area of linear equations modeling and analysis of covariance structures in experiments involving a parallel-groups design. The major advantage of the procedure employed here over the univariate approaches lies in the fact that it does not discard information regarding the interrelationship among different variables within and between occasions.
It also provides for a simultaneous test of whether the two-factors solution fits the data adequately and, in principle, of the differences between the factor scores’ means in Experimental and Control groups. In this regard it is necessary to note that, although we have emphasized changes in terms of factor scores, the procedure is quite flexible and, in fact, allows for testing various other possibilities, e.g., whether factors appear or disappear on different occasions. It will be important, however, to develop a detailed set of rules for employing this method in a logically defensible way. It will also be important to develop statistical machinery which will allow us to employ principles analogous to hypothesis testing in multivariate studies.

REFERENCES

BELMONT, J. M., BUTTERFIELD, E. C., & FERRETTI, R. P. (1982). To secure transfer of training instruct self-management skills. In D. K. Detterman & R. J. Sternberg (Eds.), How and how much can intelligence be increased? Norwood, NJ: Ablex Publishing.
CARUSO, D. R., TAYLOR, J. J., & DETTERMAN, D. K. (1982). Intelligence research and intelligent policy. In D. K. Detterman & R. J. Sternberg (Eds.), How and how much can intelligence be increased? Norwood, NJ: Ablex Publishing.
CATTELL, R. B. (1971). Abilities: Their structure, growth and action. Boston: Houghton Mifflin.
DETTERMAN, D. K., & STERNBERG, R. J., Eds. (1982). How and how much can intelligence be increased? Norwood, NJ: Ablex Publishing.
HORN, J. L. (1975). Psychometric studies of aging and intelligence. In S. Gershon and A. Raskin (Eds.), Aging: Vol. 2. Genesis and treatment of psychologic disorders in the elderly. New York: Raven Press.
HORN, J. L. (1981). Concepts of intellect in relation to learning and adult development. Intelligence, 4, 285-317.
HORN, J. L. (1985). Remodeling old models of intelligence. In Wolman (Ed.), Handbook of intelligence. New York: Wiley.
HORN, J. L. (1986). Models of intelligence. In L. G. Humphreys (Ed.), Intelligence: Measurement, theory and public policy. Urbana, IL: Univ. of Illinois Press.
HORN, J. L., & MCARDLE, J. J. (1980). Perspectives on mathematical/statistical model building (MASMOB) in research on aging. In L. Poon (Ed.), Aging in the 1980’s. Washington, DC: American Psychological Association.
HORN, J. L., MCARDLE, J. J., & MASON, R. C. (1983). When is invariance not invariant: A practical scientist’s view of the ethereal concept of factorial invariance. The Southern Psychologist, 1, 179-188.
HULL, C. H., & NIE, N. H. (1981). SPSS, Update 7-9. New York: McGraw-Hill.
JORESKOG, K. G., & SORBOM, D. (1981). LISREL V: User’s guide. Chicago, IL: International Educational Resources, Inc.
KIRK, R. E. (1978). Introductory statistics. Belmont, CA: Wadsworth.
KVASCHEV, R. (1980). Mogucnosti i granice razvoju inteligencije. [The Feasibility and Limits of Intelligence Training]. Belgrade: Nolit.
MCDONALD, R. (1978). A simple comprehensive model for the analysis of covariance structures. British Journal of Mathematical and Statistical Psychology, 31, 59-72.
MCDONALD, R. (1980). A simple comprehensive model for the analysis of covariance structures: Some remarks on applications. British Journal of Mathematical and Statistical Psychology, 33, 161-183.
MCDONALD, R. P. (1984). The invariant factors model for multimode data. In H. G. Law, C. W. Snyder, Jr., J. A. Hattie, and R. P. McDonald (Eds.), Research methods for multimode data analysis. New York: Praeger Scientific.
STANKOV, L. (1980). Ear differences and implied cerebral lateralization on some intellective auditory factors. Applied Psychological Measurement, 4(1), 21-38.
STANKOV, L. (1983). The role of competition in human abilities revealed through auditory tests. Multivariate Behavioral Research Monographs, No. 83-1, pp. 63 & VII.
STANKOV, L. (1986). Kvashchev’s experiment: Can we boost intelligence? Intelligence, 10(3), 209-230.
STANKOV, L. (1988). Aging, intelligence and attention. Psychology and Aging, 3(2), 59-74.
WILLIS, S. L., BLIEZNER, R., & BALTES, P. B. (1981). Intellectual training research in aging: Modification on performance on the fluid ability to figural relations. Journal of Educational Psychology, 73, 41-50.