The effectiveness of the cognitive training for children from a differential perspective: a meta-evaluation

Learning and Instruction, Vol. 8, No. 5, pp. 411–438, 1998. © 1998 Elsevier Science Ltd. All rights reserved. Printed in Great Britain. 0959-4752/98 $19.00 + 0.00


PII: S0959-4752(98)00003-6

THE EFFECTIVENESS OF THE COGNITIVE TRAINING FOR CHILDREN FROM A DIFFERENTIAL PERSPECTIVE: A META-EVALUATION

WILLI HAGER† AND MARCUS HASSELHORN*‡

†Universität Göttingen, Institut für Psychologie, Gosslerstr. 14, D-37073 Göttingen, Germany
‡Universität Göttingen, Institut für Psychologie, Abt. 4: Pädagogische Psychologie, Waldweg 26, D-37073 Göttingen, Germany
*Address for correspondence: Universität Göttingen, Institut für Psychologie, Abt. 4: Pädagogische Psychologie, Waldweg 26, D-37073 Göttingen, Germany.

Abstract

The Cognitive Training for Children (CTC) was designed by Klauer to teach children thinking strategies that would increase their inductive reasoning. This article reviews seventeen studies designed to evaluate the effectiveness of the CTC. In contrast to popular meta-analysis with its statistical techniques of averaging effect sizes, a qualitative meta-evaluation based on quantitative findings was undertaken from a differential perspective. This approach reveals some interesting information concerning the why and how of the CTC's overall effectiveness. Problems concerning the appropriateness of the criterion variables used, as well as the control groups included in the various studies, were considered thoroughly. A discussion of the program author's claim of domain specificity of the effectiveness of the CTC and of the present authors' perception hypothesis is provided. Despite the unambiguous merits of the CTC, it is concluded that changes in the efficiency of visual perception are primarily responsible for the effectiveness of the program among younger children, and that the CTC is neither worse than nor superior to available rival cognitive training programs. © 1998 Elsevier Science Ltd. All rights reserved.

Since the late eighties, various programs to foster children's cognitive abilities have been developed and published in Europe, especially in Germany. One of the most inspiring and ambitious of these programs is the Cognitive Training for Children, which was designed to enhance children's inductive reasoning ability. It was originally introduced in German by Klauer (1989a), and it is by now also available in English (see Klauer & Phye, 1994) and Dutch versions (Klauer, Resing, & Slenders, 1995). The Cognitive Training for Children (CTC) has been given a great deal of attention, and its effectiveness has been evaluated in a number of studies by the author's research group (internal evaluations) and in several studies by other authors (external evaluations, according to the distinction made by Scriven, 1991). With a few exceptions, the majority of studies leave little doubt that the program is effective, if effectiveness is assessed by conventional tests of general fluid intelligence. There is a continuing debate, however, about which particular effects of the program are responsible for the observed increases in intelligence test performance. In this context, problems concerning the appropriateness of the criterion variables used to assess the goals of the CTC, as well as the adequacy of the comparison groups used to evaluate its effectiveness, are central to the debate. In the present paper, a meta-evaluation (Cook & Gruder, 1978; Scriven, 1991) of the studies evaluating the effectiveness of the German version is presented to provide a thorough basis for estimating the range of the program's benefits. After a short characterization of the Cognitive Training for Children, we present some important distinctions concerning different kinds of empirical evaluations, and we outline the above-mentioned problems regarding criterion variables and comparison groups in prevailing evaluations of the CTC. Based on this general frame, our meta-evaluation illuminates the program's objectives, its transfer, and the duration of its effects.

The Cognitive Training for Children (CTC)

The Cognitive Training for Children (CTC) was developed for the proximal goal of achieving higher levels of inductive reasoning ability. Although CTC versions for older children are available, the present meta-evaluation focuses on the CTC for children between the ages of five and seven (six through eight; Klauer, 1993). In addition, the program is intended to increase children's ability to solve analytical problems, which is considered to be of fundamental importance during the whole life span (Klauer, 1989a, p. 3). Klauer (1996, p. 38) states that "inductive reasoning consists in finding out regularities and irregularities by detecting similarities and/or differences of attributes and/or relations with respect to objects or n-tuples of objects". This is taught by means of 120 tasks representing six basic types of information processing procedures: generalization, discrimination, cross classification, recognizing relations, differentiating relations, and system construction (see Klauer & Phye, 1994, pp. 9–15). For each type of processing to be trained, the complexity of the presented material increases from concrete objects through pictures to abstract symbols. The concrete objects are wooden bricks, which, however, are not delivered with the program. The CTC can be used to train single children or children in small groups; training of larger groups is considered possible (Klauer, 1989a), but has not yet been evaluated empirically. The author proposes verbal self-instruction, guided discovery, and self-reflection as instructional methods. The training is usually distributed over 10 sessions of about 20 minutes each (Klauer, 1989a), although there are some exceptions (see below). For most children, this total of about 200 minutes is too short to work through all 120 tasks, but nearly all empirical evaluations refer to this time schedule. In the psychometric tradition, inductive reasoning is viewed as a component of general fluid intelligence gf (e.g., Cattell, 1963; Horn, 1985), and Klauer and his colleagues (Klauer, 1989a, 1996; Klauer & Phye, 1994) expect certain transfer effects of the CTC to specific other components of fluid ability, but neither to crystallized abilities (gc) nor to other areas of intelligence (the claim of domain specificity of the effects).
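Klauer's definition casts all six task types as comparisons of attributes or relations across objects. Purely as an illustration of this idea, and not as material taken from the CTC itself (the objects, attribute names and the small solver below are invented), a generalization-type item of the kind just described might be sketched as follows:

```python
# Hypothetical illustration of a "generalization" item in Klauer's sense:
# the child must detect the attribute shared by all but one object.
# Neither the objects nor the solver below stem from the CTC materials.

from collections import Counter

# Each object is described by a small set of attributes.
items = [
    {"shape": "circle", "colour": "red",  "size": "big"},
    {"shape": "square", "colour": "red",  "size": "small"},
    {"shape": "star",   "colour": "red",  "size": "big"},
    {"shape": "circle", "colour": "blue", "size": "small"},  # the odd one out
]

def odd_one_out(objects):
    """Return the index of the object that breaks the regularity,
    i.e. the one differing on an attribute shared by all others."""
    for attribute in objects[0]:
        values = [obj[attribute] for obj in objects]
        counts = Counter(values)
        if len(counts) == 2 and min(counts.values()) == 1:
            rare_value = min(counts, key=counts.get)
            return values.index(rare_value), attribute
    return None

print(odd_one_out(items))   # -> (3, 'colour'): object 3 differs in colour
```

Discrimination items would, roughly speaking, invert this logic by asking for the difference rather than the shared attribute, and the relational task types apply the same comparisons to relations between objects instead of single attributes.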

Scope and Methodology of the Present Meta-Evaluation

Effect Sizes and Tests of Significance

Over the last years, various internal meta-analyses of the Cognitive Training for Children have been provided (see Klauer, 1990b, 1991, 1993, 1994, 1995; Klauer & Phye, 1994). In these meta-analyses, effect sizes of approximately half a standard deviation (d = 0.50), averaged over different types of studies, settings, children of different levels of age and ability, and various criterion variables, have been reported, and it has been concluded that the CTC is effective overall. We doubt neither the validity of the internal meta-analyses nor the conclusion concerning the overall effectiveness of the CTC. However, although accumulating and averaging available data is a valuable approach, "knowledge of average effects says nothing about when, where, why, and how a program works" (Shadish & Sweeney, 1991, p. 883). Since the potential user will mainly be interested in questions concerning the when, where, why, and how of this effectiveness, we present a number of differential analyses of the empirical basis for the CTC's effectiveness. In doing so, we pay special attention to two principal problems we were faced with when analyzing the prevailing internal empirical evaluations and meta-analyses in some detail. The first problem concerns the appropriateness of the criterion variables used to assess the effectiveness of the program. The second problem concerns the comparison groups used to evaluate the program's effectiveness. Before considering both problems in more detail, some remarks about the statistical analysis of the available evaluation studies and the estimation of effect sizes are necessary.

Usually, the evaluations of the CTC have been based on a pretest–posttest design, which is sometimes supplemented by a follow-up several months after completion of the program. The program author prefers analyses of covariance to compare his experimental groups, whereas other authors use either planned contrasts of means adjusted by analysis of covariance (ANCOVA), planned contrasts of means in a repeated measures analysis of variance (RM-ANOVA), or conventional repeated measures analyses of variance, including randomized block designs. A comparison made by the first author (Hager, 1996) between the interaction F tests of RM-ANOVA and the F tests of ANCOVA for over one hundred different data sets did not reveal any statistical advantage of one method of analysis over the other; in only one case did ANCOVA lead to a significant result while the interaction F test of the RM-ANOVA remained insignificant. This means that the results of the statistical analyses are likely to be independent of whether an ANCOVA or an RM-ANOVA is used, whereas planned contrasts usually provide more powerful tests than the F tests of the aforementioned procedures followed by two-sided post hoc tests. However, the residual scores of analysis of covariance have come under severe attack, partly because of their lack of interpretability, whereas the difference score underlying repeated measures analysis of variance, although repeatedly criticized for about a quarter of a century, constitutes a reliable measure that is easily interpretable, "intuitively appealing, easy to compute", and, above all, "an unbiased estimator of the underlying true change" (Willett, 1990, p. 634). For this reason, most external evaluation studies of the CTC rely on RM-ANOVA. An exception are the studies presented by Angerhoefer, Kullik, & Masendorf (1992) and by Masendorf (1994), in which χ²-tests on individual residual change scores were performed, using a single-case approach instead of the usual comparisons between groups.
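To make the two analysis strategies concrete, the following sketch runs both on simulated data: a gain-score comparison, which for two groups and two occasions is equivalent to the interaction F test of the repeated measures ANOVA, and an ANCOVA with the pretest as covariate. The sample sizes, group labels and the injected training effect are arbitrary assumptions for the illustration and do not come from any CTC study:

```python
# Minimal sketch of the two analysis strategies discussed above, run on
# simulated (not real) pretest-posttest data for one trained and one
# untrained group; sample sizes and the injected effect are arbitrary.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(1)
n = 20
pre = rng.normal(50, 10, size=2 * n)
group = np.repeat(["CTC", "control"], n)
gain = np.where(group == "CTC", 5.0, 0.0)            # hypothetical training effect
post = pre + gain + rng.normal(0, 8, size=2 * n)
df = pd.DataFrame({"pre": pre, "post": post, "group": group})

# (1) Gain-score comparison (equivalent to the interaction test of a
#     two-group repeated measures ANOVA): compare post - pre between groups.
gains = df.assign(diff=df.post - df.pre)
t, p = stats.ttest_ind(gains.loc[gains.group == "CTC", "diff"],
                       gains.loc[gains.group == "control", "diff"])
print(f"gain-score t-test: t = {t:.2f}, p = {p:.3f}")

# (2) ANCOVA: posttest regressed on group with the pretest as covariate.
ancova = smf.ols("post ~ pre + C(group)", data=df).fit()
print(ancova.summary().tables[1])
```

On data of this kind the two approaches will typically agree on whether the groups differ in their pretest–posttest change, in line with the pattern Hager (1996) reports for the CTC data sets.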

Given a standard pretest–posttest design and these methods of analysis, there are various possibilities to calculate effect sizes, usually defined as a standardized distance between two or four means (Mjk). To attain maximal comparability between studies, we re-calculated the values of all important comparisons wherever possible. The main sources of our recalculations were the original data,* and the following equation was used to express the size of the statistical effect dI for the interaction contrast used for the comparisons between the programs:

dI = [M(post)TG − M(post)CG − M(pre)TG + M(pre)CG] / sI,

where M = arithmetic mean of the dependent variable; pre = pretest; post = posttest; TG = training group; CG = comparison group (control group, no-treatment group, or rival training; see below); and sI = within standard deviation averaged over all four treatment conditions. The choice of the mean standard deviation sI reflects the assumption of homogeneous variances within each experimental condition in ANOVA or ANCOVA. Its use enables better comparisons with the effect sizes computed in designs without repeated measures. In some cases we have to report the effect sizes dc computed by the authors, that is, the difference between the two groups' pretest–posttest differences, standardized by the standard deviation in the comparison group; this measure has most often been used in existing internal evaluations of the CTC (see Klauer & Phye, 1994, p. 62, for details of its computation).

*We gratefully acknowledge the cooperation of the program author, Professor K. J. Klauer, who provided us with the original data of most evaluations of his own research group.

Statistical significance shows whether there is an effect or not; additional computation of the effect size then shows how large this effect is. Tests of significance and effect sizes are viewed here as complementary pieces of information, not as alternatives, although other views are possible (for a detailed exposition of this issue, see Hager, 1995; Chow, 1996). Therefore, we only report effect sizes for statistically significant results, in contrast to many meta-analytic procedures applied by the program author, who computed effect sizes irrespective of whether a particular result was statistically significant or not.

Various proposals can be found in the literature (cf. Bloom, 1984; Hager, 1995; Friedrich & Mandl, 1992) as to which (statistical) effect may be called large (or of sufficient size) and which may be called small. But there is no general rule for answering this question, since the size of an effect depends on the sample at hand, the program considered and its concrete implementation (program integrity), the criterion variables used, the abilities of the children trained, and, among other factors, on considerable random fluctuations. In addition, it is debatable whether it is preferable to accept rather small effects that show up reliably, or only rather large effects that show up under some conditions but not under others, that is, effects that are not very reliable. In our opinion, questions concerning the actual sizes of (statistical) effects can only be answered during or after a complete evaluation research program. One should avoid referring to conventions like those proposed by Cohen (1988) for cases where no other criteria are available.
For example, the statement that a large effect according to Cohen has been detected may hide the fact that the respective criterion variable is very similar to the tasks of the training program, in which case large effects immediately after completion of the program seem trivial. On the other hand, detection of a small but reliable effect may be far more valuable if it refers to a criterion variable quite dissimilar to the program's problems, one which involves some kind of transfer. For this reason, we only report the sizes of the effects without classifying their magnitudes.
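As a purely numerical illustration of the interaction effect size dI defined above (the means and standard deviations are invented and do not stem from any of the studies reviewed), consider:

```python
# Worked example of the interaction effect size d_I defined above,
# using invented means and standard deviations (not from any CTC study).

# Pretest and posttest means for the training group (TG) and the
# comparison group (CG).
m_pre_tg, m_post_tg = 20.0, 27.0
m_pre_cg, m_post_cg = 21.0, 23.0

# Within-group standard deviations of the four cells; s_I is taken here as
# their simple mean, following the homogeneity assumption stated in the text.
sds = [6.2, 5.8, 6.0, 6.0]
s_i = sum(sds) / len(sds)

d_i = ((m_post_tg - m_post_cg) - (m_pre_tg - m_pre_cg)) / s_i
print(round(d_i, 2))   # 0.83: the training group gained about 0.8 SD more
```

The dc measure reported in some internal evaluations differs mainly in its standardizer, using the standard deviation of the comparison group rather than the average within-group standard deviation (Klauer & Phye, 1994, p. 62).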

The Problem of the Appropriateness of the Criterion Variables

To evaluate the effectiveness of the program with respect to its primary goal of improving inductive reasoning, criterion variables have to be considered that are valid measures of children's inductive reasoning ability. However, no test is available in the German language that has been developed specifically to assess this particular ability and that explicitly asks children to detect or find regularities and rules (see, e.g., Büchel, 1992). Possibly for this reason, most of the evaluation studies of the CTC have used traditional intelligence tests, or subtests of these, to assess the effects of the program. For example, Raven's Coloured Progressive Matrices (CPM) (Raven, 1973; German version by Becker, Schaller, & Schmidtke, 1980) have been used in several studies. Although Raven intended this test to assess general fluid intelligence, the authors of the German version (Becker et al., 1980) point out that the CPM also require perceptual abilities and inductive-analogous reasoning abilities to different degrees at different ages. Many of the tasks, especially of the CPM, have been shown to be solvable by perceptual analyses alone; only some tasks necessarily require higher-order reasoning or conceptual processes such as inductive reasoning (see, e.g., Becker et al., 1980, pp. 16–18; cf. also Hunt, 1974). Accordingly, any improvements in tests like Raven's CPM may be due to improvements of inductive reasoning ability, but they may also be due to improvements of other components of fluid intelligence such as perceptual speed or memory components. This particular part–whole relation between inductive reasoning and general fluid intelligence is more or less a problem of all the criterion variables used to evaluate the effectiveness of the CTC: if inductive reasoning is trained successfully, general fluid intelligence gf is also enhanced, but enhancement of gf does not necessarily imply that inductive reasoning has been trained successfully. This ambiguity makes it difficult to decide whether training-specific increases in test performance can be attributed to successful training of inductive reasoning (the primary goal of the CTC) or to an enhancement of general fluid intelligence (the alleged domain-specific transfer of the CTC) (for a further discussion of this problem, see Hager, Hasselhorn, & Hüber, 1995).

The Issue of the Comparison Groups Used

Scriven's (1967, 1991) distinction between non-comparative and comparative evaluations is useful to outline the second problem we are faced with in most of the internal evaluations of the CTC. A non-comparative evaluation scrutinizes the program's effectiveness as such. That is, the program is compared either to an untrained comparison group (waiting group) or to a trained control group whose children are trained with another program that has objectives quite different from those of the program to be evaluated (see below). As the CTC focuses on inductive reasoning, a proper control program may aim at fostering other kinds of abilities, for example children's social competence or their phonological awareness.

However, the conditions of its implementation must be very similar to the conditions of the CTC: the control program must be of equal length, it must be administered to single children or to small groups of children in the same way as is chosen for the CTC, it must be equally attractive, equally demanding, and so on. In most internal evaluations of the CTC, the effectiveness of the program has been evaluated against the performance of a no-treatment or waiting group, even if a second trained group was incorporated into the design. No-treatment groups have also been used in (at least) two external evaluations (Beck, Lüttmann, & Rogalla, 1993; Beck, Lübking, & Meier, 1995). From our point of view, this strategy of comparison has several weaknesses, since the groups differ not only with respect to the program to be evaluated, but also with respect to the important fact that a special treatment was given to the children, who usually had been taken out of their regular school or kindergarten activities to take part in the program (see Hager, 1995; Hager & Hasselhorn, 1995b; and Sternberg & Bhana, 1986, for further discussion of this issue). All evaluations using either a no-treatment group or a control group are non-comparative (Hager, 1995; Scriven, 1991); they are directed at assessing the program's effectiveness, that is, whether it is effective or not. In comparative evaluations (Hager, 1995; Scriven, 1991), the CTC is compared to a second program which is also claimed to enhance inductive reasoning. This rival or competitive program may use different tasks, problems, strategies, and/or instructional methods; despite these differences, it is most important that both programs are administered in the way the program authors outline in their manuals. Comparative evaluations aim at clarifying which of two programs with the same goals is the better one.

Thus, in the present meta-evaluation the studies are classified according to the type of evaluation and to the criterion variables used. Special attention is given to the goals of the investigations and their hypotheses, to the comparison groups (no-treatment group, control training, or rival training), to the characteristics of the trained children, and to the settings of the training (e.g., individual training or small groups). Moreover, we address the questions of the domain specificity of the effects and of their persistence or duration. The theoretical background for this particular kind of meta-evaluation has been outlined in Hager (1996) and in Hager, Elsner, & Hübner (1995).

Global Evaluations of the Cognitive Training for Children

In autumn 1996, 17 studies evaluating the effectiveness of the original German CTC version for younger children were available, and they are considered in this meta-evaluation. The main characteristics of all 17 studies are presented in Table 1. We do not consider the studies by Tomic (1995) and by Tomic & Klauer (1996) in our meta-evaluation, since in those studies either a shortened Dutch version of the program was administered (Tomic, 1995; Tomic & Klauer, 1996, study 1) or several combinations of the CTC with a perceptual program were investigated (Tomic & Klauer, 1996, study 2). In addition to Raven's Progressive Matrices, two types of intelligence tests were mainly used in the considered studies: the Kognitiver Fähigkeitstest (KFT) and the Grundintelligenztest of the CFT series.

Table 1. Summary and Short Description of the 17 Published Evaluation Studies

(a) Non-comparative evaluations of the Cognitive Training for Children (CTC)

Bornemann (1988a); in Klauer (1989a). Subjects: N = 27; 5–6 yrs. CTC: n = 14; 15 sess. (20 min); pairs of children. Control training: —. No-treatment group: n = 13; preschool activities. Criterion measures: KFT-K (ST 1–4).

Bornemann (1988b); in Klauer (1989a). Subjects: N = 20; 5–6 yrs. CTC: n = 10; 10 sess. (20 min); pairs of children. Control training: —. No-treatment group: n = 10; preschool activities. Criterion measures: KFT-K (ST 2 and 3), CPM.

Beck, Lüttmann, & Rogalla (1993). Subjects: N = 140; 5–6;9 yrs. CTC: n = 72; 10 sess. (20 min); individual training. Control training: —. No-treatment group: n = 68; preschool activities. Criterion measures: KFT-K (ST 1–4), HAWIVA.

Alizadeh, Becker, & Esser (1990); in Klauer (1991). Subjects: N = 50; 4;4–6;8 yrs. CTC: n = 25; 10 sess. (20 min); IQ ≤ 115: n = 12; IQ > 115: n = 13; individual training. Control training: —. No-treatment group: n = 25; preschool activities; IQ ≤ 115: n = 15, IQ > 115: n = 10. Criterion measures: CPM.

In Klauer (1992b) and in Klauer & Phye (1994). Subjects: N = 16; 4;4–6;8 yrs. CTC: n = 8; 15 sess. (15 to 20 min); IQ > 115; sample of 8 parallelized pairs. Control training: —. No-treatment group: n = 8; preschool activities. Criterion measures: CPM.

Bornemann (1992); in Klauer & Phye (1994). Subjects: N = 279; 6–7 yrs.; first grade. CTC: n = 139; 10 sessions; "small groups"; size: ?. Control training: —. No-treatment group: n = 140; classroom activities. Criterion measures: CPM, CFT 1, vocabulary test.

Beck, Lübking, & Meier (1995), study 2. Subjects: N = 60; 6;8–9;5 yrs.; first grade; Turkish children. CTC: n = 30; 10 sess. (20 min); individual training. Control training: —. No-treatment group: n = 30; classroom activities. Criterion measures: CFT 1.

Johnen (1988); in Klauer (1989a). Subjects: N = 29; 5–6 yrs. CTC: n = 10; 10 sess. (20 min); individual training. Control training: n = 10; cognitive and language training. No-treatment group: n = 9; preschool activities. Criterion measures: KFT-K (ST 2 and 3), CPM.

Kolmsee (1989); in Klauer (1991). Subjects: N = 30; 6–7 yrs.; first grade. CTC: n = 10; 8 sess. (30 min); pairs of children. Control training: n = 9; training in general problem solving. No-treatment group: n = 10; classroom activities. Criterion measures: KFT 1–3.

Hager & Hasselhorn (1993a). Subjects: N = 32; 6;9–7;8 yrs.; deferred from school for one year. CTC: n = 16; 10 sess. (20 min); individual training. Control training: n = 16; parts of a metalinguistic training. No-treatment group: —. Criterion measures: CFT 1 (ST 1–5).

Ziesemer (1989); in Klauer (1989a). Subjects: N = 30; 6–7 yrs.; first grade. CTC: n = 10; 10 sess. (20 min); pairs of children. Control training: n = 10; 8 sess. (30 min); pairs of children; training in general problem solving. No-treatment group: n = 10; classroom activities. Criterion measures: pre: CFT 1, post: CFT 2.

Angerhoefer, Kullik, & Masendorf (1992). Subjects: N = 40; 12;4–15;3 yrs.; mentally retarded. CTC: n = 10; 18 sess. (20 min). Control trainings: (1) n = 10; multiplication training; (2) n = 10; spatial training. No-treatment group: n = 10; classroom activities. Criterion measures: CFT 2; multiplication skills test.

Masendorf (1994). Subjects: N = 30; 11–13 yrs.; mentally retarded. CTC: n = 10; 12 sess. (20 min); small groups; size: ?. Control training: n = 10; spatial training. No-treatment group: n = 10; classroom activities. Criterion measures: pre: CMM, post: CFT 2; spatial skills test.

(b) Evaluations of components of the Cognitive Training for Children (CTC)

Bornemann (1988a); in Klauer (1989a) (component "tasks"). Subjects: N = 33; 5–6 yrs. CTC: n = 11; 10 sess. (20 min); pairs of children; analytic–systematic procedure and inductive tasks. Control training: n = 11; 10 sess. (20 min); analytic–systematic procedure and non-inductive tasks. No-treatment group: n = 11; preschool activities. Criterion measures: KFT-K (ST 2 and 3), CPM.

Windgasse-Fischer (1991); in Klauer (1991) (component "strategy"). Subjects: N = 45; second grade. CTC: n = 15; 10 sess. (20 min); small groups (size: ?); top-down strategy (training regularly administered). Control training: n = 15; 10 sess. (10 min); small groups; bottom-up strategy (tasks presented with minimal assistance). No-treatment group: n = 15; classroom activities. Criterion measures: CFT 2.

(c) Comparative evaluations of the Cognitive Training for Children (CTC) with rival or competitive cognitive programs

Hager & Hasselhorn (1993b). Subjects: N = 30; average 6;6 yrs.; deferred from school for one year. CTC: n = 15; 10 sess. (20 min); individual training. (1) Rival program: n = 15; selected Frostig tasks (possibly rival to the CTC with respect to perceptual problems); 10 sess. (20 min); individual training. Criterion measures: CFT 1, FEW (ST 1–4).

Hasselhorn & Hager (1995). Subjects: N = 48; 5;11–7;5 yrs.; deferred from school for one year. CTC: n = 16; 10 sess. (20 min); individual training. (1) Rival program: n = 16; DenkMit; (2) control program: n = 16; selected Frostig tasks; 10 sess. (20 min); individual training. Criterion measures: CFT 1, FEW (ST 1–4).

Hager & Hübner (1998). Subjects: N = 52; 5;10–8;3 yrs.; deferred from school for one year. CTC: n = 16; 10 sess. (20 min); individual training. (1) Rival program: n = 17; DenkMit; (2) control program: n = 17; memory training (selected tasks); 10 sess. (20 min); individual training including supervision. Criterion measures: subtests of CFT 1, POD.

Notes: The respective planned comparisons using ANCOVA have been computed by Hager, Elsner, & Hübner (1995); sess.: number of training sessions; (min): approximate duration of each training session; criterion measures: KFT-K, KFT 1–3: German preschool and primary school version of the Cognitive Abilities Test; HAWIVA: German version of the Wechsler Preschool and Primary Scale of Intelligence; CPM: Coloured Progressive Matrices; CFT: Culture Fair Intelligence Test; CMM: Columbia Mental Maturity Scale; FEW: German version of the Developmental Test of Visual Perception; ST: subtest.

Table 2. Results of Non-Comparative Evaluations of the Cognitive Training for Children using No-Treatment or Waiting Groups
Comparisons of the pretest–posttest changes.

KFT-K
Bornemann (1988a); in Klauer (1989a): ST 2 and 3 (inductive): s, dI = 1.25; ST 1–4: training supported all subtests equally.
Bornemann (1988b); in Klauer (1989a): ST 2 and 3 (inductive): ns.
Beck, Lüttmann, & Rogalla (1993): ST 2 and 3 (inductive): for both ST ns; ST 1: s, dI = 0.38; ST 4: ns.

CPM
Bornemann (1988b): s, dI = 0.54.
Alizadeh et al. (1990); in Klauer (1991): IQ ≤ 115: ns; IQ > 115: s, dI = 1.02.
In Klauer (1992b), in Klauer & Phye (1994): IQ > 115: s, dI = 1.47 (sample of 16 children out of 23).
Bornemann (1992); in Klauer & Phye (1994): s, dI = 0.80.

CFT 1
Bornemann (1992): ST 1–5: s, dI = 0.70; ST 1 and 2: —; ST 3–5: —.
Beck, Lübking, & Meier (1995), study 2: ST 1–5: s, dI = 0.55; ST 1 and 2: ns; ST 3–5: s, dI = 0.77.

Notes: KFT-K: German preschool version of the Cognitive Abilities Test; CPM: Coloured Progressive Matrices; CFT: Culture Fair Intelligence Test (ST 1 and 2 perceptual tests, ST 3–5 called inductive by Klauer); ST: subtest; s/ns: comparison between groups was significant/not significant.

The KFT-K and the KFT 1–3 by Heller & Geisler (1983a, 1983b) are German versions of the Cognitive Abilities Test (CAT) by Thorndike & Hagen (1971) for preschool children and for children of grades 1 to 3. Both of these tests consist of the same four subtests. According to Klauer (1990a, p. 153), subtests 2 (recognizing relations) and 3 (drawing inferences) represent inductive problems, whereas the two remaining subtests (subtest 1: language comprehension; subtest 4: mathematical reasoning) refer to non-inductive cognitive activities that are not within the range of the effects expected for the CTC. The German version of the Culture Fair Test by Cattell, provided as the CFT 1 by Weiß & Osterland (1980) for younger children, is a test of general fluid intelligence consisting of five subtests grouped into two subsets. While subtests 1 and 2 are associated with perceptual and psychomotor abilities, subtests 3, 4, and 5, among others, refer to inductive reasoning abilities and thus seem appropriate to assess the attainment of the primary goal of the CTC. The complete CFT 1 can be used as a test of fluid intelligence (Klauer & Phye, 1994, p. 72). The structure of the Culture Fair Test for older children and adolescents (8- to 18-year-olds; CFT 2 by Weiß, 1974) differs from that of the CFT 1, since all four subtests of the CFT 2 exclusively contain inductive tasks according to Klauer (1992b, p. 65).

Non-Comparative Evaluations using No-Treatment or Waiting Groups Only

The results of non-comparative evaluations using a no-treatment or a waiting group are summarized in Table 2. The table mostly contains technical details concerning some main features of the studies and the effect sizes of statistically significant advantages of the CTC group.

Effects on the KFT-K

In three studies with the KFT-K, a group of children trained with the CTC was compared to a no-treatment group. To answer the question "How does the paradigmatic training in its published form function with preschool children?", Bornemann (1988a) (in Cranen, 1989, pp. 72–73; cf. Klauer, 1990b, and Klauer & Phye, 1994, pp. 63–64) trained twenty-seven 5- to 6-year-old preschool children. The 14 children, who had been trained in pairs with the CTC over 15 sessions of about 20 minutes, significantly outperformed the 13 untrained children, who participated in regular preschool activities, on the inductive subtests 2 and 3 of the KFT-K. This was also the case in the subsequent study by Bornemann (1988b) (in Cranen, 1989, p. 73; Klauer & Phye, 1994, p. 64), a "further evaluation of the published training program", in which ten 5- to 6-year-old children were trained with the CTC in pairs over 10 sessions of about 20 minutes, whereas 10 children participated in regular preschool activities.

Table 3. Results of Non-Comparative Evaluations of the Cognitive Training for Children using Control Trainings (Including some Comparisons with No-Treatment Groups)
Comparisons of pretest–posttest changes.

KFT-K
Johnen (1988); in Klauer (1989a): ST 2 (inductive) and ST 3 (inductive): all comparisons between groups ns.

KFT 1–3
Kolmsee (1989): ST 2 (inductive) and ST 3 (inductive): all comparisons between groups ns.

CPM
Johnen (1988); in Klauer (1989a): CTC–NT: s, dI = 1.73; CT–NT: s, dI = 1.13; CTC–CT: s, dI = 0.95.

CFT 1
Hager & Hasselhorn (1993a): ST 1–5: CTC–CT: ns; ST 3–5: ns.

CFT 2
Ziesemer (1989); in Klauer (1989a): CTC–NT: s, dI = 0.79; CT–NT: ns; CTC–CT: s, dI = 0.66.
Angerhoefer et al. (1992): all comparisons between groups ns.
Masendorf (1994): CTC–NT: s, dI = 1.05; CT–NT: ns; CTC–CT: s, dI = 0.92.

Notes: KFT-K/KFT 1–3: German preschool/primary school version of the Cognitive Abilities Test; ST: subtest; CPM: Coloured Progressive Matrices; CFT 1: Culture Fair Intelligence Test (ST 3 to 5 called inductive by Klauer, 1989b); CFT 2: Culture Fair Intelligence Test for older children and juveniles; CTC: experimental group trained with the Cognitive Training for Children; NT: no-treatment group; CT: experimental group that joined a control training; s/ns: comparison between groups was significant/not significant; dI: effect size calculated on the between-groups difference of posttest–pretest differences.

In 1993, Beck, Lüttmann and Rogalla conducted an external evaluation of the effectiveness of the CTC. They trained 72 5- to 7-year-olds individually over 10 sessions of about 20 minutes and compared them to 68 untrained children. The comparisons were nonsignificant for subtests 2 and 3 of the KFT-K.

Effects on Raven's CPM

The 1990 study by Alizadeh, Becker, & Esser (1990) on the effects of the CTC among gifted children has been published in Klauer (1991, p. 61) and in Klauer & Phye (1994, p. 69) in different ways. In 1991, Klauer reports on fifty 4- to 7-year-old children who were allocated to two training groups (10 sessions of about 20 minutes) and two corresponding untrained groups with different intelligence quotients (IQ ≤ 115 vs. IQ > 115). An ANCOVA showed significant gains in CPM performance for the 25 trained children as compared to the 25 untrained ones. Single post hoc comparisons between the corresponding IQ groups showed that only the 13 trained highly able children outperformed the corresponding 10 untrained ones, whereas the training was not effective for the 12 normally intelligent children. Only the results of 16 gifted children (IQ > 115) are reported in Klauer & Phye (1994). Here, the 8 children of the CTC group outperformed the 8 untrained children on the CPM. The effect size for the gifted group is larger in the subsample reported in 1994 (dc = 1.24) than in the subsample reported in 1991 (dc = 0.95).

Bornemann (1992) (in Klauer & Phye, 1994, pp. 72–73) trained 139 first-grade children (6 to 7 years old) over 10 sessions in small groups with the CTC. These children were significantly superior to 140 untrained children in posttest CPM performance. This was also the case in the study by Bornemann (1988b) described above.

Effects on the CFT 1 and its Subtests

The complete CFT 1 has been used in the non-comparative evaluation studies by Bornemann (1992) and by Lübking (study 2 in Beck et al., 1995, pp. 298–302; study 1 reports the results of Beck et al., 1993). Lübking was interested in whether the CTC could help groups of Turkish children living in Germany, who had been diagnosed as suffering from cognitive deficits, to compensate for the associated problems (Beck et al., 1995, p. 301). She trained 30 Turkish children, who attended first grade in German schools, individually over 10 sessions of about 20 minutes. The children of the CTC group outperformed the 30 untrained children in overall CFT 1 performance (all five subtests). In accordance with its domain-specific goals, the CTC was not superior with regard to subtests 1 and 2, but it was with regard to the inductive subtests 3, 4 and 5. As a more detailed analysis showed, this effect was due to a significant difference between the groups in subtest 5 (Matrices), whereas all other differences remained insignificant. The study by Bornemann (1992) will be addressed below, since its subtests have not been analyzed separately.

Non-Comparative Evaluations using a Control Program

As far as the internal evaluations are concerned, we re-computed the comparisons between the CTC and the control training on the basis of the original raw data whenever the relevant comparisons could not be found in the literature (cf. Hager et al., 1995, for the details).

The results of the respective studies are summarized in Table 3 (see Table 1 for the description of the realized control trainings).

Effects on the KFT and the CPM

Johnen (1988) (in Cranen, 1989, p. 75; Klauer, 1990a; Klauer & Phye, 1994, pp. 66–67) administered the CTC in individual sessions with kindergarten children. Ten children were trained with the CTC over 10 sessions of about 20 minutes, 10 children joined a cognition and language training with commercially available learning material, and 10 untrained children took part in regular kindergarten activities. The CTC did not outperform the control training in this study. In the same study, children trained with the CTC were superior to the control training group, who in turn outperformed the untrained children, with respect to the CPM.

Table 4. Domain Specificity and Transfer of the Cognitive Training for Children (CTC) with Respect to "Non-Inductive" Criterion Measures
Comparisons of pretest–posttest changes.

(a) Non-comparative evaluations using no-treatment groups
KFT-K ST 1–4 (contains inductive ST)
Bornemann (1988a); in Klauer (1989a): overall significant benefits in all four ST, including two non-inductive ones, dI = 1.02 (cf. Table 2).
Beck, Lüttmann, & Rogalla (1993): significant benefits only in the non-inductive ST 1, dI = 0.38 (cf. Table 2).
HAWIVA
Beck, Lüttmann, & Rogalla (1993): ns.
Vocabulary test
Bornemann (1992); in Klauer & Phye (1994): ns.
CFT 1 ST 1–5 (contains inductive ST)
Bornemann (1992): s, dI = 0.70 (see also Table 2).
Beck, Lübking, & Meier (1995), study 2: s, dI = 0.55 (see also Table 2).

(b) Non-comparative evaluations against control programs
Multiplication skills test
Angerhoefer et al. (1992): all comparisons between groups ns.
Spatial skills test
Masendorf (1994): CTC–NT: s, dI = 0.90; CT–NT: s, dI = 0.84; CTC–CT: ns.
CFT 1 ST 1–5 (contains inductive ST)
Hager & Hasselhorn (1993a): CTC–CT: ns (see also Table 3).

(c) Comparative evaluations with rival or competitive programs
Perceptual tests: FEW, POD
Hager & Hasselhorn (1993b) (FEW, subtests 1 to 4): both CTC and FT had significant pretest–posttest changes, but comparison between groups ns.
Hasselhorn & Hager (1995) (FEW, subtests 1 to 4): CTC, DM and FT had significant pretest–posttest gains, but comparisons between groups ns.
Hager & Hübner (1998) (POD): CTC, DM and MT had significant pretest–posttest gains; comparison between the reasoning programs s (dI = 0.5) in favor of the CTC, counter to expectations.

Notes: KFT-K: German preschool version of the Cognitive Abilities Test; HAWIVA: German version of the Wechsler Preschool and Primary Scale of Intelligence; CFT 1: Culture Fair Test, consisting of perceptual and reasoning (including inductive) subtests; FEW: German version of the Developmental Test of Visual Perception; POD: perceptual test Prüfung optischer Differenzierungsleistungen; CTC: experimental group trained with the Cognitive Training for Children; FT: experimental group trained with selected Frostig tasks; DM: experimental group trained with the program DenkMit; see the foregoing tables for further abbreviations.

Kolmsee (1989) (in Klauer, 1991, p. 64) trained ten 6- to 7-year-olds pairwise with the CTC over 8 sessions of about 30 minutes. As a control program, she used a training in general problem solving (a training of metacognitive skills; Lauth, 1988), in which children learn to use metacognitive strategies. Nine children took part in this control training, and 10 untrained children joined regular classroom activities. An ANCOVA showed no significant differences between the three groups, neither on the inductive nor on the non-inductive subtests of the KFT 1–3.

Effects on the CFT 1 and its Subtests

Hager & Hasselhorn (1993a) were interested in the effectiveness of the CTC with regard to inductive reasoning as well as general fluid intelligence. They trained sixteen 7- to 8-year-olds who had been deferred from school for one year and who attended special remedial school kindergartens. In the control group, 16 children from the same institutions received a metalinguistic training which is similar in form to the inductive reasoning program, but which aims at increasing children's phonological awareness (control program) (see Schneider, Visé, Reimers, & Blaesser, 1994). Unexpectedly, both treatments produced gains in CFT 1 performance; the CTC turned out not to be superior to the control training. Also unexpectedly, the CTC showed its largest pre–post gains on the non-inductive (perceptual) subtests 1 and 2, which led Hager and Hasselhorn to interpret the effects of the CTC on tests of general intelligence as being caused by improvements in processes of visual perception rather than inductive thinking (this presumption was subsequently called the perception hypothesis).

Effects on the CFT 2

Like Kolmsee, Ziesemer (1989) (reported in Cranen, 1989, p. 76, in Klauer, 1990a, and in Klauer & Phye, 1994, p. 68) compared the CTC with Lauth's (1988) program. Since Lauth's program aims at the acquisition of a more general problem-solving strategy and at better metacognitive control and coping with failures for retarded children, Klauer (1992b) interpreted this program as a control training with respect to the CTC. Although similar aims of the two programs have been stated, and the training tasks in both programs require finding similarities and dissimilarities, Lauth's program does not teach inductive reasoning as deriving and applying rules. For this reason, we classified Lauth's program as a control training with regard to the CTC. Ziesemer trained ten 6- to 7-year-old children over 10 sessions of about 20 minutes with the CTC, while 10 children took part in the control training over 8 sessions of about 30 minutes, and 10 untrained children joined formal school instruction. She used the CFT 1 as pretest and the CFT 2 as posttest in order to avoid ceiling effects. The children of the CTC group outperformed the untrained children and the children of the control training group.

Comparative Evaluations with Potential Rival or Competitive Programs

Hager & Hasselhorn (1993b) and Hasselhorn & Hager (1995, 1996) compared the effects of the CTC to the effects of the German version of the Frostig Developmental Program of Visual Perception (Frostig, Horne, & Miller, 1972; German version by Reinartz & Reinartz, 1979).

The Frostig program consists of 320 tasks designed to foster five separate areas of visual perception. The authors chose approximately 40 tasks from areas 1 to 4; the tasks of the area "perception of spatial relationships" were omitted because of their similarity to some CTC tasks. Since the Frostig program was developed to improve visual perception, it may be interpreted as a control training for the inductive reasoning program CTC. On the other hand, Klauer's definition of inductive reasoning as detecting "regularities . . . by recognizing similarities and dissimilarities of attributes or relations with academic content" (Klauer & Phye, 1994, p. 40) includes processes of visual perception as well. Thus, the Frostig tasks can also be interpreted as a rival training in evaluation studies of the CTC (instead of a control program, as had been intended). In Hager and Hasselhorn's studies, a shortened form of the Frostig program was used, so it constituted a quasi-rival training. This interpretation, however, has the important consequence that the differences between the programs may be small, irrespective of whether the CTC is effective or not.

Hager & Hasselhorn (1993b) were interested in clarifying whether the effects of the CTC are due to improvements of processes of inductive reasoning in the sense of the program author's definition or to improvements of visual perception (see below). Subjects in their study were 30 children (average age 6;6 years) who had been deferred from school for one year and who attended special remedial school kindergartens. Fifteen children took part in the CTC, and the remaining 15 attended the Frostig program. Inductive reasoning was assessed by the sum of subtests 3 to 5 of the CFT 1 (Weiß & Osterland, 1980); perceptual speed was measured by the sum of subtests 1 and 2 of the CFT 1. The children of the CTC group were not superior to the children of the Frostig group on subtests 3, 4 and 5 of the CFT 1. These data lend support to the interpretation of the Frostig program as a potential rival to the CTC, although it does not aim at inductive processes like detecting regularities. As the two programs also showed the same effects on performance in subtests 1 and 2, the present authors' perception hypothesis (see above) was supported by these studies.

In 1995, Hasselhorn and Hager additionally applied a second rival program that was developed to increase inductive reasoning as well as visual perception (DenkMit by Sydow & Meincke, 1994). Klauer's CTC and this competitive reasoning training were compared to the above-described selection of tasks from the Frostig program. The authors predicted that the two reasoning programs would have comparable effects on inductive reasoning, but would be superior to the selected Frostig tasks. Forty-eight 6- to 7-year-old children, again from special remedial school kindergartens, took part in this study, constituting three training groups with n = 16 subjects each. All programs had similar effects on the perceptual speed subtests 1 and 2 as well as on the inductive reasoning subtests 3, 4 and 5 of the CFT 1. The reasoning programs neither differed with regard to their effectiveness nor outperformed the shortened Frostig program.

A further comparative evaluation of the CTC and the DenkMit was conducted by Hager & Hübner (1998), who used a memory training, which did not contain any inductive tasks, as a control program. Again, the children, with an average age of 6;10 years (range 5;10 to 8;3 years), attended special school kindergartens. Hager and Hübner applied the inductive subtests 3 to 5 of the CFT 1. The two reasoning programs turned out not to be effective as compared to the control program (memory training), nor did one reasoning program outperform the other.

There is at least one further comparative evaluation of the CTC and the DenkMit (Schmude & Sydow, 1994), but despite several requests we were not able to obtain the data from the authors.

Domain Specificity and Transfer Effects of the CTC

It is widely agreed that a training program, as opposed to a pure test-coaching program, should produce some transfer to tasks and problems dissimilar to the program's own tasks and problems, and should also have effects outside the intervention context (Belmont & Butterfield, 1977; Brown, Bransford, Ferrara, & Campione, 1983; Hasselhorn, 1995; Klauer, 1989b, 1992a; Sternberg, 1983). In order to assess transfer, the tasks and problems of the criterion variables must differ from those of the program; if this is not the case, it is impossible to decide whether better performance at posttest is more than a pure coaching effect (see Anastasi, 1981; Hager & Hasselhorn, 1997). The identification of the program author's opinion concerning the domain specificity of the effects of the CTC is not an easy task. However, it seems convenient to distinguish between two kinds of domains for which transfer effects have been evaluated hitherto: the domain of general fluid intelligence (gf) and the domain of crystallized intelligence (gc). Domain-specific effects with regard to gf mean that there should be effects on tasks involving general fluid intelligence, with these effects being smaller the less inductive reasoning is necessary to solve the tasks at hand. In contrast, there should be no effects on tasks representing gc. To evaluate this hypothesis concerning the domain specificity of the CTC, the evaluators applied some tests of intelligence, or subtests of such tests, associated with gc and some subtests associated with gf which are not associated with inductive reasoning in the sense of deriving and applying a rule. Table 4 presents an overview of the results of the existing non-comparative as well as comparative evaluations concerning the domain specificity hypothesis.

Non-Comparative Transfer Evaluations using a Waiting Group Only

Effects on the KFT-K

Bornemann (1988a) used all four subtests of the KFT-K and also reported the results for the non-inductive subtests 1 and 4. She found that "training [CTC] supported all subtests equally, i.e., there was no indication of a change in profile due to training" (Klauer & Phye, 1994, pp. 63–64). Although one might discuss this result in favor of the CTC, it disagrees with the claim of its domain-specific effectiveness.

In the study by Beck et al. (1993), the CTC was not effective with respect to the KFT-K at all, apart from a slight effect on subtest 1 (language comprehension), which, however, is non-inductive. This result again disagrees with the program author's claims concerning domain-specific transfer effects of the CTC.

Table 5. Results of Follow-Up Studies (Duration of Effects of the Cognitive Training for Children)
Entries give the follow-up period and the comparisons of posttest–follow-up changes.

(a) Non-comparative evaluations using no-treatment or waiting groups
CPM
Bornemann (1992); in Klauer & Phye (1994): 6 months; ns.
CFT 1 ST 1–5
Bornemann (1992): 6 months; s, dI = 0.42.

(b) Non-comparative evaluations using control programs
KFT-K ST 2 and 3 (inductive)
Johnen (1988); in Klauer (1990a): 4 months; ns.
CPM
Johnen (1988); in Klauer (1990a): 4 months; CTC–NT: s, dc = 0.99; CT–NT: ns; CTC–CT: ? (comparison not available).

(c) Comparative evaluations with competitive programs
CFT 1 (ST 1–5 / ST 1 and 2 / ST 3–5)
Hager & Hasselhorn (1993b): 5 months (mean values were corrected for the expected rise in performance according to the norms of the test); CTC: ST 1–5: ns; ST 1 and 2: ns; ST 3–5: s, dI = 0.71; follow-up values were not lower than posttest values.
Hasselhorn & Hager (1996): 6 months; posttest–follow-up gains: CTC: ST 1–5: s; ST 1 and 2: s; ST 3–5: ns; DM: ns on all three; FT: ST 1–5: s; ST 1 and 2: ns; ST 3–5: s; all comparisons between groups remained ns.
FEW ST 1–4
Hager & Hasselhorn (1993b): 5 months (see above for the correction of the mean values); for both CTC and FT, follow-up values were significantly lower than posttest values.
Hasselhorn & Hager (1996): 6 months; CTC, DM and FT had significant posttest–follow-up gains, but all comparisons between groups were insignificant.

(d) Evaluations of components of the Cognitive Training for Children (CTC)
KFT-K ST 2 and 3 (inductive)
Bornemann (1988c); see Klauer (1990a): 7 months; CTC–NT: ns; CTC–CT: ? (comparison not available).
CPM
Bornemann (1988c); see Klauer (1990a): 7 months; CTC–NT: s, dc = 0.54; CT–NT: ns; CTC–CT: ? (comparison not available).

Notes: CPM: Coloured Progressive Matrices; CFT: Culture Fair Intelligence Test; KFT-K: German preschool version of the Cognitive Abilities Test; FEW: German version of the Developmental Test of Visual Perception; ST: subtest; s/ns: comparison between groups was significant/not significant; dc: effect size corrected for differences in the pretest measures as given by the authors; d = ?: effect size could not be computed; CTC: experimental group trained with the Cognitive Training for Children; NT: no-treatment group; CT: experimental group joining a control training; DM: experimental group trained with the reasoning program DenkMit; FT: experimental group trained with selected Frostig tasks.

Effects on the HAWIVA

The Hannover Wechsler Intelligenztest für das Vorschulalter (HAWIVA) (Eggert & Schuck, 1975) is the German version of the Wechsler Preschool and Primary Scale of Intelligence (WPPSI) (Wechsler, 1967), which is based on a conception of intelligence that does not differentiate between fluid and crystallized (non-inductive) intelligence, but proposes that overall intelligence is composed of a verbal and a performance part. As inductive problem-solving processes are not explicitly required in the HAWIVA, Beck et al. (1993) expected no effects of the CTC on this test. In accordance with this expectation, the CTC group did not outperform the untrained group on the HAWIVA in their study. Surprisingly, the untrained group showed better posttest results than the trained children on the subtests Geometric Design and Animal House. This study could be interpreted as evidence of the domain-specific effectiveness of the CTC, because the program did not produce transfer effects on this measure of a different conception of intelligence, one which does not require inductive reasoning processes. On the other hand, the results show that the CTC did not enhance intelligence in general: the children were not able to use the trained strategies successfully in the HAWIVA. Moreover, the program did not show any effects on variables that are inductive according to the program author's definition. This makes a clear interpretation of this study particularly difficult.

Effects on a Vocabulary Test

In the study by Bornemann (1992), a vocabulary test was also administered, which undoubtedly refers to abilities in the realm of gc. The CTC was not expected to influence crystallized intelligence (gc), that is, performance on the vocabulary test. As expected, the comparison between the CTC group and the untrained group was not significant. Although this result is in accordance with the domain specificity hypothesis, it may be a matter of debate whether a vocabulary test constitutes an appropriate measure to show convincingly that no transfer to gc takes place with the CTC. Measures of word fluency and/or of verbal comprehension (see Horn, 1985, p. 288) would seem to be more appropriate and convincing.

Non-Comparative Transfer Evaluations using a Control Program

Angerhoefer et al. (1992; cf. Klauer & Phye, 1994, p. 74) and Masendorf (1994) applied the CTC to mildly mentally retarded adolescents. The authors were interested both in the general effectiveness of the CTC among retarded persons and in its domain-specific transfer effects with regard to inductive reasoning. Since for retarded persons individual advancement is of particular interest, the data were statistically analyzed by means of configuration frequency analysis (Lienert, 1988). As the results cannot be compared directly to those of the other evaluation studies, Hager et al. (1995, pp. 278–281) conducted planned comparisons on the means adjusted by ANCOVA for both studies.

Effects on a Multiplication Skills Test and on the CFT 2

Angerhoefer et al. (1992) trained 10 mildly retarded adolescents with the CTC over 18 sessions of about 20 minutes and compared this group to two control trainings: 10 adolescents joined a computer-assisted training of multiplication skills, and another 10 retarded adolescents received a computer-assisted training of spatial skills.

Ten untrained subjects took part in regular classroom activities. Angerhoefer et al. expected the CTC to have effects only on the CFT 2 and not on the multiplication skills test they also applied, whereas the multiplication training ought to have effects on the latter test, but not on the CFT 2; the spatial training should not have effects on either test. The authors only compared the CTC group to the multiplication training group. Configuration frequency analysis revealed that six adolescents of the CTC group showed gains on the CFT 2, but not on the additionally administered multiplication skills test, and the authors interpreted this result as corroborating the claim of domain-specific effectiveness of the CTC with regard to inductive reasoning. Only two subjects of the control training group showed gains on the CFT 2 and no gains on the multiplication skills test, so the authors assumed the multiplication training not to be effective on the CFT 2. In the reanalysis of this study by Hager et al. (1995, pp. 278–279), all comparisons between the groups were non-significant by ANCOVA: neither did the CTC have any effects on the CFT 2, nor did the multiplication skills training have any effects on the multiplication test.

Effects on a Spatial Skills Test

In a further study, Masendorf (1994) trained 10 adolescents over 12 sessions of about 20 minutes in small groups with a computer version of the CTC. Ten subjects took part in a computer-assisted training of spatial skills as a control training; 10 further subjects remained untrained. At pretest, the Columbia Mental Maturity Scale (Eggert & Schuck, 1971) and the spatial skills test were administered; the CFT 2 as well as the spatial skills test were given at posttest. According to a configuration frequency analysis on the residual scores, the CTC improved CFT 2 performance whereas the spatial training did not. The re-analysis by Hager et al. (1995, pp. 278–279) using ANCOVA showed both the CTC group and the spatial skills group to be significantly superior to the untrained group in spatial skills test performance. In addition, the CTC was significantly superior to the spatial skills training and to the untrained group in CFT 2 performance. Thus, with regard to transfer to spatial skills, the CTC was as effective as the training of spatial skills. These results again conflict with the program author's claims concerning the domain-specific effectiveness of the CTC.

Comparative Transfer Evaluations with a Potential Rival Program

Effects on the FEW

In their studies, Hager & Hasselhorn (1993b) and Hasselhorn & Hager (1995) also applied the German version of the Frostig Developmental Test of Visual Perception by Frostig, Maslow, Lefever, & Whittlesey (1964), provided in German as the FEW by Lockowandt (1993), to assess the accuracy of visual perception. As noted above, tasks of perception of spatial relationships were not trained, and so subtest 5 of the FEW was not applied. Since the test items of the FEW and the training tasks of the Frostig program are largely equivalent, the authors expected the Frostig program to enhance FEW performance more than the CTC. Nevertheless, under the perception hypothesis one may expect transfer effects of the CTC on the FEW, but they should be smaller than the effects of the shortened Frostig program.


Comparative Transfer Evaluations with a Potential Rival Program

Effects on the FEW

In their studies, Hager & Hasselhorn (1993b) and Hasselhorn & Hager (1995) also applied the Frostig Developmental Test of Visual Perception by Frostig, Maslow, Lefever, & Whittlesey (1964), available in German as the FEW (Lockowandt, 1993), to assess accuracy of visual perception. As noted above, tasks of perception of spatial relationships were not trained, and so subtest 5 of the FEW was not applied. Since the test items of the FEW and the training tasks of the Frostig program are largely equivalent, the authors expected the Frostig program to enhance FEW performance more than the CTC. Nevertheless, under the perception hypothesis, one may also expect transfer effects of the CTC on the FEW, although they should be smaller than the effects of the shortened Frostig program. In Hager & Hasselhorn (1993b), both the CTC group and the Frostig group had significant gains in FEW performance, but the comparison between the groups was not significant, indicating that the two programs did not differ with respect to perceptual accuracy. In the study by Hasselhorn & Hager (1995), the CTC group as well as the group trained with the program DenkMit by Sydow & Meincke (1994) showed significant gains in the FEW, whereas the group receiving the selected Frostig tasks unexpectedly did not. With respect to the FEW, the gains of the two reasoning programs did not differ significantly. In this study, the two reasoning programs thus appeared to be more effective than the truncated version of the traditional Frostig program in enhancing accuracy of visual perception as assessed by the FEW.

Effects on the Subtests 1 and 2 of the CFT 1

An alternative measure of visual perception is provided by subtests 1 and 2 of the CFT 1 (see above). In the study by Hager & Hasselhorn (1993b), the CTC and the selected tasks of the Frostig program of visual perception produced the same improvements in performance on subtests 1 and 2, which is in conflict with the claim of domain specificity of the effects of the CTC regarding inductive reasoning.

Effects on the POD

Finally, in the comparative evaluation by Hager & Hübner (1998) mentioned above, the Prüfung optischer Differenzierungsleistungen (POD) by Sauter (1979) was applied as a test of visual perception. Children trained with the CTC outperformed those of the control group and, counter to expectations, they even outperformed the children trained with the DenkMit, although this program has been developed to foster children's perceptual abilities, too. This lent further support to the perception hypothesis of the present authors and does not corroborate Klauer's claims as to the domain-specific effectiveness of the CTC concerning inductive reasoning, although one might argue that increases in visual perception are compatible with the more general claim of domain-specific effects of the CTC on fluid intelligence.

Componential Evaluations

Evaluations of the components of a program are used to clarify which components are indispensable for its effectiveness. Since the CTC consists of 120 tasks, specific demands for each task, and some more general strategies of problem solving, it is interesting to know which of these or other components are most important for its overall effectiveness. In the study by Bornemann (1988c) (in Cranen, 1989, pp. 73–75; Klauer, 1990a; Klauer & Phye, 1994, pp. 65–66), the design permitted testing whether the CTC's specific tasks are an effective component of the program. Eleven 5- to 6-year-old children were trained with the CTC in pairs over 10 sessions of about 20 minutes, 11 untrained children joined regular kindergarten activities, and a further 11 children applied the same analytic–systematic procedure as the CTC group to non-inductive intelligence test problems. These were "likewise intellectual challenging tasks . . . mosaics, mazes, fill-in tasks and drawings that demand a careful proceeding and an analytical inspection in the same amount as our training tasks" (Klauer, 1990a, p. 153; translated by the present authors).


Since the analytic–systematic procedures were very similar in both groups, varying only with regard to the tasks, this study can be interpreted as an evaluation of the program component "tasks". The CTC group was significantly superior to the second training group in performance on the inductive subtests 2 and 3 of the KFT-K (Heller & Geisler, 1983a); the second training was not effective, i.e., the comparison between this group and the no-treatment group remained non-significant. This may be a hint that the inductive tasks are more important for the effectiveness of the program than the analytic–systematic strategies. With regard to the additionally assessed criterion variable CPM (Becker et al., 1980), the children trained with the CTC were superior to the children of the second training group, who in turn outperformed the untrained children. Thus, the CPM results also support the hypothesis that the specific tasks are an effective component of the CTC. In the study by Windgasse-Fischer (1991) (in Klauer, 1991, pp. 65–66; Klauer & Phye, 1994, pp. 69–71), the program component "strategy" was varied in two training groups and compared to a no-treatment group. The question was: "Which contribution of the...[CTC] is due to the training method and which to the training material?" (Klauer, 1991, p. 65). Fifteen second-grade children were trained with the CTC in its original form (top-down strategy) over 10 sessions of about 20 minutes in groups of two or three children: "teaching reasoning by making use of analytical comparisons" (Klauer & Phye, 1994, p. 69). The 15 children of the second training group received the inductive training tasks with minimal assistance over 10 sessions of about 10 minutes in small groups, using a bottom-up strategy. Fifteen untrained children took part in regular classroom activities. Both training groups outperformed the untrained children with regard to the CFT 2 (Weiß, 1974), but did not differ significantly from each other. This is a surprising result if we take into account that the children of the second training group (bottom-up strategy) were trained only half as long as the CTC group. Apparently, the original strategy was not more effective than no strategy. Thus, the importance of the tasks was supported again, whereas some doubts were raised with regard to the utility of the special analytical strategies of the CTC.

Duration of the Effects of the CTC

It is essential for any training program to produce effects that extend far beyond the intervention phase (Belmont & Butterfield, 1977; Hager & Hasselhorn, 1995a; Sternberg, 1983). The fact that any effect will diminish if not retrained should not be taken as a justification for not exploring the duration of effects. Since available theories do not make specific assumptions about the durability of training effects, their duration should be assessed and reported whenever possible. In this article, we restrict our considerations to comparisons between the posttest assessment and the follow-up, although many further comparisons of great theoretical interest are possible (see Hasselhorn, 1995). In 5 of the 17 studies of the present meta-evaluation, a follow-up was conducted a few months after completion of the CTC. The results of these studies are contained in Table 6.
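To illustrate the kind of posttest–follow-up comparison referred to here, the following sketch applies a paired t-test to invented scores of a single training group; the data and the choice of test are assumptions for illustration only and do not correspond to the analyses conducted in the studies reviewed below.

```python
# Illustrative posttest-vs-follow-up comparison for a single training group,
# using a paired t-test on invented scores. A reliable drop from posttest to
# follow-up would suggest that the training effect has diminished; the actual
# studies used different (group-comparative) analyses.
from scipy.stats import ttest_rel

posttest  = [28, 25, 30, 24, 29, 27, 26, 31]
follow_up = [27, 26, 29, 25, 28, 27, 25, 30]  # assessed a few months later

result = ttest_rel(posttest, follow_up)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.3f}")
```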


Long-Term Effects on the KFT-K

In the study by Johnen (1988), the three experimental groups (CTC, a control training of cognition and language, and a no-treatment group) were retested 4 months after the training was completed. In a comparison between posttest and follow-up performance, Johnen found no significant superiority of the CTC on subtests 2 and 3 of the KFT-K (Heller & Geisler, 1983a), which are associated with inductive reasoning. In the componential evaluation study by Bornemann (1988c), subtests 2 and 3 of the KFT-K were administered again to all three groups (CTC, the second training with the analytic–systematic procedure and non-inductive tasks, and the no-treatment group) seven months after the training. The CTC showed no significant long-term effects on performance in subtests 2 and 3 of the KFT-K when compared to the no-treatment group.

Long-Term Effects on the CPM

In three studies, a follow-up assessment with the CPM (Becker et al., 1980) was conducted. The children of the CTC group outperformed the untrained children and the children of a control training group in the studies by Bornemann (1988c) and by Johnen (1988) just mentioned. In the third study, by Bornemann (1992), the trained children did not outperform the untrained children 6 months after the training, the unstandardized difference between the four means (two groups and two measures per group) only being d = 0.1 (cf. Klauer & Phye, 1994, pp. 73–74).

Long-Term Effects on the CFT 1

Hager & Hasselhorn (1993b) conducted a follow-up assessment about 5 months after completion of the training. Regarding the overall CFT 1 score, follow-up performance of the CTC group was not reduced as compared to posttest performance, indicating that the effects did not diminish. An analysis of the single subtests showed that, in accordance with the present authors' perception hypothesis, follow-up performance of the CTC group on subtests 1 and 2 of the CFT 1 did not drop below the posttest values, whereas follow-up performance on the inductive subtests 3 through 5 was smaller than or equal to posttest performance. In another study, Hasselhorn & Hager (1996) retested the children 6 months later. With regard to overall CFT 1 performance, the program DenkMit did not outperform the rival reasoning program CTC. For subtests 1 and 2 as well as for subtests 3 through 5, all comparisons between the three groups turned out to be non-significant. The tailored Frostig program was superior neither to the CTC nor to the DenkMit in the perceptual speed subtests 1 and 2, and the two reasoning programs did not outperform the Frostig program with regard to the inductive subtests 3 through 5. In the study by Bornemann (1992), the overall test score of the CFT 1 was used as a further criterion variable. The trained children still outperformed the untrained group 6 months after the training. Thus, the CTC had long-term effects on the composite CFT 1 score, but because of a lack of relevant information, it cannot be decided which subtests are responsible for this success.


Long-Term Effects on the FEW

In the study by Hager & Hasselhorn (1993b), FEW performance of the CTC group at follow-up had dropped below posttest performance. As the significant effect of the CTC on the FEW obtained immediately after the training had vanished within 5 months, Hager and Hasselhorn argued that the effects in their study were primarily due to coaching or mere practice. The study by Hasselhorn & Hager (1996) gave different results: all three groups showed significant gains from posttest to follow-up, but the comparisons between the three groups were not significant. Thus, although the three programs yielded substantial long-term effects, no advantages for the CTC could be found.

Discussion

Despite the apparent complexity and diversity of the results of the studies considered, a fairly well established conclusion from the present meta-evaluation of the Cognitive Training for Children is that it enhances performance in traditional tests of fluid intelligence. This conclusion is in agreement with the suggestions provided by various internal meta-analyses (see Klauer, 1990b, 1991, 1993, 1994, 1995; Klauer & Phye, 1994) concerning the overall effectiveness of the program. Effects of this kind, however, can be obtained with various other available programs. In addition, it remains an open question whether these effects are due to an increase in children's inductive reasoning competences. That is, important issues about how and why the program works (which is of interest for educational researchers in general) as well as about where and when it works (which is of interest for potential users) remain largely unresolved. In contrast to the internal meta-analyses provided by the program author, the present meta-evaluation provides some related information, especially concerning the how and why of the effects of the CTC.

The two available componential evaluations can provide a first answer to the question why the CTC works. Although more detailed componential analyses are conceivable and desirable, the studies by Bornemann (1988c) and by Windgasse-Fischer (1991) give some empirical evidence for the conclusion that the effectiveness of the CTC rests mainly on the training tasks rather than on the strategies used to instruct the process of comparing attributes and relationships of objects and/or events. That is, the tasks seem to constitute the procedural core component of the CTC that is indispensable for the program's effectiveness. Thus, it seems appropriate to explore whether alternative instructional strategies can be found in order to achieve higher levels of effectiveness with modified versions of the CTC.

Returning to the present version of the CTC (Klauer, 1989a; Klauer & Phye, 1994), there are two issues surrounding the problem of how the program works that deserve further discussion. First of all, although there is no reason to doubt the program author's claim that the CTC enhances children's search for similarities and dissimilarities among objects and events, the present meta-evaluation contributes to the conjecture that the effects of the CTC are not necessarily associated with increases in children's inductive reasoning in the sense of the ability to detect (and apply) rules or to find regularities.


Rather, the alternative interpretation of the effects first mentioned by Hager and Hasselhorn (1993a, 1993b) seems justified. According to this interpretation, training children to look for similarities and dissimilarities among objects, events, and relations results in an increase in the speed and accuracy of their visual perception. This, in turn, leads to better performance on criterion variables that depend on speed and accuracy of visual perception, as is the case with conventional tests of (fluid) intelligence. The present authors' perception hypothesis may also be a better tool to explain the specific patterns of transfer effects demonstrated for the CTC. Since perceptual processes are involved not only in inductive reasoning tasks but also in other test items attributed to fluid intelligence, increases in perceptual abilities should lead to the domain-specific effects summarized in Table 3. Although most of the reported specific effects of the CTC may also be attributed to increases in children's inductive reasoning ability, the available evidence concerning the effects of the CTC on pure measures of perceptual speed or perceptual accuracy (Hager & Hasselhorn, 1993b; Hasselhorn & Hager, 1995; Hager & Hübner, 1998) as well as on spatial performance (Masendorf, 1994) cannot be explained by an enhancement of inductive reasoning without additional assumptions. However, all of the presented results, including Masendorf's findings with regard to the effects of the CTC on spatial skills, can easily be explained by the present authors' perception hypothesis. Thus, our answer to the question of how the CTC works is that it enhances the efficiency and, perhaps, the proficiency of children's visual perception. Additional increases in the ability to detect rules and to find regularities are also possible, although they are neither guaranteed nor convincingly demonstrated empirically by the evaluations of the present version of the CTC for young children.

Some further conclusions, which may be of interest for the potential program user, seem appropriate. First of all, the joint result of all non-comparative evaluations in which the CTC was compared to a no-treatment or waiting group was that applying the CTC is better than doing nothing with the children beyond their regular school lessons. The program enhances most children's performance in tests of fluid intelligence, and there are no hints of any negative side-effects associated with the CTC. However, a much more interesting question from the perspective of a potential user of the CTC is whether the program is more effective than rival cognitive programs for children that may be easier to administer, less expensive, or suitable for intact groups or classes. This question is not as easy to answer as the question of the basic or non-comparative effectiveness of the CTC. The results of the few empirical comparative evaluations give rise to the answer: it depends! For example, it depends on the rival programs taken into consideration. While Johnen (1988) reported an advantage of the CTC as compared to a cognitive and language training program compiled ad hoc from commercially available learning materials, the results are at variance when the effects of the CTC are compared with Lauth's (1988) training of general problem solving or with a computer-assisted training of spatial skills.
However, in all comparisons of the CTC with rival programs developed to foster children's inductive reasoning (DenkMit by Sydow & Meincke, 1994) or their visual perception (the Frostig program, edited in German by Reinartz & Reinartz, 1979), the hypothesis of the superiority of the CTC could not be confirmed. This does not imply that the CTC is ineffective, nor that the program is of modest utility.


However, a proper appraisal of the CTC must take into account that its effectiveness is not outstanding, but rather lies within the range of other existing cognitive programs, which may be more efficient if we also consider the costs of the programs and the ease of their implementation. This conclusion should motivate educational researchers and practitioners to take the successful tasks of the CTC and to develop and evaluate powerful modifications of the instructional procedures in order to obtain higher levels of effectiveness.

Some comments on the uncertainty regarding the durability of the benefits of the CTC also seem appropriate. Due to the expenditure involved, the long-term effects of cognitive programs are not assessed as often as seems advisable or even necessary, although there are such long-term assessments in the cognitive training literature (e.g., Adey & Shayer, 1993, and Woodhead, 1988, to name just two examples). This is somewhat surprising, since most cognitive training programs aim at developing or fostering some kind of ability in the trained subjects, so that a certain long-term durability of the training effects should be observable. With regard to this general shortcoming, the research program concerning the effectiveness of the CTC (including internal and external evaluations) constitutes a notable exception, since three internal and two external evaluation studies included a follow-up a few months after completion of the program to evaluate its long-term effects. By and large, the results of these studies are promising. Especially with regard to the criterion variables CPM and overall CFT 1, significant increases in performance were obtained not only from pretest to posttest but also from posttest to follow-up, which would be expected if the program accelerates children's cognitive development (see also Adams, 1989; Hasselhorn, 1995). However, the comparative evaluations contrasting the CTC with rival programs, thus controlling not only for simple retest effects and general developmental increases but also for unspecific treatment effects, again yielded no superiority of the CTC. Thus, although the benefits of the CTC appear to be durable, its long-term effects on perceptual speed as well as on fluid intelligence are not superior to those of available rival programs. We would therefore recommend further efforts to optimize the present version of the CTC by devising more powerful instructional procedures in order to improve the effectiveness of the program.

Acknowledgements—The authors wish to thank Dipl.-Psych. Birgit Elsner for her help in assembling and structuring the data from most of the empirical studies and for designing most of the tables.

References

Adams, M. (1989). Thinking skills curricula: Their promise and progress. Educational Psychologist, 24, 25–77.
Adey, P., & Shayer, M. (1993). An exploration of long-term far-transfer effects following an extended intervention program in the High School Science Curriculum. Cognition and Instruction, 11, 1–29.
Anastasi, A. (1981). Coaching, test sophistication, and developed abilities. American Psychologist, 36, 1086–1093.
Angerhoefer, U., Kullik, U., & Masendorf, F. (1992). Denk- und Rechenförderung lernbeeinträchtigter Kinder: Multivariate Änderungsbeurteilung mittels Prädiktions-KFA [Cognitive and numerical training for learning disabled children: Multivariate evaluation of changes using prediction configural frequency analysis]. Psychologie in Erziehung und Unterricht, 39, 190–195.
Beck, M., Lübking, M., & Meier, U. (1995). Die Bielefelder Studien zum Denktraining von Klauer [The Bielefeld studies on Klauer's training program for thinking]. In W. Hager (Ed.), Programme zur Förderung des Denkens bei Kindern (pp. 294–308). Göttingen: Hogrefe.


Beck, M., Lu¨ttmann, B., & Rogalla, U. (1993). Wenn Du denkst, Du denkst . . . Eine Untersuchung der Effektivita¨t des Klauer’schen Denktrainings [When you think that you think: A study of the effectiveness of Klauer’s training in inductive thinking]. Zeitschrift fu¨r Entwicklungspsychologie und Pa¨dagogische Psychologie, 15, 297–306. Becker, P., Schaller, S., & Schmidtke, A. (1980). CPM — Coloured Progressive Matrices (Manual) (2nd ed.). Weinheim: Beltz. Belmont, J. M., & Butterfield, E. C. (1977). The instructional approach to developmental cognitive research. In R. V. Kail and J. W. Hagen (Eds.), Perspectives on the development of memory and cognition (pp. 437-481). Hillsdale, NJ: Erlbaum. Bloom, B. S. (1984). The 2 sigma problem: The search for methods of group instruction as effective as one-toone tutoring. Educational Researcher, 13(6, 4–16. Brown, A. L., Bransford, J. D., Ferrara, R. A., & Campione, J. C. (1983). Learning, remembering, and understanding. In J. H. Flavell & E. M. Markham (Eds.), Cognitive development (Handbook of child psychology, P. H. Mussen (Ed.) (Vol. 3, 4th ed., pp. 77-166)). New York: Wiley. Bu¨chel, F. P. (1992). Test des induktiven Denkens. Ein Verfahren zur Evaluation von Lern- und Transfereffekten [Test of inductive reasoning: An instrument for the evaluation of learning and transfer effects]. In U. Gerhard (Ed.), Psychologische Erkenntnisse zwischen Philosophie und Empirie (pp. 137-158). Bern: Huber. Cattell, R. B. (1963). Theory of fluid and crystallized intelligence: A critical experiment. Journal of Educational Psychology, 54, 1–22. Chow, S. L. (1996). Statistical significance. London, UK: Sage. Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). New York: Academic Press. Cook, T. D., & Gruder, C. L. (1978). Metaevaluation research. Evaluation Quarterly, 2, 5–51. Cranen, I. (1989). Zusammenstellung der Ergebnisse der bisherigen Trainingsexperimente [Summary of the results of the prevailing experiments on the Klauer training program]. In K.J. Klauer, Denktraining fu¨r Kinder I (pp. 50-76). Go¨ttingen: Hogrefe. Eggert, D., & Schuck, K. D. (1971). Columbia Mental Maturity Scale CMM-LB. Weinheim: Beltz. Eggert, D., & Schuck, K. D. (1975). HAWIVA. Hannover Wechsler Intelligenztest fu¨r das Vorschulalter [The Hannover Wechsler Intelligence Test for preschool children]. Bern: Huber. Friedrich, H. F., & Mandl, H. (1992). Lern- und Denkstrategien - ein Problemaufriß [Learning and thinking strategies — A view on the problem]. In H. Mandl & H.F. Friedrich (Hrsg), Lern- und Denkstrategien. Analyse und Intervention (S. 3-54). Go¨ttingen: Hogrefe. Frostig, M., Horne, D., & Miller, A. M. (1972). The developmental program in visual perception (rev. ed.). Chicago, IL: Follett. Frostig, M., Maslow, P., Lefever, D. W., & Whittlesey, J. R. B. (1964). The Marianne Frostig developmental test of visual perception (1963 standardization). Palo Alto, CA: Consulting Psychologists Press. Hager, W. (1995). Planung und Durchfu¨hrung der Evaluation von kognitiven Fo¨rderprogrammen [Planning and realization of evaluation research on cognitive training programs]. In W. Hager (Ed.), Programme zur Fo¨rderung des Denkens bei Kindern. Konstruktion, Evaluation und Metaevaluation (pp. 100-206). Go¨ttingen: Hogrefe. Hager, W. (1996). Kriterien der Effektivita¨t von Trainingsprogrammen [Criteria of the effectivity of training programs]. In E. Witruk & G. Friedrich (Eds.), Pa¨dagogische Psychologie im Streit um ein neues Selbstversta¨ndnis (pp. 259-267). 
Landau: Empirische Pädagogik.
Hager, W., Elsner, B., & Hübner, S. (1995). Metaevaluation von Evaluationen einiger kognitiver Trainings [A metaevaluation of the evaluations of some cognitive training programs]. In W. Hager (Ed.), Programme zur Förderung des Denkens bei Kindern. Konstruktion, Evaluation und Metaevaluation (pp. 257–291). Göttingen: Hogrefe.
Hager, W., & Hasselhorn, M. (1993a). Evaluation von Trainingsmaßnahmen am Beispiel von Klauers Denktraining für Kinder [Evaluation of training programs: Klauer's program for training children in inductive thinking]. Zeitschrift für Entwicklungspsychologie und Pädagogische Psychologie, 15, 307–321.
Hager, W., & Hasselhorn, M. (1993b). Induktives Denken oder elementares Wahrnehmen? Prüfung von Hypothesen über die Art der Wirkung eines Denktrainings für Kinder [Inductive thinking or basic perception? Testing hypotheses regarding the effects of a training program for children's thinking]. Empirische Pädagogik, 7, 421–458.
Hager, W., & Hasselhorn, M. (1995a). Konzeption und Evaluation von Programmen zur kognitiven Förderung: theoretische Überlegungen [Concept and evaluation of cognitive training programs: Theoretical consider-
ations]. In W. Hager (Ed.), Programme zur Fo¨rderung des Denkens bei Kindern. Konstruktion, Evaluation und Metaevaluation (pp. 41-85). Go¨ttingen: Hogrefe. Hager, W., & Hasselhorn, M. (1995). Zuwendung als Faktor der Wirksamkeit kognitiver Trainings fu¨r Kinder [Effects of special attention directed toward children as a determinant of the effectiveness of cognitive training programs]. Zeitschrift fu¨r Pa¨dagogische Psychologie, 9, 163–179. Hager, W., & Hasselhorn, M. (1997). Wirkungen der Testwiederholung und der Entwicklung bei der Durchfu¨hrung des CFT 1 bei Erstkla¨ßlern [Retest effects and developmental changes on the CFT 1 in first graders as a consequence of repeated test applications]. Zeitschrift fu¨r Psychologie, 205, 205–229. Hager, W., Hasselhorn, M., & Hu¨bner, S. (1995). Induktives Denken und Intelligenztestleistung — Analysen zur Art der Wirkung zweier Denktrainings fu¨r Kinder [Inductive reasoning and achievement in intelligence tests. Analysis of effects of two training programs for thinking in children]. Praxis der Kinderpsychologie und Kinderpsychiatrie, 44, 296–302. Hager, W., & Hu¨bner, S. (1998). Denkfo¨rderung und Strategieverhalten: eine vergleichende Evaluation zweier Denkfo¨rderprogramme fu¨r Kinder [Enhancing thinking and strategy use: A comparative evaluation of two training programs for children]. Praxis der Kinderpsychologie und Kinderpsychiatrie, 47, in press. Hasselhorn, M. (1995). Kognitive Trainings: Grundlagen, Begrifflichkeiten und Desiderate [Cognitive training: Foundations, concepts, desideratas]. In W. Hager (Ed.), Programme zur Fo¨rderung des Denkens bei Kindern. Konstruktion, Evaluation und Metaevaluation (pp. 14-40). Go¨ttingen: Hogrefe. Hasselhorn, M., & Hager, W. (1995). Neuere Programme zur Denkfo¨rderung bei Kindern: Wie effektiv sind sie im Vergleich zu herko¨mmlichen Wahrnehmungsu¨bungen? [Recent programs for improving children’s thinking: Are they more effective than traditional exercises of visual perception?]. Psychologie in Erziehung und Unterricht, 42, 221–233. Hasselhorn, M., & Hager, W. (1996). Neuere Programme zur Denkfo¨rderung bei Kindern: Bewirken sie gro¨ßere Kompetenzsteigerungen als herko¨mmliche Wahrnehmungsu¨bungen? [Recent programs for improving children’s thinking: Do they produce more gains in competence than traditional exercises of visual perception?]. Psychologie in Erziehung und Unterricht, 43, 169–181. Heller, K., & Geisler, H.-J. (1983a). KFT-K - Kognitiver Fa¨higkeits-Test (Kindergartenform) [Cognitive Abilities Test for preschool children]. Weinheim: Beltz. Heller, K., & Geisler, H.-J. (1983b). KFT 1-3 - Kognitiver Fa¨higkeits-Test (Grundschulform) [Cognitive Abilities Test for elementary school children]. Weinheim: Beltz. Horn, J. L. (1985). Remodeling old models of intelligence. In B.B. Wolman (Ed.), Handbook of intelligence. Theories, measurements and applications (pp. 267-300). New York: Wiley. Hunt, E. (1974). Quote the Raven? Nevermore. In L.W. Gregg (Ed.), Knowledge and cognition (pp. 129-157). Potomac, MI: Erlbaum. Klauer, K. J. (1989a). Denktraining fu¨r Kinder I [Cognitive Training for Children I]. Go¨ttingen: Hogrefe. Klauer, K. J. (1989b). Die Messung von Transferdistanzen. Ein Verfahren zur Bestimmung der Una¨hnlichkeit von Aufgabenanforderungen [The measurement of transfer distance. A procedure for determining dissimilarities of task requirements]. Zeitschrift fu¨r Entwicklungspsychologie und Pa¨dagogische Psychologie, 21, 146-166. Klauer, K. J. (1990). 
Denktraining für Schulanfänger: Ein neuer Ansatz zur kognitiven Förderung [Training in thinking for beginning school children: A new approach to cognitive stimulation]. Praxis der Kinderpsychologie und Kinderpsychiatrie, 39, 150–156.
Klauer, K. J. (1990). A process theory of inductive reasoning tested by the teaching of domain-specific thinking strategies. European Journal of Psychology of Education, 5, 191–206.
Klauer, K. J. (1991). Denktraining für Kinder II [Cognitive training for children II]. Göttingen: Hogrefe.
Klauer, K. J. (1992a). "Bottom up" oder "top down"? Über die Transferwirkungen zweier Strategien zum Training des induktiven Denkens [Bottom-up or top-down? On transfer effects in two strategies for training inductive reasoning]. Sprache und Kognition, 11, 91–103.
Klauer, K. J. (1992b). Problemlösestrategien im experimentellen Vergleich: Effekte einer allgemeinen und einer bereichsspezifischen Strategie [An experimental comparison of problem-solving strategies: The effects of a general and a content-specific strategy]. In H. Mandl & H. F. Friedrich (Eds.), Lern- und Denkstrategien. Analyse und Intervention (pp. 57–78). Göttingen: Hogrefe.
Klauer, K. J. (1993). Denken und Lernen bei Lernbehinderten: Fördert das Training des induktiven Denkens schulisches Lernen? [Thinking and learning in slow learners: Effects of training inductive reasoning on school learning]. Heilpädagogische Forschung, 19, 50–67.
Klauer, K. J. (1994). Über den Einfluß eines Trainings zum induktiven Denken auf Variablen der fluiden Intellig-
enz und des Lernens bei a¨lteren Menschen [Effects of an inductive reasoning training program on variables of fluid intelligence and learning in elderly persons]. Zeitschrift fu¨r Gerontopsychologie und -psychiatrie, 7, 29–46. Klauer, K. J. (1995). Antworten und neue Befunde [Answers and new results]. Zeitschrift fu¨r Pa¨dagogische Psychologie, 9, 13–23. Klauer, K. J. (1996). Teaching inductive reasoning: Some theory and three experimental studies. Learning and Instruction, 6, 37–57. Klauer, K. J., & Phye, G. D. (1994). Cognitive training for children. A developmental program of inductive reasoning and problem solving. Seattle: Hogrefe and Huber. Klauer, K. J., Resing, W., & Slenders, A.P.A.C. (1995). Cognitieve training voor kinderen. Ontwikkelet van het inductief redeneren bij kinderen [Cognitive training for children. Development of children’s inductive reasoning]. Go¨ttingen: Hogrefe. Lauth, G. (1988). Trainingsmanual zur Vermittlung kognitiver Fertigkeiten bei retardierten Kindern [Manual of the training of cognitive skills in retarded children] (2nd ed.). Tu¨bingen: Deutsche Gesellschaft fu¨r Verhaltenstherapie. Lienert, G. A. (1988). Angewandte Konfigurationsfrequenzanalyse [Applied configuration frequency analysis]. Frankfurt/M.: Athena¨um. Lockowandt, O. (1993). Frostigs Entwicklungstest der visuellen Wahrnehmung — Manual [Frostig’s developmental test of visual perception] (7th ed.). Weinheim: Beltz. Masendorf, F. (1994). Fo¨rderungstypen des induktiven Denkens und des ra¨umlichen Vorstellens bei lernbeeintra¨chtigten Kindern: Eine metatypologische Mehrstichproben-KFA [Different inductive and spatial representation promotion types in learning disabled children: A multisample CFA]. Psychologie in Erziehung und Unterricht, 41, 14–21. Raven, J. C. (1973). The coloured progressive matrices (CPM) (11th ed.). London: Lewis. Reinartz, A., & Reinartz, E. (1979). Visuelle Wahrnehmungsfo¨rderung [Training of visual perception] (2nd ed.). Hannover: Schroedel. Sauter, F. C. (1979). Pru¨fung optischer Differenzierungsleistungen [Test of visual discrimination performance]. Braunschweig: Westermann. Schmude, C., & Sydow, H. (1994). Wirkungen kognitiven Trainings im Vorschulalter [Effects of a cognitive training in preschool children]. In K. Pawlik (Ed.), 39. Kongreß der Deutschen Gesellschaft fu¨r Psychologie, 25–29 September 1994 in Hamburg (Abstracts, Vol. 2, p. 624). Hamburg: Universita¨t Hamburg. Schneider, W., Vise´, M., Reimers, P., & Blaesser, B. (1994). Auswirkungen eines Trainings der sprachlichen Bewußtheit auf den Schriftspracherwerb in der Schule [Effects of a phonological awareness training program on the acquisition of literacy in school]. Zeitschrift fu¨r Pa¨dagogische Psychologie, 8, 177–188. Scriven, M. (1967). The methodology of evaluation. AERA Monograph Series on Curriculum Evaluation, 1, 39–83. Scriven, M. (1991). Evaluation thesaurus (4th ed.). Newbury Park: Sage. Shadish, W. R., & Sweeney, R. B. (1991). Mediators and moderators in meta-analysis: There’s a reason we don’t let dodo birds tell us which psychotherapies should have prizes. Journal of Consulting and Clinical Psychology, 59, 883–893. Sternberg, R. J. (1983). Criteria for intellectual skills training. Educational Researcher, 13(2, 6–12. Sternberg, R. J., & Bhana, K. (1986). Synthesis of research on the effectiveness of intellectual skills training: Snake-oil remedies or miracle cures? Educational Leadership, 44(2, 60–67. Sydow, H., & Meincke, J. (1994). DenkMit. 
Das Berliner Programm zur Fo¨rderung des Denkens und der Wahrnehmung bei drei- bis sechsja¨hrigen Kindern [DenkMit - A programme for fostering cognitive development in preschool children] (Manual). Kirchdorf: ZAK. Thorndike, R. L., & Hagen, E. (1971). Cognitive Abilities Test. Boston, MA: Houghton-Mifflin. Tomic, W. (1995). Training in inductive reasoning and problem solving. Contemporary Educational Psychology, 20, 483–490. Tomic, W., & Klauer, K. J. (1996). On the effects of training inductive reasoning: How far does it transfer and how long do the effects persist? European Journal of Psychology of Education, 11, 283–299. Wechsler, D. (1967). Wechsler Preschool and Primary Scale of Intelligence WPPSI. New York: Psychological Corporation. Weiß, R. H. (1974). Grundintelligenztest CFT 2, Skala 2 (2nd ed.). Braunschweig: Westermann. Weiß, R. H., & Osterland, J. (1980). Grundintelligenztest CFT 1, Skala 1 (4th ed.). Braunschweig: Westermann.


Willett, J. B. (1990). Measuring change: The difference score and beyond. In H. J. Walberg & G. D. Haertel (Eds.), The international encyclopedia of educational evaluation (pp. 632-637). London, UK: Pergamon. Woodhead, M. (1988). When psychology informs public policy. The case of early childhood intervention. American Psychologist, 43, 443–454.