Intelligence 41 (2013) 289–305
Students' complex problem-solving abilities: Their structure and relations to reasoning ability and educational success
Philipp Sonnleitner a,⁎, Ulrich Keller a, Romain Martin a, Martin Brunner b
a University of Luxembourg, Centre for Educational Measurement and Applied Cognitive Science (EMACS), Campus Kirchberg, 6, rue Richard Coudenhove-Kalergi, 1359 Luxembourg, Luxembourg
b Free University of Berlin, ISQ, Berlin-Brandenburg Institute for School Quality Improvement, Otto-von-Simson-Str. 15, 14195 Berlin, Germany
Article info
Article history: Received 21 October 2012; Received in revised form 17 April 2013; Accepted 9 May 2013; Available online xxxx
Keywords: Complex problem solving; Reasoning; Educational success; Genetics Lab; Microworlds

Abstract
Complex Problem Solving (CPS) is considered to be a promising candidate for capturing higher order thinking skills that are emphasized in new educational curricula but are not adequately measured by traditional intelligence tests. However, little is known about its psychometric structure and its exact relation to intelligence and educational success—especially in student populations. This study is among the first to use a large and representative sample of secondary school students (N = 563) to examine different measurement models of CPS—that conceptualize the construct as either faceted or hierarchical—and their implications for the construct's validity. Results showed that no matter which way it was conceptualized, CPS was substantially related to reasoning and to different indicators of educational success. Controlling for reasoning within a joint hierarchical measurement model, however, revealed that the impressive external validity was largely attributable to the variance that CPS shares with reasoning, suggesting that CPS has only negligible incremental validity over and above traditional intelligence scales. On the basis of these results, the value of assessing CPS within the educational context is discussed. © 2013 Elsevier Inc. All rights reserved.
1. Introduction

Intelligence tests were originally developed for educational settings (Binet & Simon, 1905; Deary, Strand, Smith, & Fernandes, 2007) to predict whether a student would succeed in mastering academic subjects or not (Mayer, 2000). Therefore, virtually all intelligence tests include subtests to capture students' abilities to solve problems and to reason. Although intelligence tests have not changed much since their invention more than 100 years ago (Hunt, 2011; Sternberg & Kaufman, 1996; Sternberg, Lautrey, & Lubart, 2003), they still fulfill this purpose quite well (Deary, 2012; Hunt, 2011; Kaufman, Reynolds, Liu, Kaufman, & McGrew, 2012; Naglieri & Bornstein, 2003).
In recent years, however, educational systems have been in a transition caused mainly by dramatic innovations in information technology (IT). This has produced a significant change in the student population; thus, today's students are described as “digital natives” (Prensky, 2001) or the “net generation” (Tapscott, 1998). Moreover, new educational goals that focus on students' problem-solving abilities have been set (Bennett, Jenkins, Persky, & Weiss, 2003; Ridgway & McCusker, 2003). The integration of such higher order thinking skills (Kuhn, 2009) in educational curricula is necessary to prepare students to solve the complex problems of today's world. Consequently, large-scale assessments such as the Program for International Student Assessment (PISA) have extended their evaluation scheme of key outcomes of the educational system to include problem-solving abilities (see Leutner, Fleischer, Wirth, Greiff, & Funke, 2012; Wirth & Klieme, 2003). Given these changes, we must ask whether traditional intelligence tests (which include subtests that assess students' abilities to reason and solve problems) still capture the cognitive skills that are vital for success in education or whether the
assessment of new constructs would be more beneficial. According to many authors, a potential alternative in this context is the construct complex problem solving (CPS; Funke, 2010). CPS is measured with computer-based problem-solving scenarios (also known as microworlds), which receive high acceptance among today's students (Ridgway & McCusker, 2003; Sonnleitner et al., 2012). These microworlds provide detailed information about students' problem-solving behaviors and strategies when they address complex and dynamic problems (Fischer, Greiff, & Funke, 2011; Funke, 2001). The results obtained with CPS measures are promising: not only that CPS has been found to be substantially correlated with intelligence (thus indicating that CPS captures central cognitive abilities that are similar to those captured by intelligence tests); recently, studies have also reported that CPS explains individual differences in external criteria over and above what is accounted for by intelligence. Measures of CPS were found to possess incremental validity beyond intelligence tests in predicting supervisor ratings (Danner, Hagemann, Schankin, Hager, & Funke, 2011) and— crucial to the educational context— grade point average (Wüstenberg, Greiff, & Funke, 2012). In sum, there is promising evidence that CPS may be a reliable and valid representation of higher order thinking skills and problem-solving behavior. However, most previous research on CPS has been based on (highly) selected samples of university students (e.g., psychology students). Empirical studies based on samples of secondary school students are still rare. Nevertheless, insights obtained for this population are vital for evaluating whether tests that measure CPS can potentially be used to assess students' higher order thinking skills because these skills are required by today's educational curricula. The major goal of the present paper was to significantly contribute to the knowledge about CPS by using a large heterogeneous sample of secondary school students. In doing so, we analyzed (a) the structure of CPS (as reflected by corresponding measurement models), (b) its relation to reasoning, (c) its ability to predict success in school (external validity), and (d) its ability to predict students' educational success over and above traditional intelligence tests (incremental validity). To this end, we put special emphasis on the (joint) hierarchical structure of intelligence and CPS. This allowed us to disentangle the individual differences in students' higher order thinking skills that are shared between traditional intelligence tests and measures of CPS from the individual differences that are unique to each of these measures. In doing so, the present results provide important insights into whether CPS is able to provide information that can be used to predict students' educational success over and above intelligence or not. 2. The relations between CPS and intelligence on latent and manifest levels Cognitive abilities are not directly observable; rather, they are latent constructs. One core idea that is used in the assessment process is that these latent constructs are considered to be distinct from their manifest measures. This distinction emphasizes the critical importance of the measurement model, which links latent variables to their corresponding measures (Bollen & Lennox, 1991; Borsboom,
Mellenbergh, & Van Heerden, 2003; Brunner, Nagy, & Wilhelm, 2012; Edwards & Bagozzi, 2000). Crucially, it has been shown that the choice of the measurement model may have severe consequences and even lead to different results when the relation between constructs is under investigation (Brunner, 2008; Hornung, Brunner, Reuter, & Martin, 2011). Thus, when the relation between CPS and intelligence is under investigation, theoretical considerations as well as former empirical results are vital for ensuring that well-grounded measurement models of both constructs are designed to represent their structures.
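As an aside for readers less familiar with latent variable modeling (this illustration is ours, not part of the original study), the following minimal sketch shows how a measurement model ties unobservable constructs to manifest scores: a single latent factor with three hypothetical loadings implies a specific covariance structure among the observed indicators, and it is this implied structure that confirmatory factor analysis evaluates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical standardized loadings of three manifest indicators
# on one latent construct (illustrative values only).
loadings = np.array([0.8, 0.7, 0.6])
uniqueness = 1.0 - loadings**2          # residual (unique) variances

# Model-implied covariance matrix: Sigma = lambda * lambda' + Theta
sigma = np.outer(loadings, loadings) + np.diag(uniqueness)

# Simulate scores of 563 "students": latent score plus indicator-specific error.
n = 563
latent = rng.standard_normal(n)
errors = rng.standard_normal((n, 3)) * np.sqrt(uniqueness)
observed = latent[:, None] * loadings + errors

# The empirical covariance of the manifest scores approximates Sigma,
# which is what a CFA exploits when recovering the latent structure.
print(np.round(sigma, 2))
print(np.round(np.cov(observed, rowvar=False), 2))
```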
2.1. The structure of complex problem solving Up to now, there has been no widely accepted definition of the latent construct of CPS (Fischer et al., 2011; Frensch & Funke, 1995; Quesada, Kintsch, & Gomez, 2005). However, most definitions encompass the ability to overcome barriers in order to achieve a target state within a complex and dynamically changing environment (e.g., Buchner, 1995; Fischer et al., 2011; Frensch & Funke, 1995; Mayer & Wittrock, 1996). However, greater consensus exists with regard to the measurement of CPS by means of computerbased scenarios, so-called microworlds that should mirror complex problems. Microworlds that are used in CPS research were originally intended to overcome the limitations of traditional intelligence tests (Funke, 1993) and are described by several key characteristics: They (a) consist of several variables that (b) are highly interconnected and (c) change over time (i.e., are dynamic). Crucially, (d) these underlying connections are not transparent, and (e) the test taker has to achieve several partly contradictory goals (Funke, 2001, 2003, 2010). Notably, some of these characteristics are not shared with intelligence tests (e.g., on intelligence tests, tasks do not change over time or do not require the achievement of multiple goals). Typically, test takers interact with microworlds in two phases. In the first phase, they manipulate the microworld's variables in order to acquire knowledge. This knowledge must then be applied in a second phase in order to achieve several goals. An example of a contemporary microworld is the Genetics Lab (GL; Sonnleitner et al., 2012) shown in Fig. 1. In the GL, three scores are obtained, reflecting the test taker's ability to (a) retrieve information about the problem by applying an appropriate exploration strategy, (b) build a correct mental model of the problem, and (c) apply the gathered knowledge to achieve certain problem states. These abilities are described as central facets of CPS (and also show large overlap with the abilities that are tested by typical tests of reasoning ability; see also Fischer et al., 2011; Wüstenberg et al., 2012). Studies investigating the structure of CPS have provided mixed results. Kröner, Plass, and Leutner (2005) were among the first to report three different facets of CPS that could be empirically distinguished. These facets corresponded to the typically obtained scores in such scenarios (see above) and were described as rule identification, referring to the quality of the applied exploration strategy, rule knowledge, and rule application. The study reported by Kröner et al. (2005) exclusively included students from high school (German
“Gymnasium”), and its sample size was relatively small (n = 101). However, Kröner et al.'s (2005) approach was criticized because the obtained scores were based on the test taker's interactions with only one problem scenario, causing several psychometric problems such as rendering the obtained scores heavily dependent on each other (cf. Greiff, 2012). Thus, to overcome the limitations of the one-scenario approach, Greiff, Wüstenberg, and Funke (2012) administered a microworld consisting of several independent problem scenarios in order to study the latent structure of CPS with a more reliable assessment instrument. But despite the different measurement approach, the same picture emerged. The three facets of CPS could be distinguished, again suggesting a three-dimensional model of CPS. In another study by Wüstenberg et al. (2012), the same assessment approach of several independent problem scenarios was applied; however, in this study, the authors came to different conclusions. According to their results, the facet of rule identification was not empirically distinguishable from rule knowledge. Thus, they favored a two-dimensional model of CPS including only rule knowledge and rule application over a three-dimensional model that also included rule identification. In contrast to Kröner et al. (2005), the studies reported by Greiff et al. (2012) and Wüstenberg et al. (2012) employed samples of (highly selected) university students. Although they (i.e., Greiff et al. and Wüstenberg et al.) tested samples that were similar and applied the same assessment approach involving multiple scenarios, the problem scenarios they used differed in terms of input mode (numerical compared to graphical), the number of variables involved, and the number of variables that changed over time. They also applied a different scoring procedure for rule identification. Importantly, in all these studies, a one-dimensional model was not supported, even though the facets were always found to be highly correlated. This is especially interesting because Abele et al. (2012), also relying on the multiple-scenario approach, reported acceptable model fit for a one-dimensional model of CPS, suggesting only a general factor underlying all three performance scales. Abele et al. tested a sample of exclusively male trainees of technical jobs with a mean age of 20.1 years (SD = 2.1). Almost the entire sample (96%) reported secondary school as their highest level of education. In sum, the inconclusive findings on the structure of CPS may be due to a complex interaction of the composition of different and highly selective samples, the assessment instruments used, and the scoring of the performance indicators. Specifically, it seems that three alternative structural conceptualizations of CPS have been supported by previous findings. (a) Indicators of CPS are typically found to be substantively intercorrelated, which may reflect the operation of a general factor of CPS. Drawing on Abele et al.'s (2012) results, Model A (Fig. 2A) can be used to test this hypothesis by representing CPS as a single construct. (b) Model B (Fig. 2B), by contrast, can be used to investigate the notion of CPS as a multidimensional construct that includes three facets, namely rule identification, rule knowledge, and rule application (Greiff et al., 2012; Kröner et al., 2005). Model B can be used to help to determine whether a multifaceted account of CPS can be supported and if yes, how many of these facets are empirically distinguishable. (c) Model C
(Fig. 2C) combines the theoretical positions of Models A and B within a higher order CPS factor model. The assumption underlying Model C is that the high intercorrelations between facets of CPS from previous studies can be accounted for by the operation of a higher order construct that may be interpreted as students' general ability to deal with complex problems. 2.2. The structure of intelligence and its relation to CPS Although countless definitions of (the latent construct) intelligence exist, 52 experts agreed on a definition that was intended to represent mainstream thinking about intelligence and was presented in a very influential editorial (Gottfredson, 1997). The definition stated: “Intelligence is a very general mental capability that, among other things, involves the ability to reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly and learn from experience. Further, intelligence […] reflects a broader and deeper capability for comprehending our surroundings—‘catching on,’ ‘making sense’ of things, or ‘figuring out’ what to do” (Gottfredson, 1997, p. 13). Even though in the same editorial it was agreed that traditional intelligence tests are adequate manifest measures of these latent abilities, throughout the history of intelligence research, various measurement models have been suggested to best represent the underlying construct (Schulze, 2005). However, since the impressive factor-analytic studies by Carroll (1993), the view of a hierarchical conceptualization of cognitive abilities—and hence, intelligence—has dominated (Deary, 2012; Hunt, 2011; Schulze, 2005). This finally led to the formulation of the widely accepted Cattell–Horn–Carroll (CHC) model of intelligence (McGrew, 2005, 2009), suggesting a hierarchical structure of intelligence with three levels: a general “g” factor at the top, broad ability facets on the second level (e.g., fluid reasoning or short-term memory), and narrow subfacets on the first stratum (e.g., quantitative reasoning or memory span). Recently, the hierarchical structure of intelligence has become even more pronounced by the g-VPR model (Johnson & Bouchard, 2005), which proposes a four-level structure of cognitive abilities by rearranging some of the broader abilities of the CHC model, but again putting a general g factor at the top of the hierarchy. The notion of g has also been strongly supported by Johnson and colleagues' (Johnson, Bouchard, Krueger, McGue, & Gottesman, 2004) research, which showed that the general factors resulting from different intelligence test batteries are in fact identical. In short, a hierarchical conceptualization of intelligence that includes a broad g factor at the top seems well justified. However, the interpretation of the resulting higher order factors in hierarchical models of intelligence is somewhat problematic with regard to content (Gustafsson & AbergBengtsson, 2010; Hunt, 2011; Schulze, 2005). Thus, for some research questions and especially for practitioners, it might nevertheless be useful to focus on more specific facets of intelligence. Related conceptualizations omit the broad g factor and explain the variance in observed variables by first-order factors only. As these facets are directly linked to a subset of manifest measures, their interpretation is much more straightforward and easier than in hierarchical models
of intelligence (Gustafsson & Aberg-Bengtsson, 2010). Note that this conceptualization of cognitive abilities has also been advocated in recent versions of the theory of fluid and crystallized abilities (Horn & Noll, 1997). In sum, intelligence
can be conceptualized as either a hierarchical construct with a broad g factor at the top of the hierarchy or as a faceted construct. The choice of measurement model depends heavily on the research purpose and on which level of the
Fig. 1. Screenshots of the different tasks students have to solve within the Genetics Lab, a microworld used to assess complex problem solving (figures taken from Sonnleitner et al., 2012).
a. Task 1—Exploring the creature: First, students investigate the effects of certain genes on a creature's characteristics. Genes and thus their effects can be switched "on" or "off." By clicking the "next day" button at the top of the screen, students can move forward in time and then observe the effects of their manipulations by studying the related diagrams (genes are depicted in red diagrams; characteristics are depicted in green diagrams).
b. Task 2—Documenting the knowledge: While exploring the creature, students document their gathered knowledge about the genes' effects in a related database that shows the same genes and characteristics as the lab. They do this by drawing a causal diagram indicating the strength and direction of the discovered effects. The resulting model can be interpreted as the student's mental model of the causal relations.
c. Task 3—Achieving target values: Finally, students have to apply the knowledge they gathered in order to achieve certain target values for the creature's characteristics. Crucially, they have only three "days" or time steps left to do this. Thus, students have to consider the dynamics of the problem and to plan their actions in advance.
hierarchy the research focus should be set (Brunner et al., 2012; Hunt, 2011).
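The Genetics Lab tasks shown in Fig. 1 are not given in equation form in this section; purely as an illustration, the sketch below mimics the generic linear-structural-equation logic that microworlds of this kind typically implement (switched-on genes shift characteristics each "day", and some characteristics also drift on their own). All variable names and coefficients are hypothetical.

```python
import numpy as np

# Hypothetical 2-gene / 2-characteristic scenario (illustrative only):
# per-"day" effects of gene settings on the creature's characteristics.
effects = np.array([[2.0,  0.0],    # gene 1 -> characteristic 1
                    [0.0, -1.0]])   # gene 2 -> characteristic 2
dynamics = np.array([0.0, 0.5])     # autonomous change of each characteristic

def next_day(characteristics, genes_on):
    """Advance the scenario by one time step (the 'next day' button)."""
    genes = np.asarray(genes_on, dtype=float)   # 1 = switched on, 0 = off
    return characteristics + genes @ effects + dynamics

state = np.array([10.0, 10.0])
# An informative exploration step: toggle only one gene at a time, so any
# change in the characteristics can be attributed unambiguously to that gene.
state = next_day(state, genes_on=[1, 0])
print(state)  # characteristic 1 rises by 2, characteristic 2 drifts by +0.5
```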
2.2.1. The relation between intelligence and CPS The definitions of intelligence and CPS given above leave no doubt that there is conceptual overlap between these two constructs. For example, both definitions highlight higher order thinking skills. Indeed, according to 52 experts in intelligence research (Gottfredson, 1997), intelligence may even be considered to be an umbrella term that includes CPS as a more specific facet. Thus, from this perspective, CPS and reasoning ability as measured by traditional intelligence tests would capture the same latent construct—intelligence. Given that intelligence and CPS can both be conceptualized as faceted or as hierarchical constructs, their relation can be studied by means of four different measurement models (depicted in Fig. 3): (a) CPS and intelligence are both conceived as (distinct) faceted constructs (Fig. 3, Model D), (b) CPS is conceptualized as a hierarchical construct, whereas intelligence is conceived as a faceted one (Fig. 3, Model E), (c) the hierarchy of intelligence is highlighted, whereas CPS is seen as a faceted construct (Fig. 3, Model F), and (d) both constructs are seen as hierarchical (Fig. 3, Model G). In the following section, however, we will show that there has been a severe imbalance
in the use of these different measurement models in previous research. When CPS research first began, because CPS and intelligence as measured by traditional intelligence tests were viewed as distinct (e.g., Dörner, 1986), their relation was usually studied by first drawing from the measurement model, which conceived of the two constructs as distinct facets (Fig. 3, Model D). This notion was based mainly on the conceptual differences between the static multiple-choice items of traditional intelligence tests and microworlds that include dynamic elements that change due to the actions of the test taker (see above). Probably as a consequence of this theoretical position, previous research on the relation between intelligence and CPS almost exclusively conceived of the two constructs as distinct first-order factors and therefore focused on the relations between facets of intelligence, especially reasoning, and specific facets of CPS. However, this line of research produced overwhelming empirical evidence that (the facets of) the two constructs were significantly related. Substantial correlations have been consistently found between reasoning and microworld control performance, which is seen as a manifest measure of the CPS facet rule application. Despite these strong correlations, a substantial amount of variance in control performance was left unexplained. The same result was found whether the applied microworld was semantically rich (i.e., intended to mirror a
[Fig. 2 path diagrams (panels A–C): A) General factor model of CPS; B) Faceted CPS model; C) Hierarchical CPS model. Standardized loadings and residual correlations are shown in the original figure; see caption below.]
Fig. 2. Models representing the structure of Complex Problem Solving as measured by the GL. Note. CPS = Complex Problem Solving; RI = Rule Identification; RK = Rule Knowledge; RA = Rule Application; RI1–RI3: parcel scores of Rule identification items; RK1–RK3: parcel scores of Rule Knowledge items; RA1–RA3: parcel scores of Rule Application items. Standardized model solution is shown.
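The authors estimated Models A–C in Mplus (see Section 3.4). Purely as an illustration of how the three competing specifications differ, the lavaan-style model strings below sketch them in Python. The use of the semopy package, and the data handling shown, are our assumptions and not part of the original analyses.

```python
import pandas as pd
import semopy  # assumed SEM package; the study itself used Mplus 5.2

# Model A: one general CPS factor behind all nine parcel scores.
model_a = """
CPS =~ RI1 + RI2 + RI3 + RK1 + RK2 + RK3 + RA1 + RA2 + RA3
"""

# Model B: three correlated facets (rule identification, knowledge, application).
model_b = """
RI =~ RI1 + RI2 + RI3
RK =~ RK1 + RK2 + RK3
RA =~ RA1 + RA2 + RA3
"""

# Model C: the facets of Model B load on a higher order CPS factor.
model_c = model_b + "\nCPS =~ RI + RK + RA\n"

def fit(description: str, data: pd.DataFrame):
    model = semopy.Model(description)
    model.fit(data)                      # maximum likelihood by default
    return semopy.calc_stats(model)      # chi-square, CFI, RMSEA, etc.

# data = pd.read_csv("gl_parcels.csv")   # hypothetical file with columns RI1...RA3
# print(fit(model_a, data)); print(fit(model_b, data)); print(fit(model_c, data))
```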
[Fig. 3 path diagrams (panels D–G): D) Faceted intelligence – faceted CPS; E) Faceted intelligence – hierarchical CPS; F) Hierarchical intelligence – faceted CPS; G) Hierarchical intelligence – hierarchical CPS. Standardized loadings and correlations are shown in the original figure; see caption below.]
Fig. 3. Models representing the relations between intelligence and Complex Problem Solving. Note. CPS = Complex Problem Solving; RI = Rule Identification; RK = Rule Knowledge; RA = Rule Application; MA = Matrices sum score; SF = Selecting figures sum score; NC = Number completion sum score; RI1–RI3: parcel scores of Rule Identification items; RK1–RK3: parcel scores of Rule Knowledge items; RA1–RA3: parcel scores of Rule Application items; RIspecific, RKspecific, RAspecific, CPSspecific = specific variance of RI, RK, RA, and CPS when controlling for the variance shared with reasoning. Standardized model solution is shown.
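Because the general and the specific factors in the nested-factor models (F and G) are orthogonal, the variance of each CPS indicator decomposes additively into a part due to g, a part due to the specific factor, and a residual. A minimal sketch with hypothetical standardized loadings:

```python
# Hypothetical standardized loadings of one CPS parcel in a nested-factor
# model: the general factor g and the specific factor are orthogonal,
# so the explained variance simply adds up.
loading_g = 0.45         # loading on the general (reasoning-defined) factor
loading_specific = 0.60  # loading on the specific CPS factor

var_g = loading_g**2
var_specific = loading_specific**2
var_residual = 1.0 - var_g - var_specific

print(f"variance due to g:               {var_g:.2f}")
print(f"variance due to specific factor: {var_specific:.2f}")
print(f"residual variance:               {var_residual:.2f}")
```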
real-world scenario; Danner et al., 2011; Gonzalez, Thomas, & Vanyukov, 2005; Kersting, 2001; Rigas, Carling, & Brehmer, 2002; Süß, 1996; Wagener & Wittmann, 2002; Wittmann & Hattrup, 2004) or semantically poor (Kluge, 2008; Kröner et al., 2005). Kröner et al. (2005) were the first to report that reasoning was significantly related to all three central facets of CPS (see above). (Latent) correlations ranged from .41 (rule identification) to .63 (rule application). However, Kröner et al. relied on data that resulted from the test taker's interaction with only one problem-solving scenario, which caused several psychometric problems. Moreover, the way they assessed rule knowledge and rule application—with static multiple-choice items—strongly resembled traditional intelligence test items, perhaps leading to an overestimation of the association between reasoning and facets of CPS (Wüstenberg et al., 2012). However, studies that have used a multiple-scenario approach and that have measured performance with established indicators of CPS (causal diagrams representing the test taker's rule knowledge and deviations from the target values as a marker of rule application) have confirmed the strong relation between reasoning and the facet rule knowledge (Wüstenberg et al., 2012) or even all facets of CPS (Abele et al., 2012; Sonnleitner et al., 2012).
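The correlations cited in this section are latent correlations estimated within structural equation models. A rougher, manifest-level analogue is the classical correction for attenuation, which rescales an observed correlation by the reliabilities of both scales; the sketch below uses made-up values for illustration only.

```python
import math

def disattenuate(r_observed: float, rel_x: float, rel_y: float) -> float:
    """Classical correction for attenuation: r_true = r_obs / sqrt(rel_x * rel_y)."""
    return r_observed / math.sqrt(rel_x * rel_y)

# Hypothetical values: an observed correlation of .45 between a reasoning
# sum score (alpha = .75) and a rule knowledge score (alpha = .90).
print(round(disattenuate(0.45, 0.75, 0.90), 2))  # ~0.55
```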
Compared to the large number of studies that have conceived of CPS as a faceted construct, only a small number of studies have highlighted the hierarchy of CPS and have investigated the relation between a general CPS ability (operationalized as a composite score of all subscales of the administered microworld) and facets of intelligence (Fig. 3, Model E). However, not surprisingly, the result has been the same: Reasoning ability has shown a strong association with the general CPS factor, ranging from .40 (Abele et al., 2012) to .67 (Kröner et al., 2005). Crucially, there are no studies that have explicitly considered the hierarchy of intelligence in the applied measurement model when addressing the relation between intelligence and CPS. This is especially interesting because meanwhile, the conceptual overlap between intelligence and CPS has also been widely acknowledged in the CPS community (Fischer et al., 2011; Greiff et al., 2012; Wenke, Frensch, & Funke, 2005; Wüstenberg et al., 2012). Wüstenberg et al. (2012), for example, even highlighted similarities in task characteristics between the microworld they used to assess CPS and the Advanced Progressive Matrices (Raven, 1958), which is conceived of as a—if not the—central measure of general intelligence (Carpenter, Just, & Shell, 1990). However, the
amount of conceptual overlap has been debated and is often limited to certain cognitive processes that are tapped by traditional intelligence tests and that serve merely as a basis for complex cognition or the higher order thinking skills measured by microworlds (Fischer et al., 2011; Funke, 2001; Greiff et al., 2012; Wüstenberg et al., 2012). Intuitively, this theoretical position would be best represented by a joint higher order factor model that posits a general cognitive ability factor at the top that influences a lower level of the hierarchy, and this lower level contains facets of intelligence as well as facets of CPS. For the purpose of the present paper, however, we specified two nested-factor models (Fig. 3, Models F and G) that still consider the hierarchy of cognitive abilities but at the same time overcome some shortcomings that are associated with using higher order factor models (cf. Brunner et al., 2012; Gustafsson & Aberg-Bengtsson, 2010). First, if a nested-factor model is applied, the substantive interpretation of the general factor at the top of the cognitive hierarchy is facilitated because the general factor can then explicitly be defined (see also Eid, Lischetzke, Nussbeck, & Trierweiler, 2003). Reasoning ability lies at the center of all important theories about intelligence (Carpenter et al., 1990; Carroll, 1993; Gottfredson, 1997; Gustafsson, 1988; Wilhelm, 2005), and reasoning scales can be viewed as the standard method for operationalizing intelligence. Thus, to define the general factor in Models F and G, we used reasoning tests as indicators of general intelligence. Second, an intrinsic psychometric property of any higher order factor model is the proportionality constraint. This constraint affects the proportion of variance in the manifest indicators explained by the general and specific factors (Schmiedek & Li, 2004). Specifically, the ratio of variance that is attributable to the general factor to the variance that is attributable to the specific CPS-related constructs is held constant across domain-specific subscales by the proportionality constraint. Importantly, the proportionality constraint would cause the parameter estimates of the relations between external criteria and the general and specific factors to be linearly dependent (Chen, West, & Sousa, 2006; Schmiedek & Li, 2004). Using a nested-factor model instead allows for the simultaneous investigation of the general factor's (external) validity and the specific facets' (incremental) validity when studying their relations to external criteria without such linear dependencies. Third, in a higher order factor model, it is not possible to freely estimate all correlations to external criteria from the general factor and all specific factors because the model is not identified in this case. Thus, to obtain reliable model parameters, one of these correlations must be fixed to zero or any other value (cf. Brunner et al., 2012; Schmiedek & Li, 2004). Crucially, when using a nested-factor model, estimates of all correlations that relate general and specific factors to external criteria can be obtained without including such a model constraint. In sum, to accomplish our research goals—particularly the analysis of the external and incremental validity of CPS with respect to indicators of academic achievement—we preferred the nested-factor model over the higher order factor model to set up Models F and G. Model F (Fig.
3F) can be used to investigate the impact of this general intelligence factor while simultaneously conceiving of CPS as a faceted construct. Thus, we were able to gain further insight into the amount of variance of the CPS scales that is not explained by intelligence and whether this
variance can be attributed to the facets of CPS. Moreover, if the relations between these facets were found to mirror the ones in Model B, there would be strong evidence to support a faceted conception of CPS that is independent of intelligence. Model G (Fig. 3G), by contrast, is motivated by a hierarchical conceptualization of intelligence as well as a hierarchical conceptualization of CPS. This model allows for the investigation of whether the variance in CPS scales that is not explained by general intelligence can be attributed to a general CPS ability. Taken together, we conclude that (a) the question concerning the relation between intelligence and CPS has been mainly addressed from a theoretical point of view that conceives of the two constructs as distinct; (b) this one-sided faceted approach may have significantly impacted results (Brunner, 2008; Brunner et al., 2012; Hornung et al., 2011), and crucial information about the type and characteristics of this relation may have been missed; (c) current thinking about a hierarchical structure of cognitive abilities (Carroll, 1993; Deary, 2012; Hunt, 2011; Schulze, 2005) has been neglected; and (d) little is known about the relation between intelligence and CPS in samples of school students as most studies have included only samples of university students or adults. Studies including secondary school students by Kröner et al. (2005) and Sonnleitner et al. (2012) provide notable exceptions; however, the results were based on small and nonrepresentative samples. Consequently, for the following analyses, we advocated an even-handed approach and placed particular emphasis on alternative structures of CPS and intelligence, respectively. To this end, intelligence and CPS were conceptualized as faceted (and represented as first-order factors) or as hierarchical constructs to provide a balanced perspective of the interplay between these constructs. In doing so, four different measurement models (see Fig. 3) were examined in a large and representative sample of secondary school students.

2.2.2. External and incremental validity of CPS over and above reasoning

As stated above, some authors claim that microworlds are capable of measuring complex cognition and higher order thinking skills that are not tapped by traditional intelligence tests (Funke, 2010; Greiff et al., 2012; Wüstenberg et al., 2012). Because today's educational systems are in a transition with an increasing focus on students' higher order problem-solving abilities (Bennett et al., 2003; Kuhn, 2009; Ridgway & McCusker, 2003), microworlds are considered to be attractive candidates for complementing or even replacing traditional intelligence tests in the prediction of educational success—the original purpose of intelligence testing (Binet & Simon, 1905; Deary et al., 2007). Moreover, compared to the negative image of intelligence tests in the educational field (e.g., Adey, Csapó, Demetriou, Hautamäki, & Shayer, 2007), microworlds also enjoy high acceptance among students (Ridgway & McCusker, 2003; Sonnleitner et al., 2012) and provide the opportunity to derive process measures that could be better used for interventions than the mere product measures of static intelligence tests. However, empirical evidence that demonstrates that the performance scores derived from microworlds are better predictors of external criteria than intelligence tests is still
scarce. Most studies have been conducted with (small) adult samples in occupational contexts (Abele et al., 2012; Danner et al., 2011; Kersting, 2001; Wagener & Wittmann, 2002). Whereas Wagener and Wittmann (2002) were able to show that CPS explained additional variance in performance in a job-related in-tray exercise and a case study, other studies have mainly concluded that general cognitive ability or reasoning as measured by traditional intelligence tests remains the better predictor of external criteria with negligible amounts of additional variance explained by performance in microworlds. Two studies investigated the external and incremental validity of CPS in university student samples. For the rule knowledge facet, Greiff et al. (2012) reported substantial correlations with grades from the (German) school leaving examination, and Wüstenberg et al. (2012) even demonstrated incremental validity over and above reasoning for grade point average although it explained less than 6% of additional variance in the criterion. Finally, a study that employed a sample of secondary school students was reported by Sonnleitner et al. (2012). They were able to show that performance indicators of CPS were substantially correlated (at the manifest level) with grades in mathematics (range: r = .35 to .39) and science (range: r = .16 to .30). However, they did not investigate whether this relation would hold when controlling for intelligence. In light of these findings, it is evident that (a) although CPS and its facets show external validity, there is little evidence to support its incremental validity over and above traditional measures of intelligence, and (b) this is particularly true for the educational context for which almost no data are available. Moreover, whereas Abele et al. (2012) investigated the external validity of only the general CPS factor, the other studies more or less focused on single facets of CPS when studying the relation between CPS and external criteria. The two approaches, however, might be differently suited to predict external criteria with different levels of generality. According to the specificity matching principle (e.g., Swann, Chang-Schneider, & MacCarthy, 2007), specific predictor variables (e.g., the rule identification score) should be used to predict specific outcome variables (e.g., mathematics grade), whereas general predictor variables (e.g., general CPS ability) should be matched with general outcome variables (e.g., educational success). In the domain of personality research, for example, it was shown that narrow facets of personality outperform the Big Five (Paunonen & Ashton, 2001) in the prediction of specific behavioral criteria. Consequently, given the fact that different models can be applied to capture the structure of CPS and intelligence, the question about the external validity of CPS again requires an even-handed strategy. To this end, we investigated the external validity of the specific CPS facets (Fig. 4, Model H) and general CPS ability (Fig. 4, Model I) with regard to specific criteria (e.g., grades, see below) and general criteria (e.g., educational success). Both models included the facet reasoning, viewing its relation to external criteria as the benchmark for the external validity of CPS. Likewise, Model J (see Fig. 4) was used to investigate the external validity of specific CPS facets while simultaneously controlling for a higher order general cognitive ability factor. 
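Models J and K address incremental validity with latent factors; as a simplified, manifest-level analogue (not the analysis reported in this paper), one can compare the criterion variance explained by reasoning alone with that explained by reasoning plus CPS scores. All data in the sketch below are simulated.

```python
import numpy as np

def r_squared(X: np.ndarray, y: np.ndarray) -> float:
    """R^2 of an ordinary least-squares regression of y on X (plus intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid.var() / y.var()

rng = np.random.default_rng(1)
n = 563
reasoning = rng.standard_normal(n)
cps = 0.6 * reasoning + 0.8 * rng.standard_normal(n)          # CPS overlaps with reasoning
grade = 0.5 * reasoning + 0.1 * cps + rng.standard_normal(n)  # criterion (e.g., a math grade)

r2_reasoning = r_squared(reasoning[:, None], grade)
r2_both = r_squared(np.column_stack([reasoning, cps]), grade)
print(f"Delta R^2 for CPS beyond reasoning: {r2_both - r2_reasoning:.3f}")
```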
This conceptualization of a nested factor model further allowed us to draw conclusions about the incremental validity of the specific CPS facets because the variance the facets shared with reasoning was bundled in the latent factor of general cognitive ability (see
Eid et al., 2003). Model K, by contrast (Fig. 4), was used to study the incremental validity of general CPS with respect to specific and general external criteria. 3. Method 3.1. Participants Participants were 563 Luxembourgish students in Grades 9 and 11 who were enrolled in two different school tracks (i.e., the intermediate and highest academic tracks). About half of them were enrolled in the ninth grade (n = 300; 146 females; M = 15.6 years, SD = 0.75; 112 were enrolled in the highest academic track), and 263 students were enrolled in the 11th grade (138 females; M = 17.4 years, SD = 0.75; 217 were enrolled in the highest academic track). 3.2. Procedure The participating schools volunteered to participate in the study in order to receive an evaluation of their science curriculum. The results of the study were presented in an aggregated form to each school. The study was conducted with approval from the national Ministry of Education and in accordance with the ethical standards of the host university; the requirements of the national data protection law were followed. Both students and their parents were informed in writing about the scientific background of the study well in advance and were given the opportunity to refuse to participate in the study. Trained research assistants administered the intelligence scales, a questionnaire, and the GL at school during regular class time. The total testing time was 110 min (i.e., two school lessons). To foster commitment, students were offered detailed written feedback on their performance after completion of the study and a prize for the two best students in each participating class. 3.3. Measures 3.3.1. Reasoning Reasoning ability was measured by three subtest scores that assessed students' ability (a) to complete figural matrix patterns (time limit: 10 min; score MA, Figs. 3 and 4), (b) to solve number series (10 min; score NS, Figs. 3 and 4), and (c) to mentally arrange geometric figures (7 min; score GF, Figs. 3 and 4). These subtests were taken from the Intelligence Structure Test IST-2000R (Amthauer, Brocke, Liepmann, & Beauducel, 2001), a reliable and valid measure of intelligence. To facilitate the interpretation of the results, all subtest scores were expressed as the percent of maximum possible score (POMP; see Cohen, Cohen, Aiken, & West, 1999), for which a value of 0 indicates the lowest possible score, and a value of 100 indicates the highest possible score (see Table 1 for reliability estimates and descriptives). 3.3.2. Complex Problem Solving As a measure of Complex Problem Solving, we administered the Genetics Lab (GL; see Fig. 1), a freely available, multilingual, computer-based, and psychometrically sound microworld that is well accepted among students (see http://
[Fig. 4 path diagrams (panels H–K): H) Faceted intelligence – faceted CPS; I) Faceted intelligence – hierarchical CPS; J) Hierarchical intelligence – faceted CPS; K) Hierarchical intelligence – hierarchical CPS; each model additionally relates the latent factors to an external criterion. See caption below.]
Fig. 4. Models representing the external and incremental validity of CPS. Note. CPS = Complex Problem Solving; RI = Rule Identification; RK = Rule Knowledge; RA = Rule Application; MA = Matrices sum score; SF = Selecting figures sum score; NC = Number completion sum score; RI1–RI3: parcel scores of Rule Identification items; RK1–RK3: parcel scores of Rule Knowledge items; RA1–RA3: parcel scores of Rule Application items; RIspecific, RKspecific, RAspecific, CPSspecific = specific variance of RI, RK, RA, and CPS when controlling for the variance shared with reasoning.
www.assessment.lu/GeneticsLab and Sonnleitner et al., 2012, for further information). Students were given standardized online instructions for how to use the GL and worked on practice scenarios. This instructional period lasted 15 min. Afterwards, students completed 12 GL scenarios of varying complexity within an overall time limit of 35 min, which allowed the vast majority of students (86%) to complete all scenarios. Performance across scenarios was summarized by three scores that reflected students' proficiency in the three main facets of Complex Problem Solving (the scoring algorithms can be found in Keller & Sonnleitner, 2012): (a) Each student's exploration strategy was scored on the basis of a detailed log-file in which every interaction with the microworld is stored. Thus, it was possible to derive a process-oriented measure (Rule Identification) indicating how efficiently a student explored a scenario by relating the number of informative exploration steps to the total number of steps applied (Kröner et al., 2005). Note that an exploration step is most informative if students manipulate the genes in a way that any changes in characteristics can be unambiguously
attributed to a certain gene (see Vollmeyer, Burns, & Holyoak, 1996). (b) Students' Rule Knowledge was assessed by scoring their database records (see Fig. 1B) by adapting an established scoring algorithm (see Funke, 1992, 1993). The resulting Rule Knowledge score thus reflects knowledge about how a gene affects a certain characteristic of a creature and knowledge about the strength of such an effect. (c) Finally, the actions that students took to achieve certain target values on the creature's characteristics during the control phase (see Fig. 1C) were used to compute a process-oriented Rule Application score. Only if a step was optimal in the sense that the difference from the target values was maximally decreased was the step considered to indicate good control performance. Given that all target values must be achieved within three steps, a maximum score of three was possible for each scenario. This approach overcomes the limitations of many previous scoring procedures by guaranteeing that the scoring of a certain control step would be completely independent of the preceding control steps. All three performance scores showed satisfactory reliabilities (see Table 1), ranging from Cronbach's α = .79 (Rule Application) to α = .91 (Rule Identification). The scores on each CPS scale were
Table 1
Reliability estimates and descriptive statistics for measures of reasoning and CPS.

Measure                    α     M    SD   Min.  Max.  p25  Mdn  p75
Reasoning
  Matrices                 .52   57   16   0     100   47   60   67
  Number series            .91   71   25   0     100   55   75   90
  Selecting figures        .75   42   19   0     85    30   40   55
Complex Problem Solving
  Rule identification      .91   28   15   1     71    17   26   39
  Rule knowledge           .90   69   17   37    100   55   67   81
  Rule application         .79   54   19   17    100   39   50   67

Note. α = Cronbach's alpha; p25 = first quartile (Q1); p75 = third quartile (Q3).
expressed as POMP scores with a value of 0 indicating the lowest, and a value of 100 indicating the highest possible score. Scale descriptives are given in Table 1. To analyze the structure of CPS, we created parcel scores (i.e., sum scores of subsets of items) for each CPS facet (Rule Identification, Rule Knowledge, and Rule Application) in order to better capture these latent constructs. Compared to individual item scores, parcel scores are less prone to distributional violations and show higher reliability. To further increase the accuracy of parameter estimation, we built item parcels that captured the dimensional structure underlying the items and were thus more homogenous (see Hall, Snell, & Singer Foust, 1999). To this end, each CPS facet was analyzed separately using confirmatory factor analysis, and items that shared a theoretical and empirically supported secondary influence were combined into a parcel. This approach led to three item parcels per facet such that Parcel 1 contained items with only two input variables (Items 1, 2, and 3), Parcel 2 contained items with three input variables (Items 4, 5, 6, and 7), and Parcel 3 contained items with three input variables and variables that changed dynamically (Items 8, 9, 10, 11, and 12). Note that for analyses that conceptualized complex problem solving as a hierarchical construct and that focused on the apex of this hierarchy, we followed the aggregation strategy recommended by Bagozzi and Edwards (1998) to use the sum scores of each subscale as indicators of this general CPS ability. 3.3.3. Students' educational success We used several key indicators of students' educational success. 3.3.3.1. Grades. Students reported subject-specific grades that they received on their last report card in (a) mathematics, (b) science, (c) French, and (d) German. In the Luxembourgish school system, grades range from 0 to 60 with higher grades indicating better achievement. 3.3.3.2. Epreuves standardisée (ÉpStan). A large proportion of the students in Grade 9 (n = 285, i.e., 95%) also participated in the year 2011 cycle of ÉpStan, the Luxembourgish school monitoring program in which data were collected in November 2010 (i.e., 2 months before students completed the GL). For these students, data were available from computerbased competency tests of German reading comprehension, French reading comprehension, and mathematics. Note that these tests were developed on the basis of advice from
substantive and statistical expert panels and the results of extensive pilot studies.

3.3.3.3. PISA. Eighty-seven students (38 females; 70 were enrolled in the highest academic track, i.e., 81%) in Grade 11 also participated in the year 2009 cycle of PISA; for these students, data were available from standardized paper-and-pencil tests of German reading, mathematics, and science competency. Again, these tests were developed on the basis of advice from substantive and statistical expert panels and the results of extensive pilot studies. Although the number of students with available data from the PISA tests seems relatively small, this subsample of 11th graders is fairly representative of the total sample of 11th graders. Only minor differences were found in terms of the distributions across academic tracks, reasoning ability (Cohen's d < 0.11 on all scales), complex problem-solving ability (d < 0.12 on all scales), or grades (d < 0.23 for math, French, and science). Only with regard to German grades was a significant difference found (d = 0.32), with students in the subsample receiving worse grades. On average, the subsample for which PISA data were available was 4 months older (M = 17.6 years, SD = 0.27; d = 0.54) than the total sample. Because, for the vast majority of students for whom PISA data were available, almost no differences from other students could be found in terms of school track, cognitive ability, or school achievement, we did not expect any biases in our analyses that included those data. However, results concerning reading competency (PISA) were interpreted with greater caution.

3.4. Statistical analyses

We embedded the statistical analyses in a structural equation modeling environment (a) to capitalize on recent psychometric advances in confirmatory factor analysis (e.g., Eid et al., 2003) and (b) to efficiently handle the observed missing data patterns on reasoning measures and on achievement tests. Model parameters were computed with Mplus 5.2 (Muthén & Muthén, 1998–2010) by means of the maximum likelihood estimator with robust standard errors (MLR). Analyses concerning external validity were based on the full sample, using the full information maximum likelihood (FIML) estimation method to adjust for missing data in order to ensure high statistical power for the detection of even small effects. Note that results did not differ when only the subsamples for which data from PISA or the ÉpStan were available were used for analysis. All reported coefficients are based on standardized solutions. To evaluate model fit, we consulted several descriptive measures that are recommended in the literature. As the χ² goodness-of-fit statistic is known to be highly sensitive to sample size (e.g., Hu & Bentler, 1995), however, we emphasized the Standardized Root Mean Square Residual (SRMR), the Comparative Fit Index (CFI), and gamma, which is based on the more popular Root Mean Square Error of Approximation (RMSEA). Compared to the RMSEA, gamma has the advantage of providing a more realistic test of model fit when the number of manifest variables is small (Fan & Sivo, 2007). SRMR values below .08, CFI values above .95, and gamma values above .95 are generally considered to indicate good model fit (Hu & Bentler, 1998). In addition to the global fit indices, we checked
each model for local misspecifications. Residual correlations above .10 were considered to be problematic (cf. McDonald, 2010).
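For readers who wish to recompute such descriptive fit indices from reported χ² values, the sketch below follows common textbook formulas; the "gamma hat" expression is one widely used approximation and may differ in detail from the exact variant used by the authors. The null-model χ² in the example is hypothetical.

```python
import math

def fit_indices(chi2: float, df: int, chi2_null: float, df_null: int,
                n: int, p: int) -> dict:
    """Common descriptive fit indices computed from chi-square statistics.

    chi2, df           : target model
    chi2_null, df_null : independence (null) model
    n                  : sample size
    p                  : number of manifest variables
    """
    cfi = 1.0 - max(chi2 - df, 0.0) / max(chi2_null - df_null, chi2 - df, 1e-12)
    rmsea = math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))
    gamma_hat = p / (p + 2.0 * (chi2 - df) / (n - 1))
    return {"CFI": round(cfi, 3), "RMSEA": round(rmsea, 3), "gamma": round(gamma_hat, 3)}

# Illustration with the values reported for Model B (chi2 = 118, df = 21, N = 563,
# p = 9 parcels); the null-model chi-square used here is hypothetical.
print(fit_indices(chi2=118, df=21, chi2_null=2500, df_null=36, n=563, p=9))
```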
4. Results

4.1. The structure of Complex Problem Solving

To investigate the structure of Complex Problem Solving, we applied CFA to analyze several theoretically derived models. A general factor model that represented CPS as a single construct (Fig. 2, Model A) was not supported by our data (see Table 2 for fit indices). Rather, the evaluation of model fit showed that CPS could be conceived as a multidimensional construct (Fig. 2, Model B). Notably, Rule Identification (RI), Rule Knowledge (RK), and Rule Application (RA) were found to be substantially intercorrelated. Moreover, we found that those parcel scores that were based on scenarios in which students had to detect dynamics (i.e., changes in some variables over time without manipulation; RI3, RK3, RA3) shared variance over and above the latent target constructs. Although the correlation between RK and RA was very high (r = .94), further analyses following the restriction strategy recommended by Van der Sluis, Dolan, and Stoel (2005) showed that this correlation was statistically different from r = 1.0 (χ²diff = 8.61 with df = 2). Thus, these results suggest that individual differences in RI, RK, and RA represent distinct facets of students' CPS behavior. Importantly, these facets shared a substantial amount of variance (as indicated by the strong intercorrelations), which was targeted in the higher order factor model (Fig. 2, Model C). Specifically, this model supported the idea that the substantial correlations between the more specific facets of CPS are indicative of the operation of a higher order construct that may be interpreted as students' general ability to address complex problems. Note that the relation between rule knowledge and the general CPS construct was restricted to 1 in order to achieve full convergence for the model. We will elaborate on this more fully in the Discussion section. To conclude, results obtained from structural equation models showed that complex problem solving may be structurally conceptualized either as a construct with three interrelated specific components (reflecting rule identification, rule knowledge, and rule application) or as a hierarchical construct for which (general) complex problem solving is at the apex and the more specific components of complex problem solving are located at the next level down in the hierarchy. Our results supported both conceptualizations, and hence the choice of these models should be guided by the research question with regard to whether the specific components or the global aspect of CPS are the major focus of the research.

4.2. The relation between intelligence and CPS

In line with previous findings, our results showed that the reasoning facet was strongly related to students' CPS. Model D (Fig. 3D), conceiving of both intelligence and CPS as faceted constructs, showed acceptable fit (see Table 2 for fit indices). Reasoning was related to specific components of CPS with correlations ranging from r = .38 between reasoning and rule identification to r = .59 between reasoning and rule application. In Model E, CPS was viewed from a hierarchical perspective. Model E (Fig. 3E) fit the data well and showed a substantial relation between CPS and reasoning ability (r = .62). Yet, despite this strong association, these correlations were clearly different from r = 1.0, suggesting that reasoning ability and CPS may represent distinct cognitive abilities. To account for the hierarchy of intelligence and to compare the amount of variance in CPS scales that is explained by a general cognitive ability factor to the amount that is attributable to specific CPS factors, we investigated two nested factor models conceiving of CPS as a faceted construct (Model F) or as a hierarchical construct (Model G). Both models showed satisfactory fit in terms of our criteria (see Table 2), and no local misspecifications were detected. Importantly, Model F demonstrates that (a) the specific CPS facets have an incremental impact on the corresponding microworld scales over and above g, and (b) the relation between the facets of CPS is only slightly influenced by the general cognitive ability factor. Although the correlations between the specific CPS facets in Model F are somewhat lower than those in Model B (Fig. 2B), their pattern is essentially the same, showing the strongest relation between RKspecific and RAspecific (r = .88). This finding shows that the faceted structure of CPS is independent of general cognitive ability and supports the previous finding that rule knowledge and rule application are two separate facets. The high intercorrelations between the facets, however, again point to a common factor of CPS that was investigated in Model G. Results suggest that the common variance in the CPS scales was not fully explained by general cognitive ability, but rather that a substantial proportion of the common variance in CPS scales can be accounted for by a specific CPS factor. Note that although the standardized factor loadings of the CPS scales on the general cognitive ability factor
Table 2
Fit statistics of different measurement models representing the structure of CPS and its relation to reasoning.

Model                                                                   χ2    df   p       CFI   SRMR   gamma

Models representing the structure of Complex Problem Solving
A. General factor model of CPS                                          404   24   <.001   .84   .07    .87
B. Faceted CPS model                                                    118   21   <.001   .96   .04    .96
C. Hierarchical CPS model                                               117   22   <.001   .96   .04    .96

Models representing the relations between CPS and Reasoning
D. Faceted reasoning—faceted CPS model                                  168   45   <.001   .95   .04    .97
E. Faceted reasoning—hierarchical CPS model                              13    8   .09     .99   .02    .99
F. Hierarchical intelligence—faceted CPS model (nested factor)          142   39   <.001   .96   .03    .97
G. Hierarchical intelligence—hierarchical CPS model (nested factor)      13    6   .04     .99   .02    .99

Note. χ2 = chi-square goodness-of-fit statistic; df = degrees of freedom; CFI = comparative fit index; SRMR = standardized root mean square residual.
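The fit indices in Table 2 can be recomputed from the chi-square statistics if the baseline values are known. The helper below is a small self-contained sketch: the CFI formula is standard, the gamma column is assumed here to be the gamma-hat index, and the baseline (null-model) figures in the example call are invented for illustration.

```python
# Rough sketch: comparative fit index (CFI) and gamma hat from chi-square
# statistics. The baseline (null-model) values below are hypothetical; they
# are NOT taken from the study.

def cfi(chi2_model: float, df_model: float,
        chi2_null: float, df_null: float) -> float:
    """CFI = 1 - max(chi2_m - df_m, 0) / max(chi2_0 - df_0, chi2_m - df_m, 0)."""
    d_model = max(chi2_model - df_model, 0.0)
    d_null = max(chi2_null - df_null, d_model, 0.0)
    return 1.0 if d_null == 0 else 1.0 - d_model / d_null

def gamma_hat(chi2_model: float, df_model: float, n: int, p: int) -> float:
    """Gamma hat = p / (p + 2 * (chi2 - df) / n), with p observed variables."""
    return p / (p + 2.0 * max(chi2_model - df_model, 0.0) / n)

# Example with Model B's chi-square (118, df = 21); the null-model values
# (2500, df = 36) and p = 9 parcels are illustrative assumptions.
print(cfi(118, 21, 2500, 36))
print(gamma_hat(118, 21, n=563, p=9))
```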
Note that although the standardized factor loadings of the CPS scales on the general cognitive ability factor were somewhat smaller (range: λ = .37 to λ = .56) than those obtained for the loadings on the specific CPS factor (range: λ = .47 to λ = .73), there was an overlap of the related confidence intervals at α = .05. This indicates that general cognitive ability probably influences performance on the CPS scales to about the same extent as specific CPS ability does.

4.3. External and incremental validity of CPS

As discussed in Section 3.2, our data supported a faceted conceptualization of CPS as well as a hierarchical conceptualization with a general CPS factor at the top. In line with the specificity matching principle, both models were used to study the external validity of CPS with regard to specific and general criteria (see Fig. 4, Models H and I). The relation between reasoning and the external criteria in these models served as an empirical benchmark against which the external validity of CPS could be evaluated. Our results showed that the specific facets of CPS (Table 3, Model H: faceted–faceted) and general CPS (Table 3, Model I: faceted–hierarchical) were substantially related to all indicators of academic success. Notably, performances on the ÉpStan and PISA achievement tests were assessed 2 months and almost 2 years before our study, respectively. Hence, these results empirically underscore the idea that individual differences in CPS possess postdictive validity as well as a considerable level of temporal stability. Importantly, further analyses showed that the associations between CPS and measures of educational success were largely attributable to the variance that CPS shares with reasoning. Whether CPS was conceptualized as a faceted or as a hierarchical construct (see Fig. 4, Models J and K), when a higher order
general cognitive ability factor was controlled for, correlation coefficients between related CPS factors and indicators of educational success dropped considerably (Table 3, Model J: hierarchical–faceted and Model K: hierarchical–hierarchical). Importantly, the pattern of results also points to the operation of a method effect attributable to the mode of test administration. When the indicators of academic success were paper-and-pencil based (the PISA tests and school grades, which are largely based on written examinations), the correlations between CPS and educational success approached zero. However, when the indicators of educational success were also computer-based (i.e., ÉpStan test scores), CPS demonstrated incremental validity (i.e., the CPS scores provided additional information over and above traditional intelligence tests in explaining these criteria).

5. Discussion

Complex Problem Solving is a relatively new construct, typically measured by computer-based microworlds; it is considered to be an attractive candidate to complement or even replace traditional intelligence tests in the prediction of educational success. Among other reasons, CPS may provide a superior alternative because it is supposed to measure complex cognition or higher order thinking skills that are not tapped by intelligence tests but are crucial for succeeding in today's educational curricula. The aim of the present article was to significantly extend knowledge about CPS by investigating its (a) psychometric structure, (b) relation to reasoning, (c) external validity, and (d) incremental validity with respect to educational success.
Table 3
External validity of reasoning and CPS via correlations to external criteria.

                       Model H: faceted–faceted    Model J: hierarchical–faceted        Model I: faceted–hierarchical   Model K: hierarchical–hierarchical
Criterion              Rea    RI    RK    RA       g     RIspec.  RKspec.  RAspec.      Rea    CPS                      g      CPSspec.

Mathematics
  Grades               .33    .16   .22   .30      .33   .04      .04      .11          .33    .25                      .34    .05
  ÉpStan               .64    .30   .52   .59      .66   .08      .17      .24          .65    .58                      .66    .22
  PISA                 .52    .30   .31   .40      .53   .11      .01      .08          .55    .40                      .56    .06

French
  Grades               .12    .09   .07   .03      .10   .06      .02      −.05         .12    .05                      .12    −.03
  ÉpStan               .16    .32   .35   .37      .18   .27      .29      .32          .15    .39                      .15    .37

German/reading
  Grades               .27    .12   .26   .25      .28   .02      .11      .09          .27    .27                      .27    .12
  ÉpStan               .48    .22   .50   .52      .53   .02      .23      .24          .50    .56                      .50    .31
  PISA                 .37    .23   .30   .25      .38   .08      .09      .02          .37    .28                      .39    .05

Science
  Grades               .42    .17   .30   .33      .43   .01      .05      .08          .42    .31                      .42    .06
  PISA                 .54    .31   .39   .41      .54   .11      .10      .08          .55    .43                      .55    .11

Educational success
  Grades               .50    .22   .38   .40      .51   .03      .09      .10          .50    .39                      .51    .10
  ÉpStan               .74    .38   .68   .75      .77   .11      .28      .35          .74    .76                      .75    .38
  PISA                 .55    .31   .40   .42      .56   .10      .09      .09          .56    .43                      .55    .13

Note. Correlations are based on the full sample (N = 563) using the FIML estimation procedure to account for missing data; correlations in bold are statistically significant (at p < .05; two-tailed testing), correlations in normal print are not statistically significant; Rea = Reasoning; CPS = Complex Problem Solving; RI = Rule Identification; RK = Rule Knowledge; RA = Rule Application; RIspec., RKspec., RAspec., CPSspec. = specific variance of RI, RK, RA, and CPS when controlling for the variance shared with reasoning.
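The drop from the Model H columns to the Model J columns in Table 3 reflects what remains of the CPS–criterion associations once the variance shared with reasoning is removed within the nested-factor model. As a rough observed-score analogue (not the latent-variable analysis actually reported), one can residualize a CPS score on a reasoning score and correlate the residual with a criterion; the variable names below are hypothetical and the data are simulated.

```python
# Observed-score analogue of "controlling for reasoning": a partial
# correlation computed via residualization with numpy. This only sketches the
# logic behind the Model J/K columns; the reported coefficients come from
# nested-factor structural equation models, not from this procedure.
import numpy as np

def partial_corr(cps: np.ndarray, criterion: np.ndarray,
                 reasoning: np.ndarray) -> float:
    """Correlation between cps and criterion after removing reasoning."""
    X = np.column_stack([np.ones_like(reasoning), reasoning])
    # residuals of cps and criterion after regressing each on reasoning
    res_cps = cps - X @ np.linalg.lstsq(X, cps, rcond=None)[0]
    res_crit = criterion - X @ np.linalg.lstsq(X, criterion, rcond=None)[0]
    return float(np.corrcoef(res_cps, res_crit)[0, 1])

# Toy data with the expected pattern: the CPS-criterion association shrinks
# once the reasoning component is partialled out.
rng = np.random.default_rng(0)
reasoning = rng.normal(size=500)
cps = 0.7 * reasoning + rng.normal(scale=0.7, size=500)
grade = 0.6 * reasoning + rng.normal(scale=0.8, size=500)
print(np.corrcoef(cps, grade)[0, 1])        # sizeable zero-order correlation
print(partial_corr(cps, grade, reasoning))  # close to zero
```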
This study is among the first to examine these questions in a large and representative sample of secondary school students and by explicitly juxtaposing alternative structural conceptualizations of cognitive abilities. After discussing the results, we will close with a preliminary conclusion concerning the usefulness of CPS in the educational context.
5.1. The structure of CPS

Previous studies on the latent structure of CPS have been inconclusive with regard to the number of facets or the existence of a general (higher order) CPS factor that influences these specific facets. Drawing on a modern assessment instrument of CPS (i.e., the GL; Sonnleitner et al., 2012), we were able to show that CPS can be conceptualized either as a faceted construct with three distinct facets, namely rule identification, rule knowledge, and rule application, or as a hierarchically structured construct that includes a general CPS factor at the top of the hierarchy. Our data did not favor one of these measurement models over the other.

In contrast to the study reported by Wüstenberg et al. (2012), although there were high intercorrelations between the three facets of CPS, the rule identification facet was empirically distinguishable from rule knowledge, leading to the conclusion that the three facets are distinct, as was also reported by Greiff et al. (2012). This finding, however, is especially interesting because the formal characteristics of the GL (design, number of variables, and input format) are more similar to the microworld used by Wüstenberg et al. (2012), whereas the content characteristics (number of items, including dynamically changing variables) are more similar to the microworld used by Greiff et al. (2012). Crucially, this indicates that it is the presence of dynamics within a microworld, which affords a different exploration strategy (in the case of the GL, switching all genes to off), that allows rule identification to emerge as a distinct facet of CPS. Note that the importance of this characteristic was also underscored by our finding that the item parcels of all three subscales that contained items with dynamic variables shared variance over and above the modeled facets. Thus, we strongly recommend including dynamics when assessing CPS, as this leads to a more fine-grained measurement of students' problem-solving abilities.

As noted above, the high intercorrelations between the facets could be explained by a (higher order) common factor indicating a general CPS ability. Interestingly, this general ability was found to be perfectly related to the facet of rule knowledge, suggesting that building up knowledge about the problem is central to CPS: It represents both the outcome of students' exploration processes and the building block for the successful application of this knowledge. This central role might provide an explanation for why only rule knowledge has been reported to show significant external (Greiff et al., 2012) and incremental validity (Wüstenberg et al., 2012) in some studies. To conclude, it is possible to reliably assess three distinct facets of CPS with secondary school students, but an accumulation of knowledge about the problem seems to be the best indicator of students' general ability to successfully address complex problems. As both conceptualizations are empirically justified, further research is needed to clarify whether a faceted or a hierarchical conception of CPS is theoretically more fruitful. This could be done by a more thorough investigation of each facet that also considers the dynamic nature of the tasks.
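The role of dynamics can be made concrete with the linear-structural-equation formalism commonly used for microworlds of this type (cf. Funke, 1993); the equation below is a generic sketch rather than the actual system underlying the GL.

```latex
% x_t: state of the system (output) variables, u_t: settings chosen by the student
x_{t+1} \;=\; A\,x_t \;+\; B\,u_t .
```

Dynamics in the sense used here means that the matrix A differs from the identity matrix, so some outputs change even when nothing is manipulated; setting all inputs to zero (in the GL, switching all genes off) reduces the system to x_{t+1} = A x_t and thus exposes these autonomous changes, which is why detecting them calls for a distinct exploration strategy.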
5.2. The relation between intelligence and CPS

We were able to largely replicate previous results that showed a substantial relation between the facet of reasoning and facets of CPS (Fig. 3, Model D) or general CPS (Fig. 3, Model E). Thus, we were able to show that these relations also hold in secondary school students and that reasoning ability plays a crucial role in the process of solving complex problems. However, the relations are far from perfect, supporting the claim of many authors that CPS is distinct from reasoning but may tap the same (basic) cognitive processes (Funke, 2010; Greiff et al., 2012; Wenke et al., 2005; Wüstenberg et al., 2012).

To further investigate the characteristics of this relation, we explicitly considered the hierarchical conception of cognitive abilities. The results of two nested factor models (Fig. 3, Models F and G) suggest that the strong relation might be caused by a common or general ability factor that is situated at the top of this hierarchy—an idea that is strictly in line with current conceptualizations of intelligence (Carroll, 1993; Johnson & Bouchard, 2005; McGrew, 2009). Importantly, when we controlled for this general ability factor, we were able to show that a faceted structure of CPS (Fig. 3, Model F) as well as a general CPS factor (Fig. 3, Model G) could still be established. These findings further advocate the notion of CPS as a distinct construct but also suggest that it should be integrated into a hierarchical conceptualization of intelligence because CPS is strongly influenced by a general cognitive ability factor. However, at which stratum of cognitive abilities CPS or its facets should be located cannot be answered with this study and needs further investigation (see also the limitations below). According to Funke (2010), Greiff et al. (2012), Wenke et al. (2005), and Wüstenberg et al. (2012), however, it is justifiable to interpret the variance that is shared between CPS performance scales and reasoning scales as the (basic) cognitive skills that are tapped by both measurement instruments. Consequently, the remaining specific facets (as depicted in Models F and G) then mainly represent complex cognition or higher order thinking skills. Crucially, as the general ability factor's influence on the CPS performance scales was about the same size as that of the related specific CPS facet, problem solving can be thought of as consisting of roughly equal parts of "basic" and higher order thinking skills.
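This "equal parts" reading can be expressed as a variance decomposition of a CPS scale under the nested-factor model; the symbols below are generic, and the approximate equality in the last term only indicates the pattern reported above (standardized loadings of similar size on both factors), not estimated values.

```latex
% Standardized variance decomposition of CPS scale i under the nested-factor model
\operatorname{Var}(y_i) \;=\;
\underbrace{\lambda_{g,i}^{2}}_{\text{general ability}}
\;+\; \underbrace{\lambda_{s,i}^{2}}_{\text{specific CPS}}
\;+\; \underbrace{\theta_i}_{\text{residual}} \;=\; 1 ,
\qquad \lambda_{g,i}^{2} \approx \lambda_{s,i}^{2} .
```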
5.3. External and incremental validity of CPS

Empirical evidence that supports the construct of CPS as a valuable predictor of educational success is scarce but is needed to justify the use of microworlds in the educational field and as a viable alternative to traditional intelligence tests. Consequently, this study explored the external and incremental validity of CPS with regard to various criteria of educational success in secondary school students. Our results showed that, irrespective of the specificity of the external criterion, all three facets of CPS as well as general CPS ability showed impressive external validity concerning the domains of mathematics, reading comprehension, and science, of a size comparable to that of reasoning, which is perceived as one of the best predictors of educational success. Note that the external criteria consisted of self-reported (grades) as well as objective indicators (PISA, ÉpStan) of educational success that were assessed at the same time or even 2 months or 2 years before the CPS scores were obtained.

However, when we controlled for the variance that the CPS scales share with reasoning, nearly all correlations between the remaining specific CPS facets and the external criteria dropped to zero. In particular, the lack of incremental validity for mathematics is surprising, as probably most teachers (and researchers in the field of education and psychology) believe that mathematical problem solving includes higher order thinking processes. Only for the (computer-based) ÉpStan scores was additional variance explained over and above the general ability factor, suggesting that the variance that is specific to CPS after controlling for the variance it shares with reasoning may not be interpreted as due to higher order thinking skills as suggested in the previous literature, but merely as an effect that is specific to the mode of test administration (i.e., computer-based). This result, however, may also be explained by an unsuitable choice of the external criteria that (specific) CPS should be able to predict. As computer and ICT skills become increasingly important in a digital and globalized world, CPS may possess incremental validity for more computer-related competencies (e.g., programming). In addition, educational systems are currently in transition (Bennett et al., 2003; Ridgway & McCusker, 2003) and have just begun to incorporate higher order thinking skills and problem solving into their curricula (Kuhn, 2009; Leutner et al., 2012; Wirth & Klieme, 2003). In sum, it might simply be too early to make a final judgment about the external validity of CPS. However, given the impressive correlations that were found between the general ability factor and measures of two established large-scale school monitoring programs (PISA and ÉpStan), the assessment of general intelligence will probably remain essential for future educational systems as well.

5.4. Limitations and outlook

Investigating the construct of CPS within the full representation of the hierarchy of intelligence would undoubtedly have been advantageous. However, due to the practical limitations of testing at schools (testing time was limited to two school lessons), we decided to focus only on reasoning, which is often described as being situated at the center of theories of intelligence (Carroll, 1993; Gottfredson, 1997; Wilhelm, 2005). Moreover, our results show that even this narrow facet of intelligence is capable of explaining the impressive correlations found between CPS and external criteria. An even broader or more general factor of intelligence derived from a fully administered intelligence test battery might not have led to different conclusions. Nevertheless, such a more complete representation of intelligence would have given a better indication of how and where (i.e., at which strata) to include CPS in the hierarchy of cognitive abilities. In this regard, additional research is clearly needed.

In the present study, we accounted for the hierarchical conceptualization of cognitive abilities by applying a nested-factor model. Compared to the application of a higher order factor model, this approach provided several benefits such as (a) a clear and unambiguous substantive interpretation of the general factor, (b) an unrestricted estimation of the factor
variances and relations between factors and external criteria, and (c) the opportunity to estimate relations between external criteria and the general factor as well as all specific CPS facets at the same time. These characteristics of a nested-factor model were especially advantageous for investigating the external and incremental validity of CPS, a main goal of our study. A comparison of the nested-factor model and the higher order factor model in this context might, however, be interesting for future studies in this field. Especially when investigating at which stratum CPS should be included in hierarchical models of intelligence, a higher order factor model would clearly be preferable. For a thorough discussion of the psychometric properties, limitations, and benefits of both measurement models, see Brunner et al. (2012).

To shed further light on CPS as a latent construct, and especially on how the specific facets of CPS can be interpreted, the integration of two rather independent research strands would be beneficial. Thus, as elaborated in the excellent review on problem solving by Reed (in press), a promising avenue for future research would be to study how research on CPS is related to the large body of research on (simple) problem solving.

On the basis of our results, we conclude that although there is strong evidence that supports the idea that CPS is a construct that is strongly associated with but still distinct from intelligence, the added value of this construct does not lie in its incremental validity when compared to traditional intelligence tests. Maybe there will be external criteria for which microworlds explain additional variance, but the currently existing and well-researched indicators of educational success used in our study were sufficiently predicted by old-fashioned paper-and-pencil intelligence tests with static items. However, the term "old-fashioned" indicates a potential benefit of microworlds because they have been shown to be well accepted among today's tech-savvy students (Ridgway & McCusker, 2003; Sonnleitner et al., 2012). Still, it remains to be demonstrated that intelligence tests are therefore "outdated." Another advantage of microworlds lies in the performance scores they provide. Although they might not measure something different from reasoning scales, they measure it differently. As Wüstenberg et al. (2012) already pointed out, the underlying processes of microworlds and Raven's APM (Raven, 1958) are largely the same, but with microworlds, it is possible to capture problem-solving processes directly. In this, we see the major advantage of computer-based microworlds, because they will enable the training and evaluation of specific problem-solving skills. But because such training can be conducted only in standardized settings and because intelligence demonstrates plasticity only at a young age, future research will have to focus on the investigation of CPS in the educational context. We believe that this study provides a good starting point.
Acknowledgments

This work was funded by the National Research Fund Luxembourg (FNR/C08/LM/06). We thank all the students and teachers for participating in this study, Ricky François and Markus Scherer for developing important parts of the GL software code, Ingo Schandeler for drawing the creatures, and Jane Zagorski for her editorial support.
References

Abele, S., Greiff, S., Gschwendtner, T., Wüstenberg, S., Nickolaus, R., Nitzschke, A., et al. (2012). Dynamische Problemlösekompetenz [Dynamic problem solving competence]. Zeitschrift für Erziehungswissenschaft, 15, 363–391. Adey, P., Csapó, B., Demetriou, A., Hautamäki, J., & Shayer, M. (2007). Can we be intelligent about intelligence? Why education needs the concept of plastic general ability. Educational Research Review, 2, 75–97. Amthauer, R., Brocke, B., Liepmann, D., & Beauducel, A. (2001). Intelligenz-Struktur-Test 2000 R [Intelligence Structure Test 2000 R]. Göttingen, Germany: Hogrefe. Bagozzi, R. P., & Edwards, J. R. (1998). A general approach for representing constructs in organizational research. Organizational Research Methods, 1, 45–87. Bennett, R. E., Jenkins, F., Persky, H., & Weiss, A. (2003). Assessing complex problem solving performances. Assessment in Education: Principles, Policy & Practice, 10, 347–359. Binet, A., & Simon, T. (1905). Méthodes nouvelles pour le diagnostic du niveau intellectuel des anormaux [New methods for diagnosing the intellectual level of abnormals]. L'Année Psychologique, 11, 191–336. Bollen, K., & Lennox, R. (1991). Conventional wisdom on measurement: A structural equation perspective. Psychological Bulletin, 110, 305–314. Borsboom, D., Mellenbergh, G. J., & Van Heerden, J. (2003). The theoretical status of latent variables. Psychological Review, 110, 203. Brunner, M. (2008). No g in education? Learning and Individual Differences, 18, 152–165. Brunner, M., Nagy, G., & Wilhelm, O. (2012). A tutorial on hierarchically structured constructs. Journal of Personality, 80, 796–846. Buchner, A. (1995). Basic topics and approaches to the study of complex problem solving. In P. A. Frensch, & J. Funke (Eds.), Complex problem solving: The European perspective (pp. 27–63). Hillsdale, NJ: Erlbaum. Carpenter, P. A., Just, M. A., & Shell, P. (1990). What one intelligence test measures: A theoretical account of the processing in the Raven progressive matrices test. Psychological Review, 97, 404–431. Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. New York: Cambridge University Press. Chen, F. F., West, S. G., & Sousa, K. H. (2006). A comparison of bifactor and second-order models of quality of life. Multivariate Behavioral Research, 41, 189–225. Cohen, P., Cohen, J., Aiken, L. S., & West, S. G. (1999). The problem of units and the circumstance for POMP. Multivariate Behavioral Research, 34, 315–346. Danner, D., Hagemann, D., Schankin, A., Hager, M., & Funke, J. (2011). Beyond IQ: A latent state-trait analysis of general intelligence, dynamic decision making, and implicit learning. Intelligence, 39, 323–334. Deary, I. J. (2012). Intelligence. Annual Review of Psychology, 63, 453–482. Deary, I. J., Strand, S., Smith, P., & Fernandes, C. (2007). Intelligence and educational achievement. Intelligence, 35, 13–21. Dörner, D. (1986). Diagnostik der operativen Intelligenz [Assessment of operative intelligence]. Diagnostica, 32, 290–308. Edwards, J. R., & Bagozzi, R. P. (2000). On the nature and direction of relationships between constructs and measures. Psychological Methods, 5, 155–174. Eid, M., Lischetzke, T., Nussbeck, F. W., & Trierweiler, L. I. (2003). Separating trait effects from trait-specific method effects in multitrait-multimethod models: A multiple-indicator CT-C (M-1) model. Psychological Methods, 8, 38–60. Fan, X., & Sivo, S. A. (2007).
Sensitivity of Fit Indices to Model Misspecification and Model Types. Multivariate Behavioral Research, 42, 509–529. Fischer, A., Greiff, S., & Funke, J. (2011). The process of solving complex problems. The Journal of Problem Solving, 4, 19–42. Frensch, P. A., & Funke, J. (Eds.). (1995). Complex problem solving: The European perspective. Hillsdale, NJ: Erlbaum. Funke, J. (1992). Dealing with dynamic systems: Research strategy, diagnostic approach and experimental results. German Journal of Psychology, 16, 24–43. Funke, J. (1993). Microworlds based on linear equation systems: A new approach to complex problem solving and experimental results. In G. Strube, & K. F. Wender (Eds.), The cognitive psychology of knowledge (pp. 313–330). Amsterdam, The Netherlands: Elsevier Science Publishers. Funke, J. (2001). Dynamic systems as tools for analysing human judgement. Thinking and Reasoning, 7, 69–89. Funke, J. (2003). Problemlösendes Denken [Problem solving thinking]. Stuttgart, Germany: Kohlhammer. Funke, J. (2010). Complex problem solving: A case for complex cognition? Cognitive Processing, 11, 133–142. Gonzalez, C., Thomas, R. P., & Vanyukov, P. (2005). The relationships between cognitive ability and dynamic decision making. Intelligence, 33, 169–186. Gottfredson, L. (1997). Mainstream science on intelligence (editorial). Intelligence, 24, 13–23. Greiff, S. (2012). Individualdiagnostik der Problemlösefähigkeit [Assessment of problem solving]. Münster, Germany: Waxmann.
Greiff, S., Wüstenberg, S., & Funke, J. (2012). Dynamic problem solving: A new assessment perspective. Applied Psychological Measurement, 36, 189–213. Gustafsson, J. E. (1988). Hierarchical models of individual differences in cognitive abilities. In R. J. Sternberg (Ed.), Advances in the psychology of human intelligence, Vol. 4 (pp. 35–72). Hillsdale, NJ: Erlbaum. Gustafsson, J. E., & Aberg-Bengtsson, L. (2010). Unidimensionality and interpretability of psychological instruments. In S. E. Embretson (Ed.), Measuring psychological constructs (pp. 97–121). Washington, DC: American Psychological Association. Hall, R. J., Snell, A. F., & Singer Foust, M. (1999). Item parceling strategies in SEM: Investigating the subtle effects of unmodeled secondary constructs. Organizational Research Methods, 2, 233–256. Horn, J. L., & Noll, J. (1997). Human cognitive capabilities: Gf–Gc theory. In D. P. Flanagan, J. L. Genshaft, & P. L. Harrison (Eds.), Contemporary intellectual assessment: Theories, tests, and issues (pp. 53–91). New York: The Guilford Press. Hornung, C., Brunner, M., Reuter, R. A. P., & Martin, R. (2011). Children's working memory: Its structure and relationship to fluid intelligence. Intelligence, 39, 210–221. Hu, L., & Bentler, P. M. (1995). Evaluating model fit. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 76–99). Thousand Oaks, CA: Sage. Hu, L., & Bentler, P. M. (1998). Fit indices in covariance structure modeling: Sensitivity to underparameterized model misspecification. Psychological Methods, 3, 424–453. Hunt, E. (2011). Human intelligence. New York: Cambridge University Press. Johnson, W., & Bouchard, T. J. (2005). The structure of human intelligence: It is verbal, perceptual, and image rotation (VPR), not fluid and crystallized. Intelligence, 33, 393–416. Johnson, W., Bouchard, T. J., Krueger, R. F., McGue, M., & Gottesman, I. I. (2004). Just one g: Consistent results from three test batteries. Intelligence, 32, 95–107. Kaufman, S. B., Reynolds, M. R., Liu, X., Kaufman, A. S., & McGrew, K. S. (2012). Are cognitive g and academic achievement g one and the same g? An exploration on the Woodcock–Johnson and Kaufman tests. Intelligence, 40, 123–138. Keller, U., & Sonnleitner, P. (2012). Genetics Lab scoring algorithm. Luxembourg: University of Luxembourg. Kersting, M. (2001). Zur Konstrukt- und Kriteriumsvalidität von Problemlöseszenarien anhand der Vorhersage von Vorgesetztenurteilen über die berufliche Bewährung [On the construct and criterion validity of problem solving scenarios on the basis of the prediction of supervisor ratings of job performance]. Diagnostica, 2, 67–76. Kluge, A. (2008). Performance assessments with microworlds and their difficulty. Applied Psychological Measurement, 32, 156–180. Kröner, S., Plass, J. L., & Leutner, D. (2005). Intelligence assessment with computer simulations. Intelligence, 33, 347–368. Kuhn, D. (2009). Do students need to be taught how to reason? Educational Research Review, 4, 1–6. Leutner, D., Fleischer, J., Wirth, J., Greiff, S., & Funke, J. (2012). Analytische und dynamische Problemlösekompetenz im Lichte internationaler Schulleistungsvergleichsstudien [Analytic and dynamic problem solving competence in international student assessments]. Psychologische Rundschau, 63, 34–42. Mayer, R. E. (2000). Intelligence and education. In R. J. Sternberg (Ed.), Handbook of intelligence (pp. 519–533). Cambridge, UK: Cambridge University Press. Mayer, R. E., & Wittrock, M. C. (1996). Problem-solving transfer.
In D. C. Berliner, & R. C. Calfee (Eds.), Handbook of educational psychology (pp. 47–62). New York: Macmillan. McDonald, R. P. (2010). Structural models and the art of approximation. Perspectives on Psychological Science, 5, 675–686. McGrew, K. S. (2005). The Cattell–Horn–Carroll theory of cognitive abilities. In D. P. Flanagan, & P. L. Harrison (Eds.), Contemporary intellectual assessment: Theories, tests, and issues (pp. 136–181). New York: Guilford. McGrew, K. S. (2009). CHC theory and the human cognitive abilities project: Standing on the shoulders of the giants of psychometric intelligence research. Intelligence, 37, 1–10. Muthén, L. K., & Muthén, B. O. (1998–2010). Mplus user's guide (5th ed.). Los Angeles, CA: Muthén & Muthén. Naglieri, J. A., & Bornstein, B. T. (2003). Intelligence and achievement—Just how correlated are they? Journal of Psychoeducational Assessment, 21, 244–260. Paunonen, S. V., & Ashton, M. C. (2001). Big Five factors and facets and the prediction of behavior. Journal of Personality and Social Psychology, 81, 524–539. Prensky, M. (2001). Digital natives, digital immigrants, Part 1. On the Horizon, 9, 1–6. Quesada, J., Kintsch, W., & Gomez, E. (2005). Complex problem-solving: A field in search of a definition? Theoretical Issues in Ergonomics Science, 6, 5–33. Raven, J. C. (1958). Advanced progressive matrices (2nd ed.). London, UK: Lewis.
Reed, S. K. (in press). Problem solving. In S. Chipman (Ed.), Oxford handbook of cognitive science. New York: Oxford University Press. Rigas, G., Carling, E., & Brehmer, B. (2002). Reliability and validity of performance measures in microworlds. Intelligence, 30, 463–480. Schmiedek, F., & Li, S.-C. (2004). Toward an alternative representation for disentangling age-associated differences in general and specific cognitive abilities. Psychology and Aging, 19, 40–56. Schulze, R. (2005). Modeling structures of intelligence. In O. Wilhelm, & R. W. Engle (Eds.), Handbook of understanding and measuring intelligence (pp. 241–263). Thousand Oaks, CA: Sage. Sonnleitner, P., Brunner, M., Greiff, S., Funke, J., Keller, U., Martin, R., et al. (2012). The Genetics Lab: Acceptance and psychometric characteristics of a computer-based microworld assessing complex problem solving. Psychological Test and Assessment Modeling, 54, 54–72. Sternberg, R. J., & Kaufman, J. C. (1996). Innovation and intelligence testing: The curious case of the dog that didn't bark. European Journal of Psychological Assessment, 12, 175–182. Sternberg, R. J., Lautrey, J., & Lubart, T. I. (2003). Where are we in the field of intelligence, how did we get here, and where are we going? In R. J. Sternberg, J. Lautrey, & T. I. Lubart (Eds.), Models of intelligence: International perspectives (pp. 3–26). Washington, DC: American Psychological Association. Süß, H. M. (1996). Intelligenz, Wissen und Problemlösen: kognitive Voraussetzungen für erfolgreiches Handeln bei computersimulierten Problemen [Intelligence, knowledge, and problem solving: Cognitive prerequisites for success in problem solving with computer-simulated problems]. Göttingen, Germany: Hogrefe. Swann, W. B., Chang-Schneider, C., & Larsen McClarty, K. (2007). Do people's self-views matter? Self-concept and self-esteem in everyday life. American Psychologist, 62, 84–94.
Tapscott, D. (1998). Growing up digital: The rise of the net generation. New York: McGraw-Hill. Van der Sluis, S., Dolan, C. V., & Stoel, R. D. (2005). A note on testing perfect correlations in SEM. Structural Equation Modeling, 12, 551–577. Vollmeyer, R., Burns, B. D., & Holyoak, K. J. (1996). Impact of goal specificity on strategy use and acquisition of problem structure. Cognitive Science, 20, 75–100. Wagener, D., & Wittmann, W. W. (2002). Personalarbeit mit dem komplexen Szenario FSYS [Personnel work with the complex scenario FSYS]. Zeitschrift für Personalpsychologie, 1, 80–93. Wenke, D., Frensch, P. A., & Funke, J. (2005). Complex problem solving and intelligence—Empirical relation and causal direction. In R. J. Sternberg, & J. Pretz (Eds.), Cognition and intelligence: Identifying the mechanics of the mind (pp. 160–187). Cambridge, UK: Cambridge University Press. Wilhelm, O. (2005). Measuring reasoning ability. In O. Wilhelm, & R. W. Engle (Eds.), Handbook of understanding and measuring intelligence (pp. 373–392). Thousand Oaks, CA: Sage. Wirth, J., & Klieme, E. (2003). Computer-based assessment of problem solving competence. Assessment in Education: Principles, Policy & Practice, 10, 329–345. Wittmann, W. W., & Hattrup, K. (2004). The relationship between performance in dynamic systems and intelligence. Systems Research and Behavioral Science, 21, 393–409. Wüstenberg, S., Greiff, S., & Funke, J. (2012). Complex problem solving—More than reasoning? Intelligence, 40, 1–14.