Case-based learning with worked examples in complex domains: Two experimental studies in undergraduate medical education


Learning and Instruction 21 (2011) 22–33; www.elsevier.com/locate/learninstruc

Robin Stark a,*, Veronika Kopp b, Martin R. Fischer b

a Institute of Education, Saarland University, POB 151150, 66041 Saarbrücken, Germany
b Institute for Teaching and Educational Research in Health Sciences, University Witten/Herdecke, Alfred-Herrhausen-Str. 50, 58448 Witten, Germany

Received 18 December 2008; revised 12 May 2009; accepted 25 October 2009

Abstract

To investigate the effects of example format (erroneous examples vs. correct examples) and feedback format (elaborated feedback vs. knowledge of results feedback) on medical students' diagnostic competence in the context of a web-based learning environment containing case-based worked examples, two studies with a 2 × 2 design were conducted in the laboratory. In the first study (domain: arterial hypertension; N = 153), erroneous examples were effective in combination with elaborated feedback with respect to strategic and conditional knowledge. In the second study (domain: hyperthyroidism; N = 124), elaborated feedback supported all aspects of diagnostic competence, especially conditional knowledge.

© 2009 Elsevier Ltd. All rights reserved.

Keywords: Case-based learning; Diagnostic competence; Error format; Feedback format; Worked examples

1. Introduction

Learning from worked examples is an effective method for initial learning in well-structured fields such as mathematics or physics (Atkinson, Derry, Renkl, & Wortham, 2000; for an overview see Paas & van Gog, 2006). Worked examples are superior to directly teaching abstract principles as well as to actively solving training problems, at least with regard to initial skill acquisition (VanLehn, 1999). This worked-example effect is attributed to the fact that studying worked examples imposes lower levels of cognitive load (Cooper & Sweller, 1987; Sweller & Cooper, 1985) on the learner than solving training problems, especially because no extensive search processes are involved. Therefore, more cognitive resources are free for the demanding process of schema construction. Moreover, studying worked examples (in contrast to attempting to solve training problems) focuses the learner's attention on information that is relevant for schema construction. In various areas, schema-based problem solving is considered to be very effective and efficient. For example, it proved a central characteristic of experts' problem solving (VanLehn, 1999). This also holds true for more complex domains like argumentation (Schworm & Renkl, 2007). However, when complex competencies have to be fostered, specific types of examples must be used. In addition, examples have to be enriched by additional instructional measures in order to foster self-explanation processes (Renkl, 1997; Stark, 1999). In this article, two studies are presented in which complex examples were used; these examples provided relevant information on different levels. Based on domain-specific perspectives, two instructional means were applied, namely erroneous examples and elaborated feedback. The main aim of the two studies presented here was to investigate the effects of these means on medical diagnostic competence in relevant content domains (Study 1: hypertension; Study 2: hyperthyroidism).

* Corresponding author. Tel.: +49 681 3024111; fax: +49 681 3024708. E-mail addresses: [email protected] (R. Stark), [email protected] (V. Kopp), [email protected] (M.R. Fischer).
0959-4752/$ - see front matter © 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.learninstruc.2009.10.001


1.1. Theoretical background for the design of examples

Schworm and Renkl (2007) differentiate between double- and triple-content examples. Double-content examples consist of two content levels: (a) the learning-domain level and (b) the exemplifying-domain level. Triple-content examples additionally provide (c) the strategy level. Following Schworm and Renkl (2007), this type of example is similar to cognitive models involving the explanation of an expert's cognitive processes while solving a problem (Collins, Brown, & Newman, 1989). It resembles to some extent process-oriented examples (van Gog, Paas, & van Merriënboer, 2008), as information referring to different knowledge types, like the principled and strategic information that experts use for problem solving, is integrated into the example solutions. Van Gog et al. (2008) showed that studying process-oriented examples increases germane cognitive load, resulting in higher performance on transfer test problems combined with lower investment of effort. In the two studies presented here, triple-content examples were used. The learning domain of medical diagnostic competence was exemplified in the content domains of arterial hypertension (Study 1) and hyperthyroidism (Study 2). These examples integrated information which can be referred to as domain-specific conceptual, strategic, and conditional knowledge (Paris, Lipson, & Wixson, 1983). Domain-specific conceptual knowledge consists of declarative knowledge about concepts and relations between concepts. Strategic knowledge comprises knowledge about problem-solving strategies and heuristics. Conditional knowledge is knowledge about the conditions of application of conceptual and strategic knowledge and also implies knowledge about the rationale behind the selection of decisions and procedures. All information necessary for diagnostic reasoning was given in the context of authentic patient cases.
Thus, example-based learning was combined with case- and problem-based learning, which have a rich tradition in the context of medical education (Norman & Schmidt, 2000). Based on studies of example-based learning (Atkinson et al., 2000) and studies of expertise in medicine (Boshuizen, Schmidt, Custers, & van de Wiel, 1995), case-based learning with complex examples was enriched by erroneous examples and elaborated feedback.


1.2. Learning from errors

In medical practice, diagnostic errors often have dramatic consequences (Al-Assaf, Bumpus, Carter, & Dixon, 2003). Initially, providing erroneous information to enhance learning seems to be counterintuitive. However, there are convincing arguments supporting this procedure. Learning from errors can foster the acquisition of "negative knowledge" (Oser & Spychiger, 2005), which provides an important protection against erroneous decisions and procedures. VanLehn (1999) postulates that errors trigger self-explanations, which lead to a deeper understanding and support the development of "illness scripts" (Boshuizen et al., 1995). Exposing learners to errors neither automatically results in negative knowledge nor in deep understanding and effective knowledge structures. The potential of errors in the learning process only unfolds when the learners clearly understand what is wrong and why something is wrong in a given situation (Curry, 2004). So on the one hand, there is emerging evidence that learning environments providing incorrect information can enhance learning outcomes. On the other hand, experiences of instructional experts and first experimental results underscore that the instructional potential of errors is no "fast-selling item". The studies of Große and Renkl (2004, 2007), in which the effectiveness of correct and incorrect examples was compared, show that especially learners with solid prior knowledge profit from incorrect examples. These studies can also be interpreted in terms of the need for implementing effective feedback methods when erroneous examples are provided.

1.3. Feedback

Feedback is regarded as an effective means to promote reflective processes and enhance learning (Balzer, Doherty, & O'Connor, 1989). Feedback supposed to foster academic learning is often subdivided into the following categories: knowledge of results (KOR), knowledge of correct response (KCR), and elaborated feedback (Dempsey, Driscoll, & Swindell, 1993). KOR-feedback confirms only whether the given answer is correct or incorrect; KCR-feedback additionally provides the learner with the correct answer; elaborated feedback includes more complex forms of feedback that explain, direct, or monitor (Smith, 1988). In the context of this study, elaborated feedback mainly provides additional explanations focussing on conceptual, strategic, and conditional knowledge. Elaborated feedback has positive effects on feedback reception (Jacoby, Troutman, Mazursky, & Kuss, 1984) and on knowledge acquisition, especially when learners are confronted with complex tasks (Hattie & Timperley, 2007). Therefore, as the studies of Große and Renkl (2004, 2007) showed, elaborated feedback might be especially important when erroneous examples are provided. By pointing out mistakes and offering explanations or other additional information, it helps the students to reflect on the presented information as well as on their own knowledge and thereby facilitates elaboration of the material. However, for students with higher prior knowledge, KOR-feedback can be sufficient.

1.4. Cognitive load

From the perspective of cognitive load theory (CLT; Sweller, 1988), three types of load can be differentiated. Intrinsic load is determined by the complexity of the domain (quantified as element interactivity). Because of the complexity of the domain under investigation, which is mirrored in the example solutions provided, the learning environment probably imposes high intrinsic load on the learners. Erroneous examples are supposed to increase germane load, which in CLT is connected with effective elaboration processes leading to schema induction (Sweller, van Merriënboer, & Paas, 1998). However, in combination with superficial KOR-feedback, learners are forced to explain to themselves why the provided information is incorrect and to draw consequences for further diagnostic steps; these explanation processes can differ greatly, depending on various factors, not least the learners' prior knowledge. Therefore these processes can come along with a further increase of germane load, but they can also impose additional extraneous load on the learner. This type of load is attributed to sub-optimal instructional design and interferes with productive learning. Because of the supposed high intrinsic load of the learning environment, cognitive overload can result under such learning conditions. However, this problem might be compensated by providing elaborated feedback.

2. Study 1

2.1. Research questions – hypotheses

To what extent is diagnostic competence in the domain of hypertension facilitated by erroneous examples and elaborated feedback? It was expected that learners will profit from both the erroneous examples (Hypothesis 1a) and elaborated feedback (Hypothesis 1b). Moreover, an interaction between example and feedback format was expected, namely that studying erroneous examples will enhance diagnostic competence especially when elaborated feedback is provided (Hypothesis 1c).

How relevant is prior knowledge for the acquisition of diagnostic competence, and to what extent does it interact with example and feedback format? It was expected that high prior knowledge will be favourable in all learning conditions (Hypothesis 2a). However, this effect should be stronger when only KOR-feedback is provided; that is, there should be an interaction of prior knowledge with feedback format (Hypothesis 2b). Prior knowledge should also interact with example and feedback format; that is, its effect would be more pronounced when the students are confronted with erroneous examples and KOR-feedback (Hypothesis 2c).
How relevant is time-on-task for the acquisition of diagnostic competence, and to what extent do effects of example and feedback format interact with time-on-task? It was hypothesized that investing more time-on-task will be favourable in all conditions, especially when erroneous examples and elaborated feedback are given; that is, there should be an interaction of time-on-task with example format (Hypothesis 3a) and with feedback format (Hypothesis 3b).

What kind of influence do the instructional conditions exert on cognitive load? It was hypothesized that providing erroneous examples will increase cognitive load (Hypothesis 4a). Additionally, it was expected that cognitive load will be increased by KOR-feedback, especially when erroneous examples are provided; that is, there should be an interaction of example format with feedback format (Hypothesis 4b).

2.2. Method

2.2.1. Sample and design

Participants were 153 advanced medical students (104 women) from the two medical faculties in Munich, Germany, who participated voluntarily in the present study. All participants were in the clinical years of the curriculum (3rd to 5th year). The mean age was 25.02 years (SD = 3.62). The participants were randomly assigned to one of the four learning conditions of a 2 × 2 factorial design: (a) "with errors and elaborated feedback" (n = 36); (b) "with errors and KOR-feedback" (n = 41); (c) "without errors and elaborated feedback" (n = 40); (d) "without errors and KOR-feedback" (n = 36).

2.2.2. Learning environment

The learning environment was integrated into the CASUS learning platform (Fischer, 2000) and consisted primarily of six worked examples on secondary arterial hypertension. Working through these examples, the learners see themselves in the role of a (fictitious) student doing an elective with a medical expert who gives feedback. All examples begin with a clinical situation (see Appendix A). The protagonist starts drawing conclusions for the working diagnosis and differential diagnosis, as well as for further diagnostic steps. Subsequently, the expert gives feedback. Every example consists of three to four such sequences until the final diagnosis is reached.

2.2.3. Instructional measures

2.2.3.1. Erroneous examples. In the conditions without errors, correct information was given. The protagonist draws the right conclusions and finishes the case with the right diagnosis. In the conditions with errors, the protagonist makes severe errors (see Appendix A). In each example, four to five errors were integrated. The selection of errors refers to the taxonomy of Graber, Gordon, and Franklin (2002). After every wrong decision, the error is corrected by the expert's feedback. That is, in contrast to the studies of Große and Renkl (2004, 2007), the erroneous information in our studies is presented only in the provided diagnostic decisions and solutions; the problem to be solved remains the same as in the conditions without errors.

2.2.3.2. Elaborated feedback.
In the conditions with elaborated feedback, the expert gives additional explanations of the considerations and further consequences drawn by the protagonist. Furthermore, he elucidates the diagnostic process, referring to underlying domain-specific conceptual knowledge and combining it with strategic and conditional knowledge (see Appendix B). In the conditions with KOR-feedback, considerations, conclusions, and further procedures were only evaluated as right or wrong, without further explanation. Thus, in the condition "with errors and KOR-feedback" the learners had to deduce the right diagnostic step from the subsequent step.

2.2.4. Instruments

2.2.4.1. Prior knowledge. The prior knowledge test operationalizing domain-specific conceptual knowledge consisted of 21 multiple-choice questions (see Appendix C); two questions were excluded because of negative item-total correlations. The maximum score was therefore 19 points (Cronbach's α = .63).

2.2.4.2. Diagnostic competence. Diagnostic competence was operationalized as domain-specific conceptual knowledge, strategic knowledge, and conditional knowledge. To measure domain-specific conceptual knowledge, the 19 items of the prior knowledge test were presented again after the learning phase (Cronbach's α = .60). To measure strategic knowledge, 10 key-feature problems were used (Bordage, Brailovsky, Carretier, & Page, 1995). They consisted of a short clinical case scenario and three questions asking for the leading diagnostic hypotheses, differential hypotheses, and the next steps in the diagnostic process (see Appendix D). Two experienced physicians compared students' answers to the solution of an expert. Interrater reliability was high (Cohen's κ = .93). One question was excluded because of low item-total correlation. Therefore, a maximum of 29 points could be achieved (Cronbach's α = .72). To measure conditional knowledge, three problem-solving tasks were used. Based on the information of a short case scenario, the students had to generate their leading diagnostic hypotheses in a first step; in following steps, they were asked to give reasons for their decisions and to explain the diseases' underlying pathophysiological processes (see Appendix E). Students' answers were rated by the experts (Cohen's κ = .89). A total of 20 points could be achieved (Cronbach's α = .73). The correlation between strategic and conditional knowledge was substantial (r = .64, p < .01). The correlations between domain-specific conceptual knowledge and the two other aspects of knowledge were moderate (r = .56, p < .01 with strategic knowledge, and r = .45, p < .01 with conditional knowledge).

2.2.4.3. Cognitive load.
Cognitive load was assessed by a rating scale of Paas and Kalyuga (2005) with nine items ranging from 1 (very low/very easy) to 7 (very high/very difficult); an example item is "When working with the learning environment, my mental effort was .". Different load aspects were not differentiated because the subscales proposed by the authors could not be replicated (Cronbach's α = .90).

2.2.4.4. Time-on-task. Time-on-task was recorded automatically by the learning environment.

2.2.5. Procedure

After a short introduction to the learning environment, students had to work on the prior knowledge test. Afterwards, they had to study the six worked examples and answer the cognitive load scale. After a 15-min pause, the three knowledge tests described above were administered in written form.

2.2.6. Statistical analyses

An alpha level of .05 was used for all statistical analyses. Partial η² was used as a measure of effect size; values of about .01 are considered a weak effect size, of about .06 medium, and of about .14 or bigger large (Cohen, 1988). A MANCOVA was conducted with the three aspects of diagnostic competence as dependent variables, and example and feedback format as between-subjects factors. In Study 1, prior knowledge and time-on-task were included as covariates; in Study 2, metacognitive control was added to the covariates. All covariates were z-standardized. Significant interaction effects between each covariate and the example-format and feedback-format factors, as well as the corresponding triple interactions, were analyzed with univariate analyses when needed. In case of significant univariate interaction effects between the two factors, post-hoc analyses were applied to compare the respective means (linearly independent, pairwise, Bonferroni-adjusted contrasts between residualized means). In case of significant interactions between the example-format and feedback-format factors and covariates, the regression coefficient b and R² were calculated by multiple regressions. To analyze effects of example and feedback format on cognitive load, an ANCOVA was used with two (Study 1) and three (Study 2) covariates. Bivariate correlations were calculated using Pearson's product-moment correlation.

2.3. Results

2.3.1. Internal validity

Concerning prior knowledge, there were no differences between the four conditions. The main effects of example and feedback format were nonsignificant, F(1, 149) = 2.65, p = .11 and F(1, 149) = 1.92, p = .17, respectively; the interaction of example format with feedback format was also nonsignificant, F(1, 149) = 1.72, p = .19. The correlations of prior knowledge with diagnostic competence were significant and substantial: for domain-specific conceptual knowledge, r = .68, p < .01; for strategic knowledge, r = .58, p < .01; for conditional knowledge, r = .42, p < .01. With respect to time-on-task, the example format had no main effect, F(1, 149) < 1, ns.
However, the main effect of feedback format was significant and substantial, F(1, 149) = 36.15, p < .01, partial η² = .19: elaborated feedback prolonged the learning sessions. The interaction between example format and feedback format was nonsignificant, F(1, 149) < 1, ns. Time-on-task was not significantly related to diagnostic competence (for domain-specific conceptual knowledge, r = .09; for strategic knowledge, r = .05; for conditional knowledge, r = .01). The effects of both variables, namely prior knowledge and time-on-task, were statistically controlled by including them as covariates in the analyses.

2.3.2. Effects on diagnostic competence

Table 1 shows means and standard deviations of all measures. When the three aspects of diagnostic competence were analysed simultaneously, the 2 (example format) × 2 (feedback format) MANCOVA with prior knowledge and time-on-task as covariates showed that the multivariate effect of the example format was not significant, Wilks's λ = .95, F(3, 139) = 2.43, p = .07. The multivariate effect of the feedback format was also not significant, Wilks's λ = .99, F(3, 139) < 1, ns. Therefore, Hypotheses 1a and 1b were not confirmed. However, the interaction between the example and feedback formats was significant, Wilks's λ = .92, F(3, 139) = 3.80, p < .01, partial η² = .08. The interaction of example and feedback format was not significant in the case of domain-specific conceptual knowledge, F(1, 141) = 1.26, p = .26. However, for strategic and conditional knowledge it was significant, F(1, 141) = 10.68, p < .01, partial η² = .07, and F(1, 141) = 5.44, p = .02, partial η² = .04, respectively. Learners in the conditions "with errors" generated more strategic knowledge when elaborated feedback was given (M = 20.68, SD = 3.02) as compared to KOR-feedback (M = 17.81, SD = 2.96, post hoc comparison p < .01); in the conditions "without errors", KOR-feedback (M = 18.85, SD = 4.42) did not differ from elaborated feedback (M = 18.32, SD = 4.04, post hoc comparison p = .08). Similar tendencies were found for conditional knowledge, although they were not significant. Therefore, Hypothesis 1c was partly verified.

2.3.3. Effects of prior knowledge and time-on-task

The MANCOVA showed that the strongest predictor of diagnostic competence was prior knowledge, Wilks's λ = .43, F(3, 139) = 62.06, p < .01, partial η² = .57. However, the interactions between prior knowledge and example format, between prior knowledge and feedback format, and between prior knowledge and both example and feedback formats were not significant, Wilks's λ = .97, F(3, 139) = 1.45, p = .23; Wilks's λ = .96, F(3, 139) = 1.82, p = .15; and Wilks's λ = 1.00, F(3, 139) < 1, ns, respectively. Therefore, Hypotheses 2b and 2c were not corroborated.
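For readers who want to see how an omnibus model of this kind is specified, the following sketch runs a 2 × 2 multivariate model with the interaction term and two covariates, analogous to the design described in Section 2.2.6. It uses simulated data and invented variable names, not the study's data or the authors' code; statistics packages differ in detail, so treat it as illustrative only.

```python
import numpy as np
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

# Simulated stand-in data (NOT the study's data; all names are invented).
rng = np.random.default_rng(42)
n = 153
df = pd.DataFrame({
    "errors": rng.integers(0, 2, n),   # example format: 1 = with errors
    "elab": rng.integers(0, 2, n),     # feedback format: 1 = elaborated
    "prior": rng.standard_normal(n),   # z-standardized prior knowledge
    "time": rng.standard_normal(n),    # z-standardized time-on-task
})
noise = rng.standard_normal((n, 3))
# Three dependent variables patterned loosely on the reported effects:
# prior knowledge drives all outcomes; the errors x feedback interaction
# mainly affects strategic knowledge.
df["conceptual"] = 13 + 1.8 * df["prior"] + noise[:, 0]
df["strategic"] = 18 + 1.5 * df["prior"] + 2.0 * df["errors"] * df["elab"] + noise[:, 1]
df["conditional"] = 11 + 1.0 * df["prior"] + noise[:, 2]

# Multivariate model: three DVs, two between-subjects factors plus their
# interaction, and two covariates, as in the 2 x 2 MANCOVA design.
mancova = MANOVA.from_formula(
    "conceptual + strategic + conditional ~ errors * elab + prior + time",
    data=df,
)
print(mancova.mv_test())  # multivariate tests (Wilks' lambda etc.) per effect
```

The `mv_test()` output reports, for each effect (including the `errors:elab` interaction), the four standard multivariate statistics, of which Wilks's λ is the one used in the paper.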
The ANCOVAs performed showed that the main effect of prior knowledge was significant and substantial for all three aspects of diagnostic competence: for domain-specific conceptual knowledge, F(1, 141) = 140.21, p < .01, partial η² = .50; for strategic knowledge, F(1, 141) = 74.63, p < .01, partial η² = .35; for conditional knowledge, F(1, 141) = 28.07, p < .01, partial η² = .17. The positive correlations between prior knowledge and diagnostic competence (see above) indicate that learners profited from higher prior knowledge. Thus, Hypothesis 2a was confirmed.
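As a consistency check (our addition, not part of the original analysis), partial η² can be recovered from each reported F statistic and its degrees of freedom; applied to the prior-knowledge effect on conceptual knowledge, this reproduces the reported value:

```latex
\eta_p^2
  = \frac{SS_{\mathrm{effect}}}{SS_{\mathrm{effect}} + SS_{\mathrm{error}}}
  = \frac{F \cdot df_{\mathrm{effect}}}{F \cdot df_{\mathrm{effect}} + df_{\mathrm{error}}},
\qquad
\frac{140.21 \times 1}{140.21 \times 1 + 141} \approx .50
```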

In the MANCOVA, the main effect of time-on-task was not significant, Wilks's λ = .95, F(3, 139) = 2.30, p = .08. The interactions between time-on-task and example format and between time-on-task and feedback format were also not significant, Wilks's λ = .97, F(3, 139) = 1.34, p = .27, and Wilks's λ = .98, F(3, 139) < 1, ns, respectively. Therefore, Hypotheses 3a and 3b were not corroborated. However, the three-way interaction of time-on-task by example format by feedback format was significant, Wilks's λ = .94, F(3, 139) = 3.17, p = .03, partial η² = .06. The ANCOVA for strategic knowledge revealed that the three-way interaction of time-on-task by example format by feedback format was significant, F(1, 141) = 6.30, p < .01, partial η² = .04. For conceptual and conditional knowledge, this interaction was not significant, F(1, 141) = 3.00, p = .09, and F(1, 141) < 1, ns, respectively. The pattern of regression coefficients showed that investing more time-on-task was most effective in the condition "without errors and KOR-feedback", b = 2.10, p = .03, R² = .37. However, it was not effective in the other three conditions: "without errors and elaborated feedback", b = .06, p = .90, R² = .55; "with errors and KOR-feedback", b = .77, p = .29, R² = .35; "with errors and elaborated feedback", b = .28, p = .54, R² = .20.

To sum up, the analyses showed that diagnostic competence was not influenced by the example format or by the feedback format alone. However, erroneous examples combined with elaborated feedback enhanced strategic and conditional knowledge significantly. Independent of the learning condition, higher prior knowledge supported learning outcomes, while there was no interaction of prior knowledge with example and feedback format. Investing more time-on-task only paid off with respect to strategic knowledge, and only for one learning condition. Whereas the effects of prior knowledge were substantial, all other effects were small or medium-sized.

2.3.4. Effects on cognitive load

Cognitive load correlated negatively with domain-specific conceptual knowledge (r = −.15, p = .06), strategic knowledge (r = −.17, p = .03), and conditional knowledge (r = −.29, p < .01). That is, learners reporting higher cognitive load scores acquired less diagnostic competence.

Table 1. Means (and SD) of prior knowledge, time-on-task, domain-specific conceptual knowledge, strategic knowledge, conditional knowledge, and cognitive load in the four learning conditions.

                        With errors                          Without errors
                        Elaborated (n=36)  KOR (n=41)        Elaborated (n=40)  KOR (n=36)
Prior knowledge         12.06 (2.47)       10.83 (2.54)      12.20 (3.05)       12.17 (3.12)
Time-on-task (min)      44.60 (14.70)      31.37 (7.75)      45.49 (16.87)      33.70 (10.02)
Conceptual knowledge    14.69 (2.35)       12.83 (2.28)      13.38 (2.72)       13.03 (2.76)
Strategic knowledge     20.68 (3.02)       17.81 (2.96)      18.32 (4.04)       18.85 (4.42)
Conditional knowledge   11.93 (3.65)        9.93 (3.53)      10.86 (3.70)       11.63 (3.27)
Cognitive load           3.35 (0.58)        3.58 (0.57)       3.37 (0.79)        3.13 (0.62)


Descriptively, learners in the condition "with errors and KOR-feedback" displayed the highest cognitive load scores, whereas students in the condition "without errors and KOR-feedback" reported the lowest ones (see Table 1). Erroneous examples tended to increase cognitive load; however, this effect was not significant, F(1, 147) = 3.21, p = .08, so Hypothesis 4a was not confirmed. Moreover, the main effect of feedback format was not significant, F(1, 147) < 1, ns. However, the interaction of example format with feedback format was significant, F(1, 147) = 4.07, p = .05, partial η² = .03. In line with Hypothesis 4b, providing erroneous examples enhanced cognitive load especially when only KOR-feedback was given. Concerning the two covariates, the main effect of prior knowledge was significant, F(1, 147) = 4.87, p = .03, partial η² = .03, whereas the main effect of time-on-task was not, F(1, 147) < 1, ns. The correlation between cognitive load and prior knowledge was negative (r = −.21, p < .01), indicating that learners with low prior knowledge reported higher load scores (and vice versa).

instructional means that only achieved small effects; it also had a stronger effect than time-on-task as a predictor that has proved important in other studies (Gettinger, 1984). In the present study, time-on-task only resulted in very local and rather counterintuitive effects that need further replication. The importance of prior knowledge is anything but new in research on learning and instruction (Dochy, Segers, & Buehl, 1999). Together with the fact that prior knowledge did not interact with the two factors, it shows that the learning environment was demanding under all conditions. Especially in the conditions with KOR-feedback, any information deficit has to be compensated by effective self-explanation (Chi, Bassok, Lewis, Reimann, & Glaser, 1989). This is a demanding task not only with respect to cognitive but also to metacognitive learning prerequisites that support activities of metacognitive control like planning, monitoring, and regulation of learning processes (Stark, 1999). Based on these considerations, metacognitive prerequisites were recorded in the following study and analysed as a potential moderating variable.

2.4. Discussion

3. Study 2

There were no bottom or ceiling effects in Study 1, indicating that the complexity of posttest tasks was adequate. Interestingly, the effects of example and feedback format depended on the aspect of diagnostic competence under consideration. Acquisition of strategic and conditional knowledge was enhanced by erroneous examples and elaborated feedback. As the latter diagnostic competence aspects are more complex than domain-specific conceptual knowledge, these findings are in line with considerations about conditions for learning from errors (Oser & Spychiger, 2005; VanLehn, 1999). Furthermore, they are in accordance with feedback research. When complex competencies have to be acquired in demanding learning settings, elaborated feedback is superior to less informative feedback (Hattie & Timperley, 2007). Last but not least, various studies on example-based learning have shown that instructional measures increasing the complexity of the learning process have to be compensated by additional support (Stark, Gruber, Mandl, & Hinkofer, 2001). Therefore it is plausible that the rich elaborated feedback information given in our study compensated for the increased complexity of the error-conditions. These findings were substantiated by the effects of example and feedback format on cognitive load which are in line with CLT (Sweller et al., 1998). As the load scores were negatively related to knowledge acquisition, the conclusion is suggestive that erroneous examples in combination with KOR-feedback reduced the worked-example effect (Renkl, 2005) by imposing too much extraneous load on the learners. The extra-information provided in the conditions with elaborated feedback did not increase cognitive load, so our a priori design considerations might have paid off. With respect to methodology, our findings on cognitive load support the idea that the rating scale used does not (or at least not primarily) measure germane load. 
Prior knowledge was the strongest predictor for diagnostic competence. It was not only more important than the

3.1. Metacognition and selection of the domain Metacognition, the ability to reflect on one’s own cognitive skills and processes, is seen as an important ability to successfully cope with cognitively demanding tasks (Flavell, 1984; Hasselhorn, 1998). There is evidence that especially metacognitive control strategies are closely related to the learning outcomes (Lan, Bradley, & Parr, 1993). Three classes of metacognitive strategies can be differentiated in self-regulated learning (Zimmerman, 2000): goal setting, planning, and monitoring. These strategies of metacognitive control are essential when learners are confronted with errors and have to detect, analyse and understand them, especially when only KOR-feedback is given. By providing instructionally welldesigned elaborated feedback, relevant reflection processes necessary to understand erroneous solutions can be enhanced so that at least to some extent metacognitive deficits can be compensated (Stark, Tyroller, Krause, & Mandl, 2008). To further investigate effects of erroneous examples and elaborated feedback in the second study, more complex worked examples were created in the field of hyperthyroidism. In the context of our studies, case complexity can be quantified as the sum of the differential diagnoses plus the number of adequate diagnostic tests which have to be carried out to exclude potential alternate diagnoses and to verify the final diagnosis. 3.2. Research questions e hypotheses To what extent is diagnostic competence in the domain of hyperthyroidism facilitated by erroneous examples and elaborated feedback? The same hypotheses were formulated as in Study 1. Again it was expected that learners will profit from erroneous examples (Hypothesis 1a) and elaborated feedback (Hypothesis 1b). Moreover, an interaction between example


and feedback format was expected, namely that studying erroneous examples will enhance diagnostic competence especially when elaborated feedback is provided (Hypothesis 1c).

How relevant are prior knowledge and metacognitive control strategies for the acquisition of diagnostic competence, and to what extent do they interact with example and feedback format? It was expected that high prior knowledge will be favourable in all learning conditions (Hypothesis 2a). However, this effect should be stronger when only KOR-feedback is provided; that is, there should be an interaction of prior knowledge with feedback format (Hypothesis 2b). Prior knowledge should also interact with example format; that is, its effect should be more pronounced when the students are confronted with erroneous examples (Hypothesis 2c). In addition, we expected that metacognitive control will enhance diagnostic competence under all learning conditions, particularly in the case of erroneous examples provided in combination with KOR-feedback; that is, there should be an interaction of metacognitive control with example format and feedback format (Hypothesis 2d).

How relevant is time-on-task for the acquisition of diagnostic competence, and to what extent are effects of example and feedback format moderated by time-on-task? It was hypothesized that investing more time-on-task will be favourable especially when erroneous examples and elaborated feedback are given; that is, there should be an interaction of time-on-task with example format (Hypothesis 3a) and with feedback format (Hypothesis 3b).

How do the instructional means affect cognitive load? We hypothesized that providing erroneous examples will increase cognitive load (Hypothesis 4a). Additionally, it was expected that cognitive load will be increased by KOR-feedback, especially when erroneous examples are provided; that is, there should be an interaction of example format with feedback format (Hypothesis 4b).

3.3. Method

3.3.1. Sample and design
Participants were 124 advanced medical students (85 women) from the two medical faculties in Munich, Germany, who, like those in Study 1, participated voluntarily. The sample was drawn from the same population as in Study 1. The mean age was 25.49 years (SD = 3.35). The participants were randomly assigned to one of the same four learning conditions as in Study 1: (a) "with errors and elaborated feedback" (n = 30); (b) "with errors and KOR-feedback" (n = 32); (c) "without errors and elaborated feedback" (n = 33); (d) "without errors and KOR-feedback" (n = 29).

3.3.2. Learning environment
The design and structure of the learning environment were analogous to Study 1. We only changed the exemplifying domain in which the worked examples were embedded. Specifically, in Study 2 they dealt with patients presenting symptoms of hyperthyroidism. Additionally, the example solutions were somewhat longer.

3.3.3. Instruments

3.3.3.1. Prior knowledge. Domain-specific prior knowledge was assessed with 25 multiple-choice items. Because of low item-total correlations, two items were excluded from further analysis. A maximum of 23 points could be reached (Cronbach's α = .61).

3.3.3.2. Metacognitive control. A questionnaire was constructed for the needs of the present study to assess metacognitive control. It was composed of 12 items focusing on planning (e.g., "When I study, I set clear goals on what I want to achieve before I start"), monitoring (e.g., "When I study, in between I think about whether my approach makes sense"), and regulation (e.g., "When I study and get stuck, I think about how I could study differently"). The items were taken from various self-report scales (e.g., the Motivated Strategies for Learning Questionnaire; Pintrich, Smith, Garcia, & McKeachie, 1993). Responses were given on a 6-point Likert-type scale ranging from 1 (I fully disagree) to 6 (I fully agree). One metacognitive control score was computed as the mean of the responses to all items. Cronbach's α of the scale was .77.

3.3.3.3. Diagnostic competence. Diagnostic competence was operationalized as in Study 1. The 23 items of the prior knowledge test were used again after the learning phase to assess domain-specific conceptual knowledge (Cronbach's α = .69). Ten key-feature problems were used to assess strategic knowledge. Each problem consisted of a short clinical case scenario and questions asking for the leading diagnostic hypothesis, for differential hypotheses, and for the next steps in the diagnostic process. A maximum of 36 points could be achieved (Cronbach's α = .77). Two experienced physicians rated the students' answers; interrater reliability was high (Cohen's κ = .93). As in Study 1, three problem-solving tasks measured conditional knowledge (maximum 34 points; Cronbach's α = .79; Cohen's κ = .90).

Correlations between the single aspects of diagnostic competence were substantial: between domain-specific conceptual knowledge and strategic knowledge, r = .63, p < .01; between domain-specific conceptual knowledge and conditional knowledge, r = .58, p < .01; and between strategic and conditional knowledge, r = .77, p < .01.

3.3.3.4. Cognitive load. The same rating scale was used as in Study 1 (Cronbach's α = .92).

3.3.3.5. Time-on-task. Time-on-task was recorded automatically by the learning environment.
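The reliability coefficients reported in this section, Cronbach's α for internal consistency and Cohen's κ for interrater agreement, follow the standard formulas. As a reminder of what these statistics compute, here is a minimal sketch in plain Python (illustrative helper functions of our own, not the analysis code used in the study):

```python
def cronbach_alpha(items):
    """Cronbach's alpha: items[i][p] is participant p's score on item i."""
    def var(xs):  # sample variance (n - 1 denominator)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    k = len(items)                              # number of items
    totals = [sum(col) for col in zip(*items)]  # per-participant total scores
    return k / (k - 1) * (1 - sum(var(i) for i in items) / var(totals))


def cohens_kappa(rater1, rater2):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    n = len(rater1)
    p_observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    categories = set(rater1) | set(rater2)
    p_chance = sum((rater1.count(c) / n) * (rater2.count(c) / n)
                   for c in categories)
    return (p_observed - p_chance) / (1 - p_chance)
```

Two raters in perfect agreement yield κ = 1, while agreement at chance level yields κ = 0; values around .90, as reported above, therefore indicate very high interrater reliability.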

3.3.4. Procedure
The procedure was the same as in Study 1. Metacognitive control was assessed before students worked on the prior knowledge test.


3.4. Results

3.4.1. Internal validity
There were no differences in prior knowledge between the four conditions. The main effects of example format, F(1, 120) = 1.37, p = .24, and of feedback format, F(1, 120) = 1.52, p = .22, were nonsignificant, as was their interaction, F(1, 120) < 1, ns. The correlations of prior knowledge with diagnostic competence were significant and substantial: with domain-specific conceptual knowledge, r = .71, p < .01; with strategic knowledge, r = .43, p < .01; with conditional knowledge, r = .45, p < .01.

Concerning metacognitive control, there were also no significant a priori differences between the four conditions. The main effects of example format and of feedback format were nonsignificant, F(1, 118) = 2.27, p = .13, and F(1, 118) < 1, ns, respectively, as was their interaction, F(1, 118) < 1, ns. The correlation of metacognitive control with domain-specific conceptual knowledge failed to reach significance (r = .11, p = .21). The correlations with strategic and conditional knowledge were low but significant (r = .21, p = .02, and r = .20, p = .03).

Concerning time-on-task, the main effect of example format was not significant, F(1, 120) = 3.16, p = .08. However, the main effect of feedback format was significant and substantial, F(1, 120) = 19.47, p < .01, partial η² = .14: elaborated feedback increased time-on-task. The interaction between example format and feedback format was not significant, F(1, 120) = 1.55, p = .22. Time-on-task did not correlate significantly with the measures of diagnostic competence (domain-specific conceptual knowledge, r = .04; strategic knowledge, r = .11; conditional knowledge, r = .17).

Prior knowledge, metacognitive control, and time-on-task were included as covariates in the following analyses.

3.4.2. Effects on diagnostic competence
Table 2 shows the means and standard deviations of all measures. The 2 (example format) × 2 (feedback format) MANCOVA with prior knowledge, metacognitive control, and time-on-task as covariates and the three measures of diagnostic competence as dependent variables showed no main effect of example format, Wilks's λ = .98, F(3, 103) < 1, ns. Thus, Hypothesis 1a was not confirmed. However, the main effect of feedback format was significant and substantial, Wilks's λ = .83, F(3, 103) = 6.89, p < .01, partial η² = .17. The interaction between example and feedback format was not significant, Wilks's λ = .94, F(3, 103) = 2.20, p = .09; therefore, Hypothesis 1c was not corroborated. The ANCOVAs concerning feedback format showed significant effects for all three aspects of diagnostic competence: for domain-specific conceptual knowledge, F(1, 105) = 6.60, p < .01, partial η² = .06; for strategic knowledge, F(1, 105) = 8.13, p < .01, partial η² = .07; for conditional knowledge, F(1, 105) = 19.64, p < .01, partial η² = .16. Therefore, as predicted in Hypothesis 1b, elaborated feedback enhanced diagnostic competence.

3.4.3. Effects of prior knowledge, metacognitive control, and time-on-task
The 2 (example format) × 2 (feedback format) MANCOVA with prior knowledge, time-on-task, and metacognitive control as covariates showed that prior knowledge had a strong effect on diagnostic competence, Wilks's λ = .51, F(3, 103) = 33.22, p < .01, partial η² = .49. The interaction between prior knowledge and example format was not significant, Wilks's λ = .93, F(3, 103) = 2.56, p = .06. In contrast to Hypothesis 2b, the interaction with feedback format was not significant, Wilks's λ = .99, F(3, 103) < 1, ns; this was also true for the triple interaction, Wilks's λ = .96, F(3, 103) = 1.54, p = .21. The ANCOVAs showed that the main effect of prior knowledge was significant and substantial for all three aspects of diagnostic competence: for domain-specific conceptual knowledge, F(1, 105) = 96.96, p < .01, partial η² = .48; for strategic knowledge, F(1, 105) = 13.37, p < .01, partial η² = .11; for conditional knowledge, F(1, 105) = 17.01, p < .01, partial η² = .14. Thus, Hypothesis 2a was corroborated.
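The reported partial η² values can be recovered from the F statistics and their degrees of freedom via partial η² = (F · df_effect) / (F · df_effect + df_error). A small illustrative helper (our own sketch, not part of the original analyses):

```python
def partial_eta_squared(f, df_effect, df_error):
    """Partial eta squared recovered from an F value and its degrees of freedom."""
    return (f * df_effect) / (f * df_effect + df_error)

# The multivariate feedback effect, F(3, 103) = 6.89, gives roughly .17,
# and the conditional-knowledge ANCOVA, F(1, 105) = 19.64, roughly .16,
# matching the effect sizes reported above.
```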
The MANCOVA also showed that metacognitive control had no effect on diagnostic competence, Wilks's λ = .95, F(3, 103) = 1.90, p = .14. The interactions with example and feedback format were also not significant, Wilks's λ = .97, F(3, 103) = 1.23, p = .30, and Wilks's λ = .99, F(3, 103) < 1, ns, respectively. In contrast to Hypothesis 2d, the triple interaction of metacognitive control by example format by feedback format was also not significant, Wilks's λ = .93, F(3, 103) = 2.58, p = .06. The MANCOVA further showed that time-on-task had no significant effect on diagnostic competence, Wilks's λ = .98,

Table 2
Means (and SD) of prior knowledge, time-on-task, metacognitive control, domain-specific conceptual knowledge, strategic knowledge, conditional knowledge, and cognitive load in the four learning conditions.

                         With errors                    Without errors
                         Elaborated      KOR            Elaborated      KOR
                         (n = 36)        (n = 41)       (n = 40)        (n = 36)
Prior knowledge          16.13 (2.43)    15.88 (3.56)   15.91 (3.28)    14.79 (2.96)
Time-on-task (min)       60.32 (15.12)   50.93 (13.09)  69.28 (20.33)   52.50 (16.32)
Metacognitive control    4.68 (0.49)     4.72 (0.74)    4.57 (0.56)     .51 (0.64)
Conceptual knowledge     19.13 (2.22)    17.44 (2.99)   18.27 (3.03)    17.28 (3.16)
Strategic knowledge      26.40 (3.27)    23.25 (3.71)   25.76 (4.10)    25.29 (3.56)
Conditional knowledge    13.99 (4.05)    9.68 (4.56)    13.49 (5.39)    10.71 (4.22)
Cognitive load           3.06 (0.63)     3.53 (0.75)    2.87 (0.83)     3.15 (0.70)


F(3, 103) < 1, ns. In contrast to Hypothesis 3a, the interaction between time-on-task and example format was not significant, Wilks's λ = .94, F(3, 103) = 2.31, p = .08, whereas the interaction with feedback format was significant, Wilks's λ = .92, F(3, 103) = 2.88, p = .04, partial η² = .08. The triple interaction was not significant, Wilks's λ = .99, F(3, 103) < 1, ns. The ANCOVAs for feedback format effects with time-on-task as covariate were significant for strategic and conditional knowledge, F(1, 105) = 7.48, p = .01, partial η² = .07, and F(1, 105) = 6.03, p = .02, partial η² = .05, respectively, but not for domain-specific conceptual knowledge, F(1, 105) = 1.28, p = .26. The pattern of regression coefficients showed that in the conditions with elaborated feedback, investing more learning time supported the acquisition of strategic knowledge, b = .05, p = .05, R² = .28, whereas in the conditions with KOR-feedback, more learning time was associated with lower strategic knowledge, b = -.06, p = .09, R² = .25. The same held true for conditional knowledge: for elaborated feedback, b = .07, p = .03, R² = .28; for KOR-feedback, b = -.06, p = .09, R² = .26. Therefore, Hypothesis 3b was confirmed for strategic and conditional knowledge.

To sum up, all aspects of diagnostic competence were fostered by elaborated feedback. A strong prior knowledge effect was identified in all conditions, whereas no effect of metacognitive control was found. Time-on-task enhanced strategic and conditional knowledge when learners received elaborated feedback.

3.4.4. Effects on cognitive load
Cognitive load correlated significantly and negatively with all three aspects of diagnostic competence: with domain-specific conceptual knowledge, r = -.28, p = .01; with strategic knowledge, r = -.34, p = .01; with conditional knowledge, r = -.27, p = .01. Thus, learners with higher cognitive load scores in the learning sessions were less successful in the posttests.
Students in the condition "with errors and KOR-feedback" reported the highest cognitive load scores; the lowest scores appeared in the condition "without errors and KOR-feedback" (see Table 2). The main effect of example format was significant, F(1, 115) = 4.08, p = .05, partial η² = .03. The main effect of feedback format was also significant, F(1, 115) = 7.09, p = .01, partial η² = .06. In line with Hypothesis 4a, cognitive load was increased by erroneous examples; it was also increased by KOR-feedback. The interaction of example format with feedback format, however, was not significant, F(1, 115) < 1, ns. Therefore, Hypothesis 4b was not fully corroborated. None of the three covariates had a significant influence on cognitive load: for prior knowledge, F(1, 115) = 2.94, p = .09; for both metacognitive control and time-on-task, F(1, 115) < 1, ns.

3.5. Discussion
In Study 2, the complexity of the tests used for assessing diagnostic competence was adequate; no floor or ceiling effects appeared. Elaborated feedback supported all aspects of diagnostic competence, most of all conditional knowledge, whereas erroneous examples as such did not influence knowledge acquisition. In addition, there was no interaction effect between the two instructional means. Coping with erroneous examples increased cognitive load, and receiving only KOR-feedback also led to higher cognitive load. As cognitive load and learning outcomes correlated negatively, and cognitive load was independent of prior knowledge and metacognitive control, the conclusion is self-evident: in Study 2, erroneous examples and KOR-feedback induced processes that, at least to some extent, interfered with effective learning. These findings are in line with the literature on learning from errors (Oser & Spychiger, 2005), with the feedback literature (Hattie & Timperley, 2007), and with studies on the conditions and effects of cognitive load (Sweller et al., 1998).

Analogous to Study 1, elaborated feedback substantially increased time-on-task. However, as time-on-task was controlled for in the subsequent analyses and, in addition, was independent of learning outcomes, the internal validity of the study was not compromised. The large prior knowledge effect was independent of the learning condition. As prior knowledge was assessed with the multiple-choice test also used for domain-specific conceptual knowledge, competence aspects that likely interact with example and feedback format were probably not tapped.

Interestingly, metacognitive control had no influence on learning outcomes and did not interact with example and feedback format. These findings, which do not correspond with studies on the relevance of metacognitive control for complex, self-regulated learning (Stark et al., 2008), might indicate an artefact caused by a restriction of variance, which in turn can be explained by the positively selected population of medical students under investigation (learners in all conditions showed relatively high means in metacognitive control).
However, we also cannot exclude that general validity problems of retrospective self-report scales (Artelt, 2000) are responsible for the absence of any significant effect of metacognition.

4. General discussion
In the context of a web-based learning environment with complex case-based examples, the effects of example and feedback format on diagnostic competence were investigated in two relevant domains of internal medicine (arterial hypertension and hyperthyroidism). In both studies, the effects of example and feedback format differed with respect to the aspect of diagnostic competence under consideration. In Study 1, erroneous examples had to be combined with elaborated feedback in order to compensate for the increased complexity and the higher cognitive load that erroneous examples impose, at least with respect to the two more complex competence aspects, namely strategic and conditional knowledge. In Study 2, the change of domain led to higher complexity of the learning material. Under these conditions, elaborated feedback became even more important, especially for enhancing conditional knowledge. To develop this competence aspect, deep conceptual understanding is necessary, and even advanced medical students need additional explanations to reach these high aims.


As the effects of our approach on diagnostic competence proved sustainable in another study (Kopp, Stark, & Fischer, 2008), the tested approach should be integrated into the regular clinical curriculum. However, as the pattern of results was rather complex and diverged between the two studies, it can be concluded that the effectiveness of different example and feedback formats varies with the complexity of the domain and of the respective learning material. In correspondence with broad evidence in learning and instruction (Dochy et al., 1999), prior knowledge proved the strongest predictor of diagnostic competence, clearly stronger than time-on-task and also than example and feedback format.

In order to increase the effects of erroneous examples, the processing of erroneous diagnostic solutions and of the respective (feedback) explanations has to be intensified. Studies on example-based learning indicate that the quality of self-explanation can be improved economically by simple prompting procedures (Renkl, 1997; Schworm & Renkl, 2007). Therefore, in further studies, specific prompting procedures focussing on errors in the diagnostic process and their explanation should be tested.

The second study indicated the importance of elaborated feedback in complex learning scenarios (Hattie & Timperley, 2007); however, this finding can also be interpreted in terms of the necessity of optional feedback procedures. Not all learners need fully elaborated feedback for each step in the diagnostic process. Therefore, different stages of elaboration (for instance, explanations from a clinical perspective with or without additional pathophysiological information) should be offered to the learners so that they can decide how much explanation they need in order to improve their understanding. The effectiveness of such optional feedback procedures should also be investigated in further studies.

4.1. Limitations of the study
A limitation of the present study concerns the lack of measures of the mechanisms that underlie the effects of example and feedback formats. It is important to learn more about schema induction in general and especially about the self-explanation processes mediating the effects of example and feedback format in the context of case-based worked-example approaches. To this end, think-aloud protocols should be used, accompanied by interviews that can highlight the learners' subjective perspective on their specific learning behaviour. In addition, aspects of metacognitive control should be assessed by methods relying less on retrospective self-reports. In further studies, it would be interesting to test the applicability of ambulatory assessment methods focussing on metacognitive behaviour in the learning process (Fahrenberg, Myrtek, Pawlik, & Perrez, 2007).

The applicability of the instructional approach implemented in the present study is limited neither to complex domains nor to medical education. In the future, the content information could be structured so that it is presented in the schematic and stepwise manner of the worked examples, so that effects of the presentation format can also be tested.


Acknowledgements
The research for this article was funded by the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG) (FI 720/2-1, STA 596/3-1).

Appendix A. Clinical situation presented in the learning environment with or without errors

Your next patient is the 62-year-old Mr Schneider. He is feeling tired and exhausted. He says he is no longer able to climb stairs the way he used to, but that he has no additional problems. He is able to sleep lying down at night, and he occasionally suffers from water retention in his legs during hot weather. Arterial hypertension, which he has had for many years, is his only pre-existing condition; there are no other cardiovascular risk factors. Treatment consists of a diuretic (hydrochlorothiazide 25 mg, once daily), a beta-blocker (metoprolol 50 mg, twice daily), and an ACE inhibitor (enalapril 5 mg, once daily). Mr Schneider reports that the blood pressure measurements he has been taking himself have recently increased to around 160/95 mmHg on most days. He presents his carefully collected records.

With errors
As a working diagnosis, you assume chronic biventricular heart failure due to a hypertensive coronary illness resulting from long-term primary arterial hypertension. You increase the ACE inhibitor dosage (enalapril 10 mg, twice daily) and advise Mr Schneider to lose some weight. Because the case history constellation is so typical, you forgo any further tests and recommend that Mr Schneider make an appointment to see you again in 3 months.

Without errors
As a working diagnosis, you assume chronic biventricular heart failure due to a hypertensive coronary illness resulting from long-term primary arterial hypertension. But you also consider the possibility of secondary causes of the hypertension. You plan to examine the patient thoroughly and to have an ECG done afterwards.

Appendix B. Implementation of elaborated feedback

After the presentation of the erroneous step
Your considerations are correct, but you commence treatment too quickly. In cases of loss of performance and suspected heart failure, a careful physical examination is always required.


After the presentation of the correct step
Your considerations and subsequent steps are correct.

Afterwards, the following information was provided in both conditions:
The intermittent leg oedema supports right-sided heart failure; the tiredness and exhaustion could be explained by left-sided heart failure with pulmonary congestion and exertional dyspnoea. An ECG should be carried out as part of the basic diagnostics, also to investigate the question of coronary heart disease and cardiac hypertrophy. Due to its high prevalence of about 90%, the cause is likely to be so-called primary or essential arterial hypertension. In the differential diagnosis, you should also consider secondary hypertension, anaemia, chronic kidney failure, or a primary neuromuscular illness, especially in cases of therapy resistance under anti-hypertensive triple-drug treatment. In the absence of accompanying vegetative symptoms, pheochromocytoma appears unlikely.

Appendix C. Item of the multiple-choice test measuring domain-specific conceptual knowledge

Please choose the correct answer: Which of the following substances are typically excreted at an increased rate due to a pheochromocytoma?
A: hydroxy indolic acid and hydroxyproline
B: hydroxy indolic acid and catecholamines
C: metanephrine and hydroxyproline
D: catecholamines and metanephrine (correct response)
E: hydroxyproline and catecholamines

Appendix D. Key-feature problem measuring strategic knowledge

Your patient at the clinic is Mrs Meier, a 62-year-old smoker with pre-existing essential (primary) hypertension. She has complained of high blood pressure for the past two months, despite using three anti-hypertensive medications as directed. Before this, her blood pressure had been well regulated. She would like to know what to do. Physical exam: RR 165/95 mmHg, height 163 cm, weight 74 kg, BMI 27.8 with truncal obesity, no relevant previous illnesses.
1) Name one current working diagnosis (an umbrella term is sufficient).
2) Please name two basic diagnostic tests that you will order.
Her electrolyte laboratory results show a low-normal potassium level (3.6 mmol/l) and no other significant findings. Creatinine (0.9 mg/dl) and urea (18 mg/dl) levels are normal.

Urinalysis: normal. Abdominal ultrasound: kidneys within normal limits; a relevant mass is found in the area of the right adrenal gland (diameter 3 cm).
3) Please name three tests that you will order to verify whether the mass is hormone-producing and therefore causing secondary hypertension.

Appendix E. Problem-solving task measuring conditional knowledge

After presenting a short case scenario:
1) Please name your working diagnosis and give your clinical reasoning.
2) You measure the blood pressure on the right arm while the patient is seated: 160/95 mmHg, HR 79/min. There are no abnormal findings on physical examination. The blood tests reveal a sodium of 140 mmol/l and a hypokalaemia of 3.2 mmol/l. The rest of the lab results, including creatinine and urea, are normal, as is the urinalysis. Please name possible causes of the hypokalaemia in this case and the underlying pathomechanism for each cause.
3) Abdominal ultrasound reveals a mass in the area of the right adrenal gland. You suspect primary hyperaldosteronism. Please name the diagnostic tests you would like to order, which results you expect from these tests, and why.

References

Al-Assaf, A. F., Bumpus, L. J., Carter, D., & Dixon, S. B. (2003). Preventing errors in healthcare: a call for action. Hospital Topics, 81, 5–12.
Artelt, C. (2000). Wie prädiktiv sind retrospektive Selbstberichte über den Gebrauch von Lernstrategien für strategisches Lernen? [How predictive are retrospective self-reports concerning the use of learning strategies for strategic learning?] Zeitschrift für Pädagogische Psychologie, 14, 72–84.
Atkinson, R. K., Derry, S. J., Renkl, A., & Wortham, D. (2000). Learning from examples: instructional principles from the worked examples research. Review of Educational Research, 70(2), 181–214.
Balzer, W. K., Doherty, M. E., & O'Connor, R., Jr. (1989). Effects of cognitive feedback on performance. Psychological Bulletin, 106(3), 410–433.
Bordage, G., Brailovsky, C., Carretier, H., & Page, G. (1995). Content validation of key features on a national examination of clinical decision-making skills. Academic Medicine, 70, 276–281.
Boshuizen, H. P. A., Schmidt, H. G., Custers, E. J. F. M., & van de Wiel, M. W. (1995). Knowledge development and restructuring in the domain of medicine: the role of theory and practice. Learning and Instruction, 5, 269–289.
Chi, M. T. H., Bassok, M., Lewis, M. W., Reimann, P., & Glaser, R. (1989). Self-explanations: how students study and use examples in learning to solve problems. Cognitive Science, 13, 145–182.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences. New York: Academic.
Collins, A., Brown, J. S., & Newman, S. E. (1989). Cognitive apprenticeship: teaching the craft of reading, writing and mathematics. In L. B. Resnick (Ed.), Knowing, learning and instruction: Essays in honor of Robert Glaser (pp. 453–494). Hillsdale, NJ: Erlbaum.

Cooper, G., & Sweller, J. (1987). Effects of schema acquisition and rule automation on mathematical problem-solving transfer. Journal of Educational Psychology, 79, 347–362.
Curry, L. A. (2004). The effects of self-explanations of correct and incorrect solutions on algebra problem-solving performance. In K. Forbus, D. Gentner, & T. Regier (Eds.), Proceedings of the 26th annual conference of the Cognitive Science Society (p. 1548). Mahwah, NJ: Erlbaum.
Dempsey, J. V., Driscoll, M. P., & Swindell, L. K. (1993). Text-based feedback. In J. V. Dempsey & G. C. Sales (Eds.), Interactive instruction and feedback (pp. 21–54). Englewood Cliffs, NJ: Educational Technology.
Dochy, F., Segers, M., & Buehl, M. M. (1999). The relation between assessment practices and outcomes of studies: the case of research on prior knowledge. Review of Educational Research, 69, 145–186.
Fahrenberg, J., Myrtek, M., Pawlik, K., & Perrez, M. (2007). Ambulantes Assessment – Verhalten im Alltagskontext erfassen. Eine verhaltenswissenschaftliche Herausforderung an die Psychologie. [Ambulatory assessment – capturing behaviour in everyday contexts. A behavioural-science challenge for psychology]. Psychologische Rundschau, 58, 12–23.
Fischer, M. R. (2000). CASUS – an authoring and learning tool supporting diagnostic reasoning. Zeitschrift für Hochschuldidaktik, 1, 87–98.
Flavell, J. H. (1984). Annahmen zum Begriff Metakognition sowie zur Entwicklung von Metakognition. [Assumptions about the concept of metacognition and the development of metacognition]. In F. E. Weinert & R. H. Kluwe (Eds.), Metakognition, Motivation und Lernen (pp. 9–21). Stuttgart, Deutschland: Kohlhammer.
Gettinger, M. (1984). Individual differences in time needed for learning: a review of the literature. Educational Psychologist, 19, 15–29.
Graber, M., Gordon, R., & Franklin, N. (2002). Reducing diagnostic errors in medicine: what's the goal? Academic Medicine, 77, 981–992.
Große, C. S., & Renkl, A. (2004). Learning from worked examples: what happens if errors are included? In P. Gerjets, J. Elen, R. Joiner, & P. Kirschner (Eds.), Instructional design for effective and enjoyable computer-supported learning (pp. 356–364). Tübingen, Germany: Knowledge Media Research Center.
Große, C. S., & Renkl, A. (2007). Finding and fixing errors in worked examples: can this foster learning outcomes? Learning and Instruction, 17, 612–634.
Hasselhorn, M. (1998). Metakognition. [Metacognition]. In D. Rost (Ed.), Handwörterbuch Pädagogische Psychologie (pp. 348–351). Weinheim, Deutschland: Beltz.
Hattie, J., & Timperley, H. (2007). The power of feedback. Review of Educational Research, 77(1), 81–112.
Jacoby, J., Troutman, T., Mazursky, D., & Kuss, A. (1984). When feedback is ignored: disutility of outcome feedback. Journal of Applied Psychology, 69(3), 531–545.
Kopp, V., Stark, R., & Fischer, M. R. (2008). Fostering diagnostic competence through computer-supported, case-based worked examples: effects of erroneous examples and feedback. Medical Education, 42, 823–829.
Lan, W. Y., Bradley, L., & Parr, G. (1993). The effects of a self-monitoring process on college students' learning in an introductory statistics course. The Journal of Experimental Education, 62, 26–40.
Norman, G. R., & Schmidt, H. G. (2000). Effectiveness of problem-based learning curricula: theory, practice and paper darts. Medical Education, 34, 721–728.
Oser, F., & Spychiger, M. (2005). Lernen ist schmerzhaft. Zur Theorie des negativen Wissens und zur Praxis der Fehlerkultur. [Learning is painful. On the theory of negative knowledge and the practice of error management]. Weinheim, Deutschland: Beltz.


Paas, F., & Kalyuga, S. (2005, May). Cognitive measurements to design effective learning environments. Paper presented at the International Workshop and Mini-conference on Extending Cognitive Load Theory and Instructional Design to the Development of Expert Performance, Heerlen, The Netherlands.
Paas, F., & van Gog, T. (2006). Optimising worked example instruction: different ways to increase germane cognitive load. Learning and Instruction, 16, 87–91.
Paris, S. G., Lipson, M. Y., & Wixson, K. K. (1983). Becoming a strategic reader. Contemporary Educational Psychology, 8, 293–316.
Pintrich, P. R., Smith, D. A. F., Garcia, T., & McKeachie, W. J. (1993). Reliability and predictive validity of the Motivated Strategies for Learning Questionnaire (MSLQ). Educational and Psychological Measurement, 53, 801–813.
Renkl, A. (1997). Learning from worked-out examples: a study on individual differences. Cognitive Science, 21, 1–29.
Renkl, A. (2005). The worked-out-example principle in multimedia learning. In R. Mayer (Ed.), Cambridge handbook of multimedia learning (pp. 229–246). Cambridge, UK: Cambridge University Press.
Schworm, S., & Renkl, A. (2007). Learning argumentation skills through the use of prompts for self-explaining examples. Journal of Educational Psychology, 99(2), 285–296.
Smith, P. L. (1988). Toward a taxonomy of feedback: Content and scheduling. Paper presented at the Annual Meeting of the Association for Educational Communications and Technology, New Orleans, Louisiana.
Stark, R. (1999). Lernen mit Lösungsbeispielen. Einfluß unvollständiger Lösungsbeispiele auf Beispielelaboration, Lernerfolg und Motivation. [Learning from worked examples. Effects of incomplete examples on example elaboration, learning outcomes and motivation]. Göttingen, Deutschland: Hogrefe.
Stark, R., Gruber, H., Mandl, H., & Hinkofer, L. (2001). Wege zur Optimierung eines beispielbasierten Instruktionsansatzes: der Einfluss multipler Perspektiven und instruktionaler Erklärungen auf den Erwerb von Handlungskompetenz. [Ways of optimizing an example-based instructional approach: influence of multiple perspectives and instructional explanations on the acquisition of action competence]. Unterrichtswissenschaft, 29, 26–40.
Stark, R., Tyroller, M., Krause, U.-M., & Mandl, H. (2008). Effekte einer metakognitiven Promptingmaßnahme beim situierten, beispielbasierten Lernen im Bereich Korrelationsrechnung. [Effects of a metacognitive prompting intervention in situated, example-based learning in the domain of correlation]. Zeitschrift für Pädagogische Psychologie, 22(1), 59–71.
Sweller, J. (1988). Cognitive load during problem solving: effects on learning. Cognitive Science, 12, 257–285.
Sweller, J., & Cooper, G. (1985). The use of worked examples as a substitute for problem solving in learning algebra. Cognition and Instruction, 2(1), 59–89.
Sweller, J., van Merriënboer, J. J. G., & Paas, F. (1998). Cognitive architecture and instructional design. Educational Psychology Review, 10, 251–296.
Van Gog, T., Paas, F., & van Merriënboer, J. J. G. (2008). Effects of studying sequences of process-oriented and product-oriented worked examples on troubleshooting transfer efficiency. Learning and Instruction, 18, 211–222.
VanLehn, K. (1999). Rule-learning events in the acquisition of a complex skill: an evaluation of CASCADE. The Journal of the Learning Sciences, 8, 71–125.
Zimmerman, B. J. (2000). Attaining self-regulation: a social cognitive perspective. In M. Boekaerts, P. R. Pintrich, & M. Zeidner (Eds.), Handbook of self-regulation (pp. 13–35). San Diego, CA: Academic.