Using heuristic worked examples to promote inquiry-based learning

Learning and Instruction 29 (2014) 56–64
Yvonne G. Mulder*, Ard W. Lazonder, Ton de Jong
University of Twente, Department of Instructional Technology, P.O. Box 217, 7500 AE Enschede, The Netherlands

Article history: Received 4 September 2012; received in revised form 23 August 2013; accepted 26 August 2013

Abstract

Inquiry learning can be facilitated by having students investigate the domain through a computer simulation and express their acquired understanding in a runnable computer model. This study investigated whether heuristic worked examples can further enhance students’ inquiry behaviour, the quality of the models they create, and their domain knowledge. High-school students were offered a simulation of an electrical circuit and a modelling tool. Students in the experimental condition (n = 46) could consult heuristic worked examples that explained what activities were needed and how they should be performed. Students in the control condition (n = 36) did not receive this support. Cross-condition comparisons confirmed that heuristic worked examples improved students’ inquiry behaviour and enhanced the quality of their models. However, few students created a model that reflected full understanding of the electrical circuit, and the expected between-group difference in posttest scores failed to appear. Based on these findings, improvements to the design of heuristic worked examples are proposed.

Keywords: inquiry learning; model-based learning; model progression; worked examples

1. Introduction

Recent meta-analyses have concluded that inquiry learning can benefit students and can lead to better student performance than more direct forms of instruction (Alfieri, Brooks, Aldrich, & Tenenbaum, 2011; Minner, Levy, & Century, 2010). However, these meta-analyses also suggest that these benefits only hold when students are supported during their inquiry activities. This support is needed to compensate for students’ modest inquiry skills, their prior knowledge deficits, or both. De Jong and van Joolingen’s (1998) review revealed a broad variety of skill deficiencies in simulation-based inquiry learning. When students learn about phenomena through systematic experimentation with a simulation, they are generally unable to infer hypotheses from data, design conclusive experiments, engage in efficient experimentation behaviour, and attend to incompatible data. Similar problems arise when students engage in scientific modelling (hereafter: modelling) to create computer models of their understanding of scientific phenomena. Hogan and Thomas (2001), for example, noticed that students often fail to engage in dynamic iterations between examining output and revising models, and Stratford, Krajcik, and Soloway (1998) observed a lack of persistence in debugging models to fine-tune their performance.

* Corresponding author. Tel.: +31 53 489 4857. E-mail address: [email protected] (Y.G. Mulder).
http://dx.doi.org/10.1016/j.learninstruc.2013.08.001

Mulder, Lazonder, and de Jong (2010) examined whether these results generalize to a learning task where simulation-based inquiry and modelling are combined (cf. Basu, Dickes, Kinnebrew, Sengupta, & Biswas, 2013; van Joolingen, de Jong, Lazonder, Savelsbergh, & Manlove, 2005). This combined approach enabled students to learn about a scientific phenomenon by experimenting with a simulation. Once students had developed an initial understanding of the phenomenon, they built a runnable model to express their knowledge. This model can be thought of as a set of hypotheses students can test by running the model and checking its output against data from the simulation. Based on this evaluation students can refine their understanding through additional experimentation with the simulation and further revision of their model. Mulder et al. found that domain novices are quite capable of identifying which variables to include in their models, but have difficulty inferring how these variables are related. Instead of working step-by-step toward a full-fledged scientific equation to specify a relationship, novices tried to induce and model these equations from scratch, which proved to be ineffective given their lack of prior domain knowledge. These findings suggest that students could benefit from support that prevents them from ‘jumping the gun’ and that better attunes their inquiry and modelling activities to their level of domain knowledge (cf. Quintana et al., 2004). This support can be offered in a non-intrusive way by organizing the learning task according to a simple-to-complex sequence that matches the students’ increasing levels of domain understanding.


This type of task structuring was first introduced by White and Frederiksen (1990), who termed it ‘model progression’. Model progression was found to lead to higher performance success in some studies (Alessi, 1995; Eseryel & Law, 2010; Rieber & Parmley, 1995; Swaak, van Joolingen, & de Jong, 1998), but other studies report less favourable results (de Jong et al., 1999; Quinn & Alessi, 1994). These differential effects might be attributable to the use of slightly different configurations of the simple-to-complex sequencing. Some studies introduced students to all of the learning content at once and engaged them in increasingly specific reasoning about the task content (i.e., model order progression), whereas students in other studies engaged in specific reasoning from the start and were confronted with increasingly elaborate task content (i.e., model elaboration progression).

Mulder, Lazonder, and de Jong (2011) implemented both types of model progression in a simulation-based inquiry and modelling task about the charging of a capacitor in an electrical circuit. Both types divided the task into three successive phases, but differed with regard to the sequencing principle that determined how task complexity increased across these phases. Model order progression, the predicted optimal variant, gradually increased the specificity of the relations between variables. In Phase 1, students had to identify all relevant variables and relations and sketch the model outline. In Phase 2, they had to indicate a general direction of the effect for these relations, and in Phase 3 they had to specify these relationships quantitatively in the form of an equation. Model elaboration progression, by contrast, gradually expanded the number of variables in the task. Students had to investigate and model an electrical circuit with a voltage source and one light bulb in Phase 1. An additional light bulb was introduced in Phase 2, and a capacitor was added in Phase 3. Students who were supported by either type of model progression outperformed students from an unsupported control condition. A comparison between the two model progression variants further showed that students in the model order group outperformed those from the model elaboration group on the construction of relations in their models. However, in this study even students in the best-performing model progression group produced mediocre models. In a follow-up study, attempts to optimize model progression also failed to substantially improve students’ performance (Mulder, Lazonder, de Jong, Anjewierden, & Bollen, 2012).

Unfortunately, it is not uncommon for scaffolding to have little success in enhancing what students learn from modelling tasks. For instance, the Manlove, Lazonder, and de Jong (2009) studies showed that students often do not take full advantage of the support offered by regulative scaffolds, which causes their performance to remain somewhat poor. Likewise, Roscoe, Segedy, Sulcer, Jeong, and Biswas (2013) provided students with hints that offered content feedback. Although these hints were positively associated with students’ performance, students gradually came to rely on this tool. This was considered shallow strategy development, as it negatively impacted the efficacy of the learning task. As such, offering direct support risks affecting students’ learning activities, but not their learning outcomes.
In a recent review, VanLehn (2013) thus argues that scaffolds for learning should guide students through the learning process instead of providing only content feedback. Hence, students might benefit from a more explicit account of what the activities in each model progression phase entail and how they should be performed. Such support could take the form of worked examples, which have proved to be a fruitful means to enhance problem-solving performance (e.g., Atkinson, Derry, Renkl, & Wortham, 2000; Sweller & Cooper, 1985). Worked examples essentially include a problem statement, a step-by-step account of the procedure to solve the problem, and the final solution. Worked examples have traditionally been applied to well-structured problems that have a straightforward, algorithmic solution process. Research has shown that studying a series of worked examples, either to prepare for or instead of problem-solving practice, is more effective than conventional, unsupported problem solving (see, for a review, Atkinson et al., 2000; Sweller, Ayres, & Kalyuga, 2011). Other studies have tried to optimize the presentation and use of worked examples. To minimize shortcomings such as mere rote recall of the information, worked example instruction can be enhanced by eliciting self-explanations (Atkinson, Renkl, & Merrill, 2003; Chi, Bassok, Lewis, Reimann, & Glaser, 1989), presenting the rationale behind the presented solution (van Gog, Paas, & van Merriënboer, 2008), or offering meta-level feedback (Moreno, Reisslein, & Ozogul, 2009).

However, the effectiveness of problem-solving support methods does not necessarily generalize to inquiry learning tasks. Inquiry and modelling are iterative processes in which the scientific reasoning skills of hypothesizing, experimenting, and evaluating evidence are performed repeatedly. The nature of the hypotheses, the way they are examined, and the outcomes of these investigations all determine what would be the next logical step in order to induce and model the characteristics of the phenomenon at hand (Klahr & Dunbar, 1988; White, Shimoda, & Frederiksen, 1999). Capturing this complex cognitive activity in a fixed, algorithmic sequence of action steps would neither be possible nor do justice to the true nature of the inquiry and modelling process, and would therefore presumably cause students to develop a limited understanding of the task content.

Hilbert and colleagues acknowledged this limitation of traditional worked examples and proposed a variant that can be applied in non-algorithmic problem-solving situations (Hilbert & Renkl, 2009; Hilbert, Renkl, Kessler, & Reiss, 2008). These so-called heuristic worked examples do not emphasize the specific action sequence students should follow to solve a problem, but exemplify the heuristic reasoning underlying the choice and application of this action sequence. This shift in focus has broadened the application of worked examples from well-structured, algorithmic problem-solving tasks to more ill-structured, and hence more complex, learning tasks. Recent reviews of worked-examples research have demonstrated that heuristic worked examples can be applied effectively in a variety of domains such as mathematical proofs, concept mapping, and second language learning (Renkl, Hilbert, & Schworm, 2009; Sweller et al., 2011).

Heuristic worked examples also hold promise for supporting students’ inquiry and modelling activities. Both processes are iterative by nature and require students to consider previously performed activities and results in order to decide which actions to perform next. These decisions have been found to be problematic because students have an insufficient understanding of the inquiry and modelling process (Mulder et al., 2011). Heuristic worked examples could help alleviate this problem by exemplifying these processes (i.e., hypothesis generation, experimentation, and evidence evaluation) and showing the heuristic reasoning for cycling through these processes effectively.
As the design of informative simulation experiments is challenging for students (de Jong & van Joolingen, 1998), explicit attention was given to the design of unconfounded experiments using the Control-of-Variables Strategy (CVS; Chen & Klahr, 1999). The heuristic worked examples also show how the interpretation of data from these experiments can subsequently lead to an (initial) understanding of the phenomenon, which can then be represented and tested in a model. In this way, students are shown how to set up systematic experiments with the simulation, and how modelling can be integrated into the inquiry process.


Fig. 1. Screen capture of the simulation tool (left panel) and the model editor tool with the reference model (right panel).

2. The present study and hypotheses

The purpose of the present study was to determine the instructional efficacy of heuristic worked examples in an inquiry learning environment with modelling facilities. The study utilized a between-subjects design with two conditions. The learning environment in both conditions was designed in accordance with Mulder et al.’s (2011) implementation of model order progression, so that students had to build models of increasingly greater specificity in three consecutive phases. In the experimental condition, heuristic worked examples were available for each model progression phase. These examples demonstrated the heuristic strategies students should apply in choosing and performing their actions. Students in the control condition received no such support.

Five hypotheses were investigated. Since explicit attention was given to the design of unconfounded experiments using the Control-of-Variables Strategy in the heuristic worked examples, it was predicted that the heuristic worked examples would induce students to conduct more experiments (Hypothesis 1) and that a larger proportion of these experiments would be unconfounded and thus vary only one variable at a time (Hypothesis 2). The heuristic worked examples also showed appropriate integration of the simulation, model, and data-interpretation activities, demonstrating that students should always interpret the data from their simulation experiments prior to translating their hypotheses into a model. Therefore, Hypothesis 3 predicted that the worked examples would influence students’ integration of simulation, model, and data-inspection activities. More specifically, the students having worked examples were expected to use the data inspection tool more often after experimenting with the simulation than their control counterparts. Hypothesis 4 predicted that students who were supported with heuristic worked examples would create better models than students who were not. Finally, Hypothesis 5 predicted that students from the worked examples condition would acquire more domain knowledge, as evidenced by their posttest scores, than students from the control condition.

3. Method

3.1. Participants

Participants were 107 Dutch high school students from a science track. The average age of the students was 15.51 years (SD = 0.42). A review of school curricula showed that the charging of capacitors, which was the topic of inquiry, had not yet been taught in these students’ physics classes. The students’ teachers confirmed that this was the case.

A prior knowledge test (see Section 3.2.3) was administered to substantiate that participants were indeed domain novices. Participants were matched to conditions on the basis of class-ranked prior knowledge test scores. Student absenteeism reduced the original sample to 92 students who were present during the introductory session (worked examples: n = 48; control: n = 44), of whom 82 students completed the task (worked examples: n = 46; control: n = 36), and 77 students also completed the posttest (worked examples: n = 43; control: n = 34). The latter two sample reductions were due to student illness and unforeseen schedule changes in one of the classrooms.

3.2. Materials

3.2.1. Inquiry task and learning environment

All participants worked on an inquiry task about the charging of a capacitor in an electrical circuit. This topic lends itself well to system dynamics modelling (the charging of a capacitor is a process that changes over time), and has been successfully used in learning tasks that incorporate model order progression (White & Frederiksen, 1990). The students’ assignment was to examine and model the influence and interactions of each element in the electrical circuit presented in a simulation. Participants performed this task within a modified stand-alone version of the Co-Lab learning environment (van Joolingen et al., 2005). The learning environment housed a simulation tool that represented an electrical circuit containing a voltage source, two light bulbs, and a capacitor (see Fig. 1, left panel). Participants could experiment with this simulation to find out how these components behave. When they had figured out the model underlying the simulation, participants could use the model editor tool to represent their ideas in a runnable system dynamics model. As shown in Fig. 1 (right panel), these models have a graphical structure that consists of variables and relations. Variables are the constituent elements of a model, and relations define how two or more variables interact. Students could experiment with both the simulation and the model editor. The output of these experiments could be analysed through a bar chart, table, or graph tool. An embedded help file contained the assignment and offered explanations of the operation of the tools in the learning environment. The help file contained no domain information on electrical circuits and capacitors.

Model order progression was implemented by dividing the modelling task into three successive phases. In Phase 1, students had to indicate the model elements (variables) and which ones affected which others (relationships), but not how they affected them. In Phase 2, students had to provide a qualitative specification of each relationship (e.g., if resistance increases, then current decreases). In Phase 3, students had to specify each relationship quantitatively in the form of an equation (e.g., I = V/R).
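To make the Phase 3 target concrete, the following is a minimal sketch, in Python, of the kind of runnable quantitative model students were working toward: the capacitor’s charge is a stock, the current is its inflow, and the current follows Ohm’s law with the voltage remaining across the resistor. The constants and variable names are illustrative assumptions, not the values or labels used in the Co-Lab reference model.

```python
# Minimal sketch of a Phase 3-style quantitative model of capacitor charging.
# Illustrative stand-in for a system dynamics model (stock = charge, flow =
# current), not the actual Co-Lab reference model; names and values are assumed.

V_SOURCE = 9.0      # voltage source (V)
R = 1000.0          # resistance in the circuit (ohm)
C = 0.001           # capacitance (F)
DT = 0.01           # integration step (s)

def run_model(t_end: float = 10.0):
    """Euler integration of dQ/dt = I = (V_source - Q/C) / R."""
    charge = 0.0                            # stock: charge on the capacitor (C)
    history = []
    t = 0.0
    while t <= t_end:
        v_capacitor = charge / C            # voltage across the capacitor
        v_resistor = V_SOURCE - v_capacitor # Kirchhoff's loop rule
        current = v_resistor / R            # Ohm's law: I = V / R
        history.append((t, charge, current))
        charge += current * DT              # the flow accumulates into the stock
        t += DT
    return history

if __name__ == "__main__":
    for t, q, i in run_model()[::100]:
        print(f"t={t:5.2f} s  Q={q:.5f} C  I={i:.5f} A")
```

Running such a model and comparing its output with data from the simulation corresponds to the hypothesis-testing cycle described above.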


As students should only progress to the next phase after they had achieved a reasonable understanding, returning to previous phases was deemed unnecessary and therefore prohibited.

3.2.2. Worked examples

Heuristic worked examples (hereafter: worked examples) were designed for each model progression phase. Research on the presentation format of worked examples advocates presenting examples as an action sequence over time in order to foster student learning (Lewis & Barron, 2009; Lusk & Atkinson, 2007). The worked examples in the present study therefore came in the form of an annotated streaming video that contained a dynamic screen capture of a person performing an inquiry and modelling task (see Fig. 2 for an example of the videos). This task was situated in a different, yet familiar context: the inflow and outflow of money.

Fig. 2. Screen captures from a heuristic worked example video that shows CVS information (a), shows a series of experiments with a simulation of the example domain varying only one variable at a time (b), and provides an interpretation of the results for the experiments (c).


This context was chosen because it was known to the students (it was also used in the introductory session to familiarize them with the learning environment), and familiarity with the domain used has been found to be pivotal for skill acquisition from worked examples (Renkl et al., 2009).

Seven worked examples were created: one general introductory example to reacquaint students with the domain used, and two specific examples for each model progression phase. The latter examples demonstrated the inquiry processes of hypothesis generation, experimentation, and evidence evaluation, as well as the heuristic strategies students should apply to cycle through these processes effectively. In each model progression phase, one worked example displayed these processes and strategies for the students’ inquiry activities with the simulation; a second example concerned the use of these processes and strategies during modelling. Both worked examples together showed how to coordinate simulation, model, and data-inspection activities. More specifically, the simulation worked example for Phase 1 demonstrated how students could experiment with the simulation to identify relevant variables and find out which variables are related. The modelling example for this phase built on this information by demonstrating how variables can be created and linked in a model sketch, and how feedback based on this sketch leads to new simulation experiments, new data, and refinements to the model. Likewise, the simulation example in Phase 2 displayed how students could induce the nature of the relationships in their model from simulation experiments; the modelling example explained how the newly-discovered relationships are incorporated in a qualitative model. In Phase 3, the simulation example demonstrated the reasoning involved in inferring equations, and the modelling example displayed how these equations can be included and tested in a quantitative model.

The worked examples were presented on a website that was only available to students in the experimental condition. All seven worked examples were accessible during the entire experimental session, regardless of the model progression phase a student was in. The names of the worked examples reflected their content (e.g., “Phase 1, simulation”) to inform students as to which worked example was relevant to them at that moment. Participants’ interactions with the website’s movie player that showed the worked example videos (e.g., pressing the play and stop buttons) were stored in a log file.

3.2.3. Knowledge tests

Two tests were used to assess participants’ knowledge of electrical circuits: a prior knowledge test and a posttest containing 8 and 14 items, respectively. In the prior knowledge test, four open-ended questions addressed the meaning of key domain concepts (i.e., voltage source, resistance, capacitor, and capacitance). For example: “State the function of a capacitor in an electrical circuit”. Four additional open-ended questions addressed the physics equations that govern the behaviour of a charging capacitor (i.e., Ohm’s law, Kirchhoff’s laws (the junction rule and the loop rule), and the behaviour of capacitors). For example: “Ohm’s law describes the relationship between voltage, current and resistance. What is the formula for Ohm’s law?”. As performance on the prior knowledge test was expected to be low, three simple filler items on the interpretation of numerical data were added to sustain students’ motivation during the test. These filler items were left out of the analysis.
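For reference (this summary is added here and was not part of the test items), the relations these questions target can be written as:

\[
I = \frac{V}{R} \ \text{(Ohm's law)}, \qquad
\sum_k I_k = 0 \ \text{at a junction}, \quad \sum_k V_k = 0 \ \text{around a loop} \ \text{(Kirchhoff's rules)},
\]
\[
Q = C\,V_C, \qquad V_C(t) = V_{\text{source}}\bigl(1 - e^{-t/RC}\bigr) \ \text{for a capacitor charging through resistance } R.
\]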
The posttest aimed to assess learning outcomes and differed from the prior knowledge test in two respects. First, the eight items from the prior knowledge test were rephrased in modelling terms in order to establish maximum resemblance with the learning task. Second, to ensure that the posttest covered the contents of all three model progression phases, six multiple-choice items were added to gauge students’ qualitative understanding of the task.


For example, an item for Phase 2 displayed a model structure showing that element I is influenced by elements R and Vr (see Fig. 1 for a visual representation), and asked students to select the correct qualitative specification of these relations (e.g., “if resistance increases, then current decreases”).

3.3. Procedure

All participants engaged in three sessions that were originally scheduled over the course of one week. However, due to organizational difficulties, one class (15 students) had a three-week break between the first and second session. Their second session was therefore preceded by an extra 10-minute recapitulation of the first session’s activities. As student assignment to experimental conditions occurred within each classroom, the break could not have resulted in between-condition differences. Analysis of students’ learning and performance scores confirmed the similarity of the classrooms (model quality, variable aspect: F(3, 78) = 1.39, p = .251, ηp² = .06; model quality, relations aspect: F(3, 78) = 2.58, p = .060, ηp² = .09; posttest score: F(3, 73) = 1.04, p = .378, ηp² = .04).

During the introductory session, participants first completed the prior knowledge test. They then received a guided tour of the Co-Lab learning environment, and finally completed a brief tutorial that familiarized them with the system dynamics modelling language and the operation of the modelling tool. During the second session (i.e., the experimental session), students from both conditions worked with the learning environment. Students worked individually on the task and could consult the experimenter only for technical assistance. The session started with a brief reminder for both conditions that the students would work in a learning environment where the assignment was split into phases. The students were told that they could progress through these phases at their own pace, but could not return to a previous phase. They were encouraged to progress through all three phases. Students in the experimental condition were also instructed to access a website where they could watch the worked example videos. Students from the control condition had no access to these videos. After the instructions, students from both conditions worked on the assignment for approximately 90 minutes. They could stop ahead of time if they had completed their assignment. In the final session all students completed the posttest. Students were not told in advance that they were to take this test, to prevent them from preparing for the test in between sessions, and hence to increase the likelihood that posttest scores would represent the knowledge gained during the experiment.

3.4. Coding and scoring

Variables under investigation were collected during the experimental session, to indicate students’ performance in the learning environment, and during the final session, where students’ learning was assessed by the posttest. Students’ performance measures were taken from log files and concerned time on task, experimentation behaviour, navigation patterns, and model quality. Time on task measured the duration of the experimental session, and was differentiated over the three phases to show students’ advancement through these model progression phases. Interactions with the simulation and model tools were analysed to assess students’ experimentation behaviour. The instances where participants clicked the “Start” button in the model editor (model experiment) or simulation (simulation experiment) were retrieved from the log files.
These instances were classified as either “unique” or “duplicated”, depending on whether the experiment had been previously run with the same values.

The number of unique model experiments and unique simulation experiments indicated the comprehensiveness of students’ experimentation behaviour. Additionally, each unique simulation experiment was compared to subsequent experiments to assess whether one or more than one input value had been changed. A CVS score was calculated to indicate the percentage of unconfounded experiments in which only one value was changed.

Students’ use of the three types of tools in the learning environment was assessed to examine possible differences in navigation patterns. All students could engage in activities with the simulation, the model editor, and the data inspection facilities (i.e., bar chart, table, and graph). The sequence in which these tools were accessed was stored in the log files. These data were used to count how often students switched between any set of tools. Based on these frequencies, proportions were calculated to indicate the likelihood of moving from one tool to the other.

Model quality scores were assessed for the participants’ final models by a software agent, as described by Anjewierden (2012), using the rubric from Manlove, Lazonder, and de Jong (2009). The resulting score represents the number of correctly specified variables and relations in the models. “Correctness” was judged in relation to the reference model (see Fig. 1). One point was awarded for each correctly named variable; an additional point was given if that variable was of the correct type (e.g., two points would be awarded for the ‘Charge’ element in Fig. 1 if it was correctly represented as a Stock variable). Concerning relations, one point was awarded for each correct link between two variables. Up to three additional points could be earned if the direction, nature (i.e., a qualitative specification), and magnitude of effect (i.e., a quantitative specification) of the relation were correct. To illustrate, a correct representation of Ohm’s law for the element I in Fig. 1 would be awarded 8 points: 4 points for each of the quantitatively specified Vr–I and R–I relations. The maximum model quality score was 54. The rubric’s inter-rater reliability for variables (Cohen’s κ = .74) and relations (Cohen’s κ = .92) was considered to be sufficient.

Participants’ answers to the prior knowledge test were scored using a rubric that allotted one point for each correct response. The maximum prior knowledge test score was 8. The Cohen’s κ inter-rater reliability of this rubric was assessed by Mulder et al. (2011) and reached .89. To ensure that the present coding would be of similar quality, the same coder scored the prior knowledge tests. She attended a practice session to reacquaint herself with the coding scheme.

Learning outcomes were indicated by students’ posttest scores. A rubric was developed to score participants’ answers to the 14 items, and one point was allotted for each correct response. Separate scores were computed for conceptual items (maximum 4 points), qualitative items (maximum 6 points) and formula items (maximum 4 points). Following a brief training session, two raters used this rubric to score the open-ended questions for a randomly selected set of 20 students: the overall Cohen’s κ inter-rater reliability was .96 (concepts: .90; qualitative: .98; formula: 1.00), with a range from .70 to 1.00 for individual items. The instances where the two raters differed were discussed and clarified prior to the actual coding of all posttests by one of the raters.
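As an illustration of this log-file scoring, the sketch below shows how unique experiments and a CVS score could be derived from logged simulation runs. The data structures and the consecutive-run comparison are simplifying assumptions for illustration, not the actual analysis code of the software agent.

```python
# Sketch of deriving unique-experiment counts and a CVS score from logged
# simulation runs. Each run is the dict of input values at the moment the
# "Start" button was clicked. The data structures are hypothetical.

def unique_experiments(runs: list[dict]) -> list[dict]:
    """Keep only runs whose input values have not been used before."""
    seen, unique = set(), []
    for run in runs:
        key = tuple(sorted(run.items()))
        if key not in seen:                # duplicated runs are ignored
            seen.add(key)
            unique.append(run)
    return unique

def cvs_score(runs: list[dict]) -> float:
    """Percentage of consecutive unique runs that changed exactly one input."""
    unique = unique_experiments(runs)
    if len(unique) < 2:
        return 0.0
    unconfounded = 0
    for prev, cur in zip(unique, unique[1:]):
        changed = sum(1 for k in cur if cur[k] != prev[k])
        if changed == 1:                   # Control-of-Variables Strategy
            unconfounded += 1
    return 100 * unconfounded / (len(unique) - 1)

# Example: three runs, the last two vary only the source voltage.
log = [{"V_source": 6, "R_bulb": 10}, {"V_source": 9, "R_bulb": 10},
       {"V_source": 12, "R_bulb": 10}]
print(len(unique_experiments(log)), cvs_score(log))   # 3 100.0
```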
4. Results

Table 1 summarizes the descriptive statistics for participants’ performance by condition.

Table 1
Summary of participants’ performance.

                                     Worked examples              Control
                                     n     M       SD             n     M       SD
Prior knowledge test score           46    1.48    1.28           36    1.53    1.36
Time on task (min)
  Phase 1                            46    67.45   16.82          36    64.72   21.03
  Phase 2                            33    16.18   16.03          20    16.77   15.67
  Phase 3                            14     4.69    5.79          15    10.11    8.29
  Total                              46    80.48    9.85          36    78.25   14.21
Model quality
  Variables                          46    10.46    3.88          36     7.39    2.98
  Relations                          46     8.63    5.20          36     5.53    5.41
Posttest score                       43     2.84    2.02          34     2.97    1.70
  Concepts                           43     1.50    1.09          34     1.42    0.91
  Qualitative                        43     1.22    1.19          34     1.42    1.11
  Formula                            43     0.11    0.32          34     0.11    0.40
Experimentation behaviour
  Unique model experiments           46    47.85   24.64          36    42.86   38.37
  Unique simulation experiments      46    15.04    9.56          36     9.64    7.05
  Use of CVS (a)                     46    77.00   12.57          36    67.76   24.59

(a) Percentage of unconfounded simulation experiments.

Univariate analysis of variance (ANOVA) revealed no significant differences in prior knowledge between the two conditions, F(1, 80) = 0.03, p = .866, ηp² = .00. Most students from the experimental condition (n = 40) viewed at least one worked example.

For further analysis, it was determined whether the students in the worked example condition who did not view any of the examples were outliers on the dependent variables. Additionally, it was determined that excluding these students from the analyses did not change any of this study’s results. The students who did view the worked examples viewed them 10 times on average (SD = 6.42). Worked-example movies for Phase 1 were viewed most often (M = 6.28, SD = 3.99); the movies for Phase 2 (M = 2.75, SD = 2.33) and Phase 3 (M = 1.08, SD = 1.53) were viewed less often. The students in the worked example condition watched the movies for a total of 13 min on average (SD = 12). A one-sample t-test showed that the mean number of worked example views differed significantly from zero, t(45) = 8.64, p < .001, ηp² = .79, which confirmed the difference in treatment across the two conditions.

Despite this additional effort by students in the worked example condition, the overall time on task displayed in Table 1 was comparable in both conditions, F(1, 80) = 0.70, p = .404, ηp² = .01, as was the time on task per phase (Phase 1: F(1, 80) = 0.43, p = .516, ηp² = .01; Phase 2: F(1, 51) = 0.02, p = .896, ηp² = .01; Phase 3: F(1, 27) = 4.11, p = .053, ηp² = .06). However, the sample size indicators in Table 1 show that 72% of the worked example students progressed from the first to the second phase and 30% reached the third phase. In the control condition, 56% of the students progressed from the first to the second phase and 42% reached the third phase. Binomial tests revealed that the percentage of students who entered Phase 2 was significantly higher in the worked example condition, z = 2.06, p = .037, ηp² = .05, whereas the percentage of students who continued to Phase 3 was higher in the control condition, z = 4.12, p < .001, ηp² = .32.
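As a rough sketch of how such condition comparisons can be computed (the data frame, file name, and column names are hypothetical, and a two-proportion z-test stands in here for the binomial tests reported above):

```python
# Sketch of the between-condition comparisons; data and column names are
# hypothetical, not the authors' actual analysis scripts.
import pandas as pd
from scipy.stats import f_oneway
from statsmodels.stats.proportion import proportions_ztest

df = pd.read_csv("logged_measures.csv")        # hypothetical file, one row per student
we = df[df["condition"] == "worked_examples"]
ctrl = df[df["condition"] == "control"]

# One-way ANOVA on total time on task (two groups, equivalent to a t-test).
F, p = f_oneway(we["time_on_task_total"], ctrl["time_on_task_total"])
print(f"Time on task: F = {F:.2f}, p = {p:.3f}")

# Two-proportion z-test on reaching Phase 2 (standing in for the binomial tests).
count = [(we["max_phase"] >= 2).sum(), (ctrl["max_phase"] >= 2).sum()]
nobs = [len(we), len(ctrl)]
z, p = proportions_ztest(count, nobs)
print(f"Reached Phase 2: z = {z:.2f}, p = {p:.3f}")
```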


This means that students in the worked example condition were more likely to reach the second phase than those in the control condition, who in turn were more likely to continue on to the third phase.

The worked examples demonstrated the heuristic strategies students should apply when choosing and performing their actions. Students’ experiments with the model and simulation were examined to test our hypothesis that the students in the worked example condition would conduct more experiments. Table 1 shows the number of unique experiments performed with each tool. Using Pillai’s trace, a MANOVA produced a significant effect for condition on the number of simulation and model experiments, V = .09, F(2, 79) = 4.11, p = .020. Subsequent univariate ANOVAs, testing Hypothesis 1, revealed that students in the worked example condition performed as many unique model experiments, F(1, 80) = 0.51, p = .477, ηp² = .01, but significantly more unique simulation experiments, F(1, 82) = 8.06, p = .006, ηp² = .09, compared to students from the control condition.

To test Hypothesis 2, students’ ability to set up systematic comparisons (as indicated by the CVS scores in Table 1) was analysed by ANOVA. As not all students performed multiple unique simulation experiments, one student in the worked example condition and five from the control condition were left out of this analysis. Results confirmed that the worked examples positively affected students’ systematic experimentation, F(1, 74) = 4.63, p = .035, ηp² = .06.

Hypothesis 3 concerned the integration of simulation, model, and data-inspection activities. Fig. 3 shows the likelihood of students going from one tool to the other by condition. As these proportions always add up to 1 within each pair, they were analysed by a MANOVA using only one of the proportions for each tool as the dependent variable. Results confirmed that the worked examples influenced students’ navigation patterns, V = .24, F(3, 78) = 8.34, p < .001. Subsequent univariate ANOVAs were performed to test Hypothesis 3. Following the simulation activities, students who had worked examples were more likely to inspect their data (instead of going straight to their model) than the control students, F(1, 80) = 13.57, p < .001, ηp² = .15. The worked examples also significantly influenced the activities following data inspection, F(1, 80) = 11.35, p = .001, ηp² = .12. All students tended to access the model editor after data inspection, but less often so in the worked examples condition. Modelling activities were generally followed by data inspection, and no significant differences between conditions were found for this tool, F(1, 80) = 2.64, p = .108, ηp² = .03.

Hypothesis 4 predicted that the worked examples would enhance students’ model quality. A MANOVA showed a significant effect for condition, V = 0.16, F(2, 79) = 7.65, p = .001. Subsequent univariate ANOVAs revealed significant worked example effects on both the variables aspect, F(1, 80) = 15.39, p < .001, ηp² = .16, and the relations aspect, F(1, 80) = 6.94, p = .010, ηp² = .11.

Hypothesis 5 concerned students’ learning outcomes. A MANOVA on the mean posttest scores in Table 1 revealed no significant difference between the two conditions, V = 0.72, F(3, 78) = 0.34, p = .797.

Fig. 3. Navigation patterns among the tools in the learning environment. Scores represent the likelihood to move from one tool to the other. Data for the worked example condition appears in boldface; data for the control condition in italics.
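The likelihoods in Fig. 3 are essentially first-order transition proportions over the logged sequence of tool accesses. A minimal sketch of that computation (the log format is a simplifying assumption, and repeated accesses to the same tool are assumed to have been collapsed already):

```python
# Sketch of turning a logged tool-access sequence into transition proportions
# of the kind shown in Fig. 3. The log format is hypothetical.
from collections import Counter, defaultdict

def transition_proportions(tool_sequence: list[str]) -> dict:
    """P(next tool | current tool), estimated from consecutive tool switches."""
    counts = Counter(zip(tool_sequence, tool_sequence[1:]))
    totals = defaultdict(int)
    for (src, _), n in counts.items():
        totals[src] += n
    return {(src, dst): n / totals[src] for (src, dst), n in counts.items()}

log = ["simulation", "data", "model", "data", "simulation", "data", "model"]
for (src, dst), p in sorted(transition_proportions(log).items()):
    print(f"{src} -> {dst}: {p:.2f}")
```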

Table 2
Summary of regression analysis for variables predicting posttest scores.

Predictor variable                     B       SE      β
Unique simulation experiments           0.05   0.02     0.23
Relations aspect of model quality      -0.12   0.04    -0.29
For the experimental condition, the relationship between posttest scores and the number of worked examples viewed was examined. The positive correlation, r = .25, p = .056, was nearly significant and supports the presumption that students who viewed the worked examples more often also performed better on the posttest. Not surprisingly, the number of worked examples viewed was strongly related to the total time spent viewing the worked examples, r = .76, p < .001. However, the total time spent viewing these movies was not significantly related to posttest performance, r = .06, p = .305.

An exploratory analysis investigated whether experimentation behaviour, navigation patterns, and model quality contributed to differences in posttest scores. Multiple regression produced a significant model, F(2, 68) = 5.55, p = .006, r = .37, that accounted for 14% of the variance in posttest scores. The unstandardized regression coefficients shown in Table 2 indicate that posttest scores were positively associated with the number of unique simulation experiments, and negatively related to the number of correct relations in the models. The other measures of experimentation behaviour, navigation patterns, and model quality did not make a significant contribution and were therefore excluded from the model by the forward-selection method applied in this exploratory analysis.

5. Discussion

This study compared the performance and learning outcomes of students who either were or were not supported by heuristic worked examples that explained what the activities in each model progression phase entail and how they should be performed. The worked examples were expected to enhance students’ experimentation behaviour (Hypotheses 1 and 2), integration of simulation, model, and data-inspection activities (Hypothesis 3), model quality (Hypothesis 4), and acquisition of domain knowledge (Hypothesis 5). Data indicated that students spent a substantial amount of time viewing the worked examples, suggesting that they appreciated the additional instruction on how to coordinate and perform the inquiry and modelling activities.

Hypothesis 1 was partially confirmed by a between-group comparison of students’ experimentation behaviour. Students from the experimental condition conducted more unique simulation experiments than students from the control condition, suggesting that the worked examples increased coverage of the experiment space (Klahr & Dunbar, 1988). Regression analysis further showed that this measure is positively associated with performance on the posttest. Obviously, the worked examples did not influence the number of unique model experiments. As building a model requires constant refining of the model, it is unlikely that students performed the same model experiment repeatedly. Therefore, each model run was probably a unique model experiment for students in both conditions.

Hypothesis 2 predicted that the worked examples would enhance systematic experimentation with the simulation, which was another indicator of the quality of students’ experimentation behaviour. This hypothesis was confirmed, as the students who had worked examples performed more unconfounded experiments than their control counterparts. However, this measure did not make a significant additional contribution to posttest performance.

One possible explanation could be that systematic experimentation is a prerequisite for effective simulation-based inquiry learning, but is not the only skill needed for knowledge construction to occur. For instance, incorrect data interpretation can completely undo the effects of systematic experimentation.

As predicted by Hypothesis 3, the worked examples influenced students’ integration of simulation, model, and data inspection activities. The worked examples significantly increased the likelihood that students inspected their data after experimenting with the simulation and before going to their model. The students who had worked examples were also more likely to go to the simulation following data inspection. This is, of course, in line with the higher number of simulation experiments, as effective simulation experimenting would require going back and forth between the simulation and the data-interpretation tool. No between-group differences were found with regard to tool use following modelling activities. Students in both conditions usually inspected their data following model activities, but they sometimes consulted the simulation tool instead.

Results for Hypothesis 4 indicate that the worked examples improved not only students’ inquiry behaviour, but also their performance. This conclusion is based on the fact that students in the experimental condition created better models than the control students. Unfortunately, though, Hypothesis 5, which predicted between-group differences in the acquisition of domain knowledge, could not be confirmed. Together these findings suggest that the worked examples enhanced students’ performance during the task as expected, but did not lead to higher learning outcomes, as was substantiated by the results of the regression analysis.

This conclusion is in line with work by Hübner, Nückles, and Renkl (2010), who also found that worked examples affect the learning process but not the immediate learning outcomes. However, Hübner et al. did find a worked example effect on a delayed transfer task, which they explain in light of the strategy-development literature. They argue that unfamiliar strategies do not necessarily enhance learning during the initial stage of usage because the available cognitive capacity must be largely devoted to the application of the strategy. Future research should examine whether this delayed task effect also applies to heuristic worked examples and inquiry learning.

Another explanation could be that students in the present study were unaware that their knowledge of the task would be tested afterwards. It is possible that the students focused only on performing the task well instead of learning about the phenomena. This explanation would account for the negative relation between relevant relations in the model, which is an indication of task performance, and posttest performance, as shown by the regression analysis. Hence, for future research it appears to be important to inform students that their topical knowledge will be assessed after the inquiry and modelling activity.

A third alternative explanation is that a mediation effect caused the lack of effect on the posttest (cf. Mackinnon, 2008). The positive correlation between the number of worked examples viewed and posttest scores supports the presumption that worked examples enhance learning. Consequently, the overall effect of the worked examples on posttest performance might have been cancelled out by a third variable.
As the worked examples were presented in a different domain, students from the experimental group ultimately devoted less time to the learning content in the tested domain. The time spent on viewing the worked examples could therefore be a suppressor mediating variable. However, the nonsignificant correlation between time spent viewing the worked examples and posttest performance indicates that time spent in another learning domain had no detrimental effect on learning.


In the present study, the quality of the final models created by the students and performance on the posttest were quite modest. This could be due to students’ slow advancement through the model progression phases. All students spent most of their time in Phase 1, creating a model outline, leaving little time for specifying the model content in Phases 2 and 3. However, the posttest addressed the contents of all three phases, meaning that relatively many students were tested on subject matter they had not investigated during the session. It could therefore be that the low performance is at least in part caused by time constraints during the task. Similar problems arose during previous studies in which students were not supported by worked examples (Mulder et al., 2011, 2012). It thus seems that the current worked examples did not help students progress through all three phases in the given amount of time.

Future research should examine what is needed for students to advance through all model progression phases. The present findings suggest that students could either be given more time on task, or more appropriate support. Research on the latter option could go in several directions. One possibility would be to replace worked examples with different, more fruitful forms of support. Scaffolding frameworks (e.g., Quintana et al., 2004) provide a good starting point for this line of research, and the review by VanLehn (2013) suggests that such support should focus on the process of model construction, which could take the form of prompts, hints, and feedback. A second option would be to improve the implementation of worked examples. For instance, viewing the worked examples could be made obligatory (cf. Rummel, Spada, & Hauser, 2009); in this study a few students did not watch the examples, which might have reduced the worked example effect. Alternatively, the worked example effect could be enhanced by adding self-explanation prompts or using different example types (e.g., completion problems, process worked examples, partial solutions). For instance, offering a partially completed model outline might solve the problem that students spend too much time on the first part of the task. A third possibility is to optimize the design of heuristic worked examples. Prior work has extensively investigated the design of traditional worked examples, but the design of heuristic worked examples is relatively unexplored. Design principles from traditional worked examples (e.g., adding self-explanation prompts) do not necessarily apply to heuristic worked examples (Renkl et al., 2009). One issue in particular is that prior research leaves it somewhat unclear what domain should be used for the heuristic worked examples. In some studies, the examples were situated in more-or-less the same domain as the actual task (e.g., Hilbert & Renkl, 2009, Experiment 2; Hilbert et al., 2008). However, other studies used different contents for the heuristic examples and the actual task (e.g., Hilbert & Renkl, 2009, Experiment 1), which according to Moreno et al. (2009) increases metacognition during learning. As domain knowledge and inquiry skills are mutually dependent (e.g., Klahr & Dunbar, 1988), the effectiveness of the heuristic examples might depend on the subject matter they contain.

In conclusion, the heuristic worked examples were found to enhance students’ inquiry behaviour and the quality of the models they create, as expected.
However, in this study these performance effects did not yet translate into posttest effects, for which several explanations have been presented. Thus, as used here, heuristic worked examples can be effectively applied to enhance students’ inquiry processes and model quality. For effective support of domain knowledge acquisition, however, continued iterative rounds of design and evaluation are needed, as with any novel application of learning support.


References

Alessi, S. (1995). Dynamic versus static fidelity in a procedural simulation. Paper presented at the annual meeting of the American Educational Research Association, San Francisco.
Alfieri, L., Brooks, P. J., Aldrich, N. J., & Tenenbaum, H. R. (2011). Does discovery-based instruction enhance learning? Journal of Educational Psychology, 103, 1–18. http://dx.doi.org/10.1037/a0021017
Anjewierden, A. (2012). Explorations in fine-grained learning analytics. Unpublished PhD thesis. Enschede, The Netherlands: University of Twente.
Atkinson, R. K., Derry, S. J., Renkl, A., & Wortham, D. (2000). Learning from examples: instructional principles from the worked examples research. Review of Educational Research, 70, 181–214. http://dx.doi.org/10.3102/00346543070002181
Atkinson, R. K., Renkl, A., & Merrill, M. M. (2003). Transitioning from studying examples to solving problems: effects of self-explanation prompts and fading worked-out steps. Journal of Educational Psychology, 95, 774–783. http://dx.doi.org/10.1037/0022-0663.95.4.774
Basu, S., Dickes, A., Kinnebrew, J. S., Sengupta, P., & Biswas, G. (2013). CTSiM: a computational thinking environment for learning science through simulation and modeling. In Proceedings of the 5th International Conference on Computer Supported Education (pp. 369–378). Aachen, Germany.
Chen, Z., & Klahr, D. (1999). All other things being equal: children’s acquisition of the control of variables strategy. Child Development, 70, 1098–1120.
Chi, M. T. H., Bassok, M., Lewis, M. W., Reimann, P., & Glaser, R. (1989). Self-explanations: how students study and use examples in learning to solve problems. Cognitive Science, 13, 145–182. http://dx.doi.org/10.1207/s15516709cog1302_1
Eseryel, D., & Law, V. (2010). Promoting learning in complex systems: effect of question prompts versus system dynamics model progressions as a cognitive-regulation scaffold in a simulation-based inquiry-learning environment. Paper presented at the 9th International Conference of the Learning Sciences, Chicago, IL.
van Gog, T., Paas, F., & van Merriënboer, J. J. G. (2008). Effects of studying sequences of process-oriented and product-oriented worked examples on troubleshooting transfer efficiency. Learning and Instruction, 18, 211–222. http://dx.doi.org/10.1016/j.learninstruc.2007.03.003
Hilbert, T. S., & Renkl, A. (2009). Learning how to use a computer-based concept-mapping tool: self-explaining examples helps. Computers in Human Behavior, 25, 267–274. http://dx.doi.org/10.1016/j.chb.2008.12.006
Hilbert, T. S., Renkl, A., Kessler, S., & Reiss, K. (2008). Learning to prove in geometry: learning from heuristic examples and how it can be supported. Learning and Instruction, 18, 54–65. http://dx.doi.org/10.1016/j.learninstruc.2006.10.008
Hogan, K., & Thomas, D. (2001). Cognitive comparisons of students’ systems modeling in ecology. Journal of Science Education and Technology, 10, 319–345. http://dx.doi.org/10.1023/A:1012243102249
Hübner, S., Nückles, M., & Renkl, A. (2010). Writing learning journals: instructional support to overcome learning-strategy deficits. Learning and Instruction, 20, 18–29. http://dx.doi.org/10.1016/j.learninstruc.2008.12.001
de Jong, T., & van Joolingen, W. R. (1998). Scientific discovery learning with computer simulations of conceptual domains. Review of Educational Research, 68, 179–201. http://dx.doi.org/10.3102/00346543068002179
van Joolingen, W. R., de Jong, T., Lazonder, A. W., Savelsbergh, E. R., & Manlove, S. (2005). Co-Lab: research and development of an online learning environment for collaborative scientific discovery learning. Computers in Human Behavior, 21, 671–688. http://dx.doi.org/10.1016/j.chb.2004.10.039
de Jong, T., Martin, E., Zamarro, J. M., Esquembre, F., Swaak, J., & van Joolingen, W. R. (1999). The integration of computer simulation and learning support: an example from the physics domain of collisions. Journal of Research in Science Teaching, 36, 597–615. http://dx.doi.org/10.1002/(SICI)1098-2736(199905)36:5<597::AID-TEA6>3.0.CO;2-6
Klahr, D., & Dunbar, K. (1988). Dual space search during scientific reasoning. Cognitive Science, 12, 1–48. http://dx.doi.org/10.1207/s15516709cog1201_1
Lewis, D., & Barron, A. (2009). Animated demonstrations: evidence of improved performance efficiency and the worked example effect. Paper presented at the 1st International Conference on Human Centered Design, San Diego, CA.
Lusk, M. M., & Atkinson, R. K. (2007). Animated pedagogical agents: does their degree of embodiment impact learning from static or animated worked examples? Applied Cognitive Psychology, 21, 747–764. http://dx.doi.org/10.1002/acp.1347
Mackinnon, D. P. (2008). Introduction to statistical mediation analysis. Mahwah, NJ: Erlbaum.
Manlove, S., Lazonder, A. W., & de Jong, T. (2009). Trends and issues of regulative support use during inquiry learning: patterns from three studies. Computers in Human Behavior, 25, 795–803. http://dx.doi.org/10.1016/j.chb.2008.07.010
Minner, D. D., Levy, A. J., & Century, J. (2010). Inquiry-based science instruction – what is it and does it matter? Results from a research synthesis years 1984 to 2002. Journal of Research in Science Teaching, 47, 474–496. http://dx.doi.org/10.1002/tea.20347
Moreno, R., Reisslein, M., & Ozogul, G. (2009). Optimizing worked-example instruction in electrical engineering: the role of fading and feedback during problem-solving practice. Journal of Engineering Education, 98(1), 83–92. http://dx.doi.org/10.1002/j.2168-9830.2009.tb01007.x
Mulder, Y. G., Lazonder, A. W., & de Jong, T. (2010). Finding out how they find it out: an empirical analysis of inquiry learners’ need for support. International Journal of Science Education, 32, 2033–2053. http://dx.doi.org/10.1080/09500690903289993
Mulder, Y. G., Lazonder, A. W., & de Jong, T. (2011). Comparing two types of model progression in an inquiry learning environment with modelling facilities. Learning and Instruction, 21, 614–624. http://dx.doi.org/10.1016/j.learninstruc.2011.01.003
Mulder, Y. G., Lazonder, A. W., de Jong, T., Anjewierden, A., & Bollen, L. (2012). Validating and optimizing the effects of model progression in simulation-based inquiry learning. Journal of Science Education and Technology. Advance online publication. http://dx.doi.org/10.1007/s10956-011-9360-x
Quinn, J., & Alessi, S. (1994). The effects of simulation complexity and hypothesis-generation strategy on learning. Journal of Research on Computing in Education, 27, 75–91.
Quintana, C., Reiser, B. J., Davis, E. A., Krajcik, J., Fretz, E., Duncan, R. G., & Soloway, E. (2004). A scaffolding design framework for software to support science inquiry. The Journal of the Learning Sciences, 13, 337–386.
Renkl, A., Hilbert, T., & Schworm, S. (2009). Example-based learning in heuristic domains: a cognitive load theory account. Educational Psychology Review, 21, 67–78. http://dx.doi.org/10.1007/s10648-008-9093-4
Rieber, L. P., & Parmley, M. W. (1995). To teach or not to teach? Comparing the use of computer-based simulations in deductive versus inductive approaches to learning with adults in science. Journal of Educational Computing Research, 13, 359–374. http://dx.doi.org/10.2190/M8VX-68BC-1TU2-B6DV
Roscoe, R. D., Segedy, J. R., Sulcer, B., Jeong, H., & Biswas, G. (2013). Shallow strategy development in a teachable agent environment designed to support self-regulated learning. Computers & Education, 62, 286–297. http://dx.doi.org/10.1016/j.compedu.2012.11.008
Rummel, N., Spada, H., & Hauser, S. (2009). Learning to collaborate while being scripted or by observing a model. International Journal of Computer-Supported Collaborative Learning, 4, 69–92. http://dx.doi.org/10.1007/s11412-008-9054-4
Stratford, S. J., Krajcik, J., & Soloway, E. (1998). Secondary students’ dynamic modeling processes: analyzing, reasoning about, synthesizing, and testing models of stream ecosystems. Journal of Science Education and Technology, 7, 215. http://dx.doi.org/10.1023/A:1021840407112
Swaak, J., van Joolingen, W. R., & de Jong, T. (1998). Supporting simulation-based learning; the effects of model progression and assignments on definitional and intuitive knowledge. Learning and Instruction, 8, 235–252. http://dx.doi.org/10.1016/s0959-4752(98)00018-8
Sweller, J., Ayres, P., & Kalyuga, S. (2011). Cognitive load theory: Explorations in the learning sciences, instructional systems and performance technologies. New York, NY: Springer.
Sweller, J., & Cooper, G. A. (1985). The use of worked examples as a substitute for problem solving in learning algebra. Cognition and Instruction, 2, 59–89.
VanLehn, K. (2013). Model construction as a learning activity: a design space and review. Interactive Learning Environments, 21, 371–413. http://dx.doi.org/10.1080/10494820.2013.803125
White, B. Y., & Frederiksen, J. R. (1990). Causal model progressions as a foundation for intelligent learning environments. Artificial Intelligence, 42, 99–157. http://dx.doi.org/10.1016/0004-3702(90)90095-h
White, B. Y., Shimoda, T. A., & Frederiksen, J. R. (1999). Enabling students to construct theories of collaborative inquiry and reflective learning: computer support for metacognitive development. International Journal of Artificial Intelligence in Education, 10, 151–182.