Learning and Instruction xxx (xxxx) xxx–xxx
Contents lists available at ScienceDirect
Learning and Instruction journal homepage: www.elsevier.com/locate/learninstruc
Self-regulated learning of principle-based concepts: Do students prefer worked examples, faded examples, or problem solving? Nathaniel L. Fostera,b,∗, Katherine A. Rawsonb, John Dunloskyb a b
St. Mary's College of Maryland, 47645 College Drive, St. Mary's City, MD 20686, USA Kent State University, 800 East Summit St., Kent, OH 44240, USA
A R T I C L E I N F O
A B S T R A C T
Keywords: Worked examples Problem solving Self-regulated learning Principle-based concepts
Acquisition of principle-based concepts involves learning how and when to apply a specific principle to different instances of the same problem type. Within this domain, learning is best achieved when practice involves studying worked examples followed by problem solving. When given the choice to use worked examples versus problem solving, how do people regulate their learning? Furthermore, do they use faded examples effectively when given the opportunity during learning? In three experiments, participants learned how to solve probability problems under practice conditions involving either (a) a combined schedule of worked examples, partial examples (Experiments 2 and 3), and problem solving, (b) problem solving only, or (c) self-regulated learning in which participants could choose a worked example, a partial example (Experiments 2 and 3), or problem solving on each trial. Self-regulated learners chose to study worked examples on fewer than 40% of the trials and seldom did so prior to problem solving. However, participants did regulate their learning effectively when they could use partial examples during practice. Participants also demonstrated some sophisticated problem solving, such as by studying worked examples more often after failed versus successful problem-solving attempts.
1. Introduction Many academic domains involve learning principle-based concepts. A principle-based concept (henceforth, “principle” for brevity) typically requires the use of a formula or algorithm for solving a specific type of problem. Once the principle is known, it can be applied to new instances of the same problem type. For instance, if students know the Pythagorean Theorem, they can use it to calculate the length of sides of a right triangle of any size. Principles comprise a core part of foundational knowledge in many academic domains, including physics, engineering, chemistry, math, and computer programming. Because acquisition of principles depends in part on students' learning and practice on their own outside of class, students’ success will rely on how well they regulate their learning of these principles. However, little is known about how effectively students use strategies—such as studying worked examples and solving problems—when learning the principles. Accordingly, in the current research, we investigated how students use different strategies to learn principles in a math domain. When learning principles, two primary strategies available to students include (a) studying an example problem in which each step is worked out and presented alongside the solution (referred to as a worked example, Sweller & Cooper, 1985; Cooper & Sweller, 1987;
∗
Sweller, 1988) and (b) attempting to solve a problem from start to finish with no support (referred to as problem solving). As described further below, these two strategies are differentially effective, and hence investigating how students control their use of them has important implications for enhancing student learning. This issue may be particularly important due to the growing use of automated tutors for instruction in educational settings, particularly systems that require self-regulation. For example, learning of principles in the Assessment and Learning in Knowledge Spaces (ALEKS) system, used widely to support learning of mathematics for grades K-12 and college-level courses (http://www.aleks.com), is almost entirely student regulated. ALEKS does not prescribe how students should learn different principles, but merely provides students with two choices during each learning trial for a given principle: study a worked example or try to solve the problem alone. Thus, the efficacy of such learning technologies may be improved by investigating how students control their learning of principles. Moreover, if students do not use the strategies effectively, they may require training or strategy scaffolds to regulate their studying effectively, regardless of whether they must regulate all of their learning or are being supported by technology (e.g., see Greene, Dellinger, Tüysüzoğlu, & Costa, 2013; Renkl, Berthold, Groβe, & Schwonke, 2013).
Corresponding author. Department of Psychology, St. Mary's College of Maryland, 47645 College Dr., St. Mary's City, MD 20686, USA. E-mail address:
[email protected] (N.L. Foster).
http://dx.doi.org/10.1016/j.learninstruc.2017.10.002 Received 16 December 2016; Received in revised form 13 October 2017; Accepted 16 October 2017 0959-4752/ © 2017 Elsevier Ltd. All rights reserved.
Please cite this article as: Foster, N.L., Learning and Instruction (2017), http://dx.doi.org/10.1016/j.learninstruc.2017.10.002
Learning and Instruction xxx (xxxx) xxx–xxx
N.L. Foster et al.
greater benefits in learning compared to when they only solve problems (e.g., Carroll, 1994; Cooper & Sweller, 1987; Kalyuga & Sweller, 2004; Mwangi & Sweller, 1998; Retnowati, Ayres, & Sweller, 2010; Rourke & Sweller, 2009; Sweller & Cooper, 1985; Sweller, 1988; Sweller, Chandler, Tierney, & Cooper, 1990; Ward & Sweller, 1990). The benefit of WEPS schedules (in which one studies a worked example of a problem and then solves a new one) is greater than solving a problem first and then reviewing a worked example (Leppink, Paas, Van Gog, Van der Vleuten, & Van Merriënboer, 2014; Van Gog, Kester, & Paas, 2011). Furthermore, WEPS schedules that involve a technique known as fading—which involves transitioning from fully worked examples to presenting part of a worked example and having the participant solve the rest—often produce better performance as compared to a standard WEPS schedule (Atkinson, Derry, Renkl, & Wortham, 2000; Renkl et al., 2004; Renkl et al., 2002; but see; Reisslein, Atkinson, Seeling, & Reisslein, 2006). Given that knowledge acquisition is gradual, the intermediate scaffolding provided by faded examples supports the transition from schema acquisition to schema application. In summary, the key aspect of the practice schedule that is normatively most effective for novice learners is the scheduling of worked examples prior to problem solving. In the present studies that investigate problem solving by novices, what kind of study schedule will students use? Possible answers to this question cannot be derived from theory of object-level processing (e.g., cognitive load theory) because the monitoring-control relationship that involves the meta-level is relevant. According to the monitoring-affectscontrol hypothesis (Nelson & Leonesio, 1988), learners monitor their progress and control subsequent learning by focusing more study on items judged as least-well learned (vs. more-well learned). This hypothesis has been confirmed many times in contexts where students are attempting to memorize simple stimuli (for reviews, see citations; Metcalfe & Kornell, 2005; Dunlosky & Ariel, 2011). Moreover, students view self testing—which is akin to simulating the criterion test—as a means to monitor their progress (e.g., Hartwig & Dunlosky, 2012; Kornell & Bjork, 2007). For instance, when given unlimited time to master a list of words for an upcoming criterion test, students will try to recall the words from memory (i.e., without looking at the list), so as to decide whether they are ready to take the criterion test or need to study more (e.g., Murphy, Schmitt, Caruso, & Sanders, 1987). In the present context where students are learning to solve problems, they have the option to either solve a problem or study a worked example. In the context of the SRL framework, solving a problem represents a means to monitor using a self test, which can then inform a decision about whether studying a worked example is needed. This distinction leads to the following two expectations. First, students will typically begin studying by solving a problem, so as to monitor how well they can already solve it. The expectation here is that many students will not begin with a worked example, which is normatively most effective (as expected from cognitive load theory). Second, the monitoring-affects-control hypothesis predicts what students will do after attempting to solve a problem. If they incorrectly solve a particular kind of probability problem (e.g., one where the order of events is relevant), they will be more likely to study a worked example than to solve another problem of the same kind. By contrast, if they correctly solve a problem, they will be more likely to stop studying that particular kind of problem than to study a worked example or attempt to solve another problem of that kind.
Fig. 1. Meta-level and object-level form a dominance relation that gives rise to monitoring and control processes. The meta-level includes a model of the on-gong task. See text for details. Adapted from Nelson and Narens (1990).
1.1. Frameworks of self-regulated learning In the present studies, a self-regulated learning (SRL) framework will be used to develop expectations for how students will use worked examples and problem solving when learning principles. Cognitive frameworks of SRL are founded on the relationship between metacognitive monitoring and control processes (e.g., Dunlosky & Ariel, 2011; Winne & Hadwin, 1998). As discussed in detail by Nelson and Narens (1990; 1994), monitoring and control processes arise from the interplay between a meta-level and an object-level system (Nelson & Narens, 1990). The object level system refers to the underlying cognitive processes and structures, such as perceiving and interpreting stimuli within a limited-capacity memory system. As illustrated in Fig. 1, the metalevel system consists of cognitions about the object-level system that pertain either to monitoring information from the object-level or attempting to control on-going object-level processing. One assumption of this framework is that monitoring is used to infer the current level of progress so as to control object-level processing. In the context of the current studies, a student may monitor that a particular problem is taking a long time to solve and infer that she is not making sufficient progress; if so, she may control subsequent processing by changing strategies, such as by studying a worked example instead of attempting to solve another problem. Hypotheses about how students will self regulate their learning are generated by an understanding of the meta-level monitoring and control processes, whereas hypotheses about how different factors will influence the object-level are generated from theories of the underlying cognitive processes and their interactions. Accordingly, before considering hypotheses about how students control their learning, we briefly discuss a framework of object-level processing, the cognitive load theory (CLT), which is a framework commonly used in the literature on learning principle-based concepts (e.g., Paas, Renkl, & Sweller 2003; Paas, Van Gog, & Sweller, 2010; Sweller, 1988, 1994). CLT assumes students have limited cognitive resources that are differentially expended on processing that either promotes or hinders learning (referred to as germane load versus extraneous load, respectively) depending on various factors such as prior knowledge or task complexity. For novices, learning principles ideally involves a transition from schema acquisition to schema application. Worked examples are beneficial during the initial stage of schema acquisition because learners can focus their limited cognitive resources on understanding the principle. If students attempt to apply the principle before a schema is fully developed, they often adopt suboptimal problem solving methods (e.g., a means-end analysis strategy) and schema acquisition suffers. However, once a schema has been sufficiently developed, worked examples become redundant with information already learned and limited resources can be better spent on practicing schema application via problem-solving to increase learning outcomes (Kalyuga, 2007; Kissane, Kalyuga, Chandler, & Sweller, 2008; Renkl, Atkinson, & Große, 2004; Renkl, Atkinson, Maier, & Staley, 2002). Consistent with these theoretical assumptions, when novices begin by studying worked examples prior to problem solving (Worked Examples followed by Problem Solving, or a WEPS schedule), they have
1.2. Overview of current experiments Given the lack of research relevant to students’ use of problem solving and worked examples while regulating their learning, the purpose of the current study was to examine how learners control their example-based learning of principles. To this end, we conducted three experiments in which novices learned how to solve two types of probability problems. In Experiment 1, after an initial pre-knowledge 2
Learning and Instruction xxx (xxxx) xxx–xxx
N.L. Foster et al.
than half of the practice trials, indicating insufficient compliance with task instructions. Thus, the final sample included 102 participants (PS = 34, WEPS = 33, SRL = 35). A power analysis conducted using G*Power 3.1.9.2 (Faul, Erdfelder, Lang, & Buchner, 2007) for a one-way ANOVA with power set at 0.80 and α = 0.05 indicated that this sample size afforded sufficient sensitivity to detect medium-size effects (f ≥ 0.31).
assessment, participants received brief instruction on how to solve the two types of problems and then practiced 12 probability problems. Most important for our present aims was the self-regulated learning (SRL) group. Participants in this group were given a choice on each trial either to study a worked example, in which each step of the problem including the solution was presented to them, or to engage in problem solving, in which only the problem question was presented for the participant to solve on their own. Of main interest was whether participants in the SRL group would use one or both of these strategies, and if they did use both, how they coordinated their use while learning the principles. Of secondary interest, we also included two other comparison groups. In the WEPS group, participants received a worked example for the first problem and then engaged in problem solving for the second problem. This alternating pattern continued for the 10 remaining problems. In the problem-solving (PS) group, participants solved all 12 problems. After practice, all groups completed a final test of near transfer involving questions with the same deep-structure features but different surface features than the practice problems. In Experiments 2 and 3, we included replication groups plus two extension groups. These groups were used to investigate students’ use of faded examples as they solve problems. In the faded WEPS group, participants transitioned from studying worked examples to solving partially completed problems (which we refer to as partial formats hereafter for brevity) to full problem solving. In the faded SRL group, participants could choose between studying a worked example, solving a partial format, or fully solving a problem on each trial. In this latter group, a key question is: Will students implement a faded practice schedule (i.e., transitioning from worked examples to partial formats to problem solving) as they practice the problems? Outcomes of primary interest in all three experiments concern the control decisions that students make during self-regulated learning. In the introduction to each experiment, we briefly review the focal hypotheses being tested.
2.1.2. Materials Materials consisted of a brief instructional text and three sets of probability items (for pretest, practice, and final test; examples of problems are listed in Appendix A). The instructional text explained the difference between relevant-order and irrelevant-order probability problems by presenting concrete examples and describing how to solve each kind of problem. More specifically, participants were told that correctly solving a relevant-order problem involves recognizing that there is only one order in which events described in the problem can unfold (e.g., What is the probability that Sally draws a king of hearts first, puts it back, and then draws a jack?). An irrelevant-order problem allows two possible orders to be considered (e.g., What is the probability that Sally draws a king of hearts first, and then a jack, OR a jack first, and then a king of hearts, all with replacement?). The pretest included the same 13 items that Groβe and Renkl (2007) used for their pretest of probability knowledge. The pretest items varied in difficulty and covered a range of probability principles, with only two items that required knowledge of how to solve relevant versus irrelevant order problems that were the focus of the main experimental materials. Because pretest items tapped a broader range of probability principles and different question formats than during practice and on the final test, the pretest scores are not directly comparable to practice and final test scores. For this reason, we refrain from making any assertions about changes in performance from pretest to final test. The practice phase included six items for each of the two different types of probability principles (relevant order problem and irrelevant order problem). The final test consisted of eight new probability items, four requiring the relevant order problem and four requiring the irrelevant order problem. Two items in the practice phase and two items in the final test phase were from Groβe and Renkl's (2007) materials. We constructed the remaining items used in the practice and final test phases.
2. Experiment 1 Based on the SRL framework described above, we expected that participants would use both the worked example and the problem-solving formats. The first expectation is that participants will typically begin by attempting to solve a problem. And, a key prediction from the monitoringaffects-control hypothesis is that participants will be more likely to study a worked example after they incorrectly solve a practice problem than when they correctly solve one. Finally, although of secondary importance, based on CLT and prior empirical outcomes, we predicted greater final test performance for the WEPS group than for the PS group.
2.1.3. Procedure Participants first completed the pretest, with items presented in the same order for each participant. Each item was presented on a computer screen individually, and participants were permitted to use a paper, pencil, and calculator to solve the problems. Responses were entered into a text field on the computer. Participants were allowed a maximum of 13 min to complete the pretest. After the pretest, participants were required to spend a minimum of 5 min studying the two pages of the instructional text, which described the steps of solving relevant and irrelevant order probability problems. Participants could toggle back and forth between pages to freely navigate the instructional text. After 5 min had elapsed, a “Proceed” button was revealed that allowed participants to exit the instructional text when they were finished studying. Next, participants were asked to rate how well they understood the instructions for how to solve the two problem types on a scale from 1 (“I did not understand the instructions at all”) to 5 (“I completely understood the instructions”). Overall, mean ratings were on the high end of the scale (M = 4.18, SD = 0.75), suggesting that participants understood the instructions. Completion of each trial within the practice phase was self-paced, but we gave participants a time limit of 40 min overall to complete all 12 trials. Items were presented on the screen one at a time, and the format of the items differed depending on group assignment. Panels A and B of Fig. 2 show problems presented in the worked example and problem solving formats, respectively. For the PS group, all items
2.1. Method 2.1.1. Participants and design Our sample was selected from the Kent State University participant pool, which has a mean age of 19.58, is 80% female, and 51% of which are first-year students. One hundred twenty undergraduates from Kent State University participated in exchange for partial course credit. Participants were randomly assigned to three groups, the problem solving (PS) group, the worked example plus problem solving (WEPS) group, and the self-regulated learning (SRL) group. Because we were primarily interested in the selection behavior of novices, we excluded 15 participants with scores greater than 60% correct on the pretest.1 We also excluded three participants who spent fewer than 10 s on more 1 We were interested in novices because much of the research on principle-based concept learning has focused exclusively on this subset of learners (e.g., Fyfe, DeCaro, & Rittle-Johnson, 2015; Durkin & Rittle-Johnson, 2012; Sweller et al., 1990; Schwonke et al., 2009; Van Gog & Kester, 2012; Van Gog, Paas, & van Merriënboer, 2006).
3
Learning and Instruction xxx (xxxx) xxx–xxx
N.L. Foster et al.
numbered trials) to a problem to solve (on even-numbered trials). Worked example formats involved presenting the question prompt along with the worked-out steps required to solve that problem and the answer. After studying the worked example, participants clicked a button at the bottom of the screen labeled “Move on to the next problem”. For the SRL group, each item was preceded by a screen that described both the problem-solving and the worked-example formats. Participants clicked on a button marked “PS” or “WE” to select the format for the subsequent trial. Worked example and problem-solving formats in the SRL group were administered the same way as in the other two groups. Practice trials were arranged into two blocks, one including the six relevant order items and the other including the six irrelevant order items. The order of trials in each block was randomized by the computer for each participant, and we counterbalanced across participants whether the relevant-order block or the irrelevant-order block was completed first. After the practice phase, participants moved on immediately to the final test. The final test items were presented one at a time in random order. Items appeared in a problem-solving format only and no feedback was given. Participants had a maximum of 16 min to complete the final test. 2.2. Results and discussion To examine how participants in the SRL group chose to control their learning of principles, we first report the selection behavior for the SRL group. For completeness, we then report accuracy results (proportion correct) for each of the three groups in each of the three experiment phases. All Cohen's d values reported below were computed using pooled standard deviations (Cortina & Nouri, 2000). Cohen's d values of 0.2, 0.5, and 0.8 are considered small, medium, and large effect sizes, respectively (Cohen, 1988). 2.2.1. Selection behavior in SRL group Of primary interest, we first analyzed selection behavior during the practice phase. For each participant, we calculated the percentage of trials on which each of the two trial formats was selected. Mean percentage selection of worked examples versus problem solving is plotted in the far left panel of Fig. 3. Overall, participants selected worked examples on 27% of trials, around half of what the WEPS group experienced. Fig. 4 reports the mean selection proportions for the two formats across serial position in the practice phase. Consistent with the expectations from the SRL framework, participants preferred to start the practice phase with an attempt to solve a problem with no support and selected worked examples for the first trial in the practice phase only 31% of the time. In fact, the problem-solving format was preferred on all trials except for the second trial, for which there was a slight preference for worked example formats. The finding that more participants selected a worked example for Trial 2 may be due to participants’ sensitivity to their low performance on Trial 1. Indeed, consistent with the prediction from the monitoringaffects-control hypothesis, of the participants who selected problem solving for Trial 1 and answered incorrectly, 85% chose to view a worked example on Trial 2. Of the participants who selected problem solving on Trial 1 and answered correctly, 0% chose a worked example on Trial 2. This finding suggests that participants conditionally selected worked examples for Trial 2 when a problem was incorrectly solved on the preceding trial. We also examined how likely participants were to choose a worked example after having answered incorrectly versus correctly on a prior problem-solving trial across the entire practice phase. As reported in Table 3, the probability of selecting a worked example on any given trial when the preceding trial was both problem solving and incorrect was significantly greater compared to the probability of selecting a worked example when the preceding trial was both problem solving and correct, t(26) = 6.34, p < 0.001, d = 1.58.
Fig. 2. Illustration of the three formats used to present practice problems. Panel A: worked example format; Panel B: problem solving format; and Panel C: partial format (used in Experiments 2 & 3 only).
included only the question prompt and participants were required to solve for and then enter the correct response. After participants submitted a response on a given item, the computer presented them with feedback. Correct responses were simply described as being “Correct!”, whereas incorrect responses involved presenting a brief tutorial screen that described (a) whether the preceding item was a relevant order or irrelevant order item, and (b) how to solve that type of item in general. No specific feedback (i.e., the answer or the worked-example steps) was provided for incorrect responses. For the WEPS group, the format of items in the practice phase alternated from a worked example (on odd4
Learning and Instruction xxx (xxxx) xxx–xxx
N.L. Foster et al.
Fig. 3. The mean selection percentages of worked examples (WE) and problem solving (PS) in the SRL groups (Experiments 1, 2, and 3), and of worked examples, problem solving, and partial formats in the faded SRL groups (Experiments 2 and 3).
Fig. 4. The mean selection percentages of worked examples and problem solving in the SRL group across serial position during the practice phase (Experiment 1).
2.2.2. Pretest and practice phase accuracy Before presenting our secondary measure of final test accuracy of probability problems, we present the problem-solving accuracy during the pretest and practice phase. We report the pretest means in the leftmost column of Table 1. To test whether groups were equated on pre-knowledge of probability, we conducted a one-way analysis of variance (ANOVA) on pretest scores, which revealed a significant effect of group, F(2,99) = 4.55, MSE = 0.05, p = 0.013, η2 = .082. A HolmBonferroni correction (Gaetano, 2013; Holm, 1979) was made for the individual group comparisons (when we note the use of this correction, all subsequent p-values within the section have been corrected), which revealed a significant difference in pretest scores favoring the WEPS over the PS group (p = 0.01), but no significant differences between the
Table 1 Pretest accuracy across groups in Experiments 1, 2, and 3. Standard error of the mean is presented in parentheses.
PS WEPS SRL Faded Faded SRL
Experiment 1
Experiment 2
Experiment 3
0.36 (0.02) 0.41 (0.02) 0.33 (0.02) – –
0.35 0.37 0.35 0.41 0.36
– 0.38 0.38 0.40 0.40
(0.03) (0.02) (0.02) (0.02) (0.02)
(0.02) (0.02) (0.02) (0.02)
WEPS and SRL groups (p = 0.07) or between the PS and SRL groups (p = 0.40). Practice phase accuracy is reported in Table 2. Holm-Bonferroni corrections tests indicated that accuracy on problem solving trials was significantly greater for the WEPS group than for both the PS group (p = 0.003) and the SRL group (p = 0.006). No significant difference occurred between the PS group and SRL group (p = 0.56).
2 As Shadish, Cook, and Campbell (2002) note, “It is not a ‘failure of randomization’ if some observed means are significantly different across conditions at pretest after random assignment. Indeed, from sampling theory we expect that the fewer the number of participants assigned to conditions, the larger are the differences in pretest means that may be found … as long as randomization was implemented properly, any pretest differences that occur will always be due to chance” (p. 303).
5
Learning and Instruction xxx (xxxx) xxx–xxx
N.L. Foster et al.
3. Experiment 2
Table 2 Accuracy on problem-solving and partial-format trials across groups and phases in Experiments 1, 2, and 3. Standard error of the mean is presented in parentheses. Practice
In Experiment 1, when participants were given control of the format for practice on each trial, they did not effectively tailor their schedule of learning. The purpose of Experiment 2 was (1) to replicate the focal outcomes of Experiment 1, given recent emphasis in the field on the importance of replicating novel findings (Lishner, 2015; Maner, 2014; Schmidt, 2009; Simons, 2014) and (2) to include an extension designed to further examine participants’ control decisions during example-based learning of principles. To this end, we included the same three groups from Experiment 1 plus two additional groups: a faded WEPS group and a faded SRL group. In the faded WEPS group, the first quarter of the trials for each problem type were fully worked examples, the next quarter were partial formats in which all steps were listed except for the final solution step (as per Renkl et al., 2002), and the final half were problem-solving formats. On each trial in the faded SRL group, participants were given the option to study a full worked example, a partial format, or do problem solving. One reason we investigated participants’ use of faded examples is that they have been shown to improve learning (as compared to using worked examples and problem solving); even so, few studies have examined the benefits of faded examples (e.g., Kissane et al., 2008; Renkl et al., 2004; Renkl et al., 2002; Schmidt-Weigand, Hanze, & Wodzinski, 2009), so the present experiments also serve to further establish the possible benefits of using faded examples. Based on CLT and prior empirical outcomes, an a priori prediction is that final test performance will be greater for the faded WEPS group than for the WEPS group. The more important reason for including faded examples—in which students may benefit from exposure to partial formats—was to explore how participants use this format when regulating their learning. Relevant to expectations from the SRL framework, partial formats also allow participants to monitor how well they can solve a problem. In contrast to the full worked example, having partial formats available for selection provides participants the option to use a scaffold but still actively participate in the generation of the solution. Thus, one possibility is that participants will prefer the partial format over the full problem-solving format. Consistent with this possibility, Van Merriënboer, Schuurman, de Croock, and Paas (2002, Experiment 1) had novice computer programmers read about and practice unfamiliar programming problems for 180 min. During this time, participants compiled, tested, and debugged blocks of broken code. Participants in the learner-controlled group were allowed to switch between two problem formats: a problem-solving format and a partially completed format for each block. The problem-solving format presented the problem and then required participants to implement the appropriate programming steps to compute the correct answer without any support. The partially completed format presented some of the programming steps for that problem, and participants were required to finish solving the problem using these provided steps. The learner-controlled group showed a preference to select partially completed formats compared to problem-solving formats. These results suggest that in the present
Final Test
Problem Solving
Partial Format
Experiment 1 PS WEPS SRL
0.25 (0.04) 0.47 (0.05) 0.28 (0.04)
– – –
0.29 (0.05) 0.45 (0.05) 0.32 (0.04)
Experiment 2 PS WEPS SRL Faded WEPS Faded SRL
0.33 0.40 0.37 0.47 0.47
(0.05) (0.04) (0.03) (0.04) (0.06)
– – – 0.67 (0.04) 0.58 (0.06)
0.36 0.39 0.37 0.40 0.46
(0.06) (0.05) (0.04) (0.05) (0.05)
Experiment 3 WEPS SRL Faded WEPS Faded SRL
0.42 0.37 0.59 0.47
(0.04) (0.05) (0.05) (0.06)
– – 0.61 (0.06) 0.60 (0.06)
0.32 0.39 0.49 0.48
(0.05) (0.04) (0.05) (0.05)
2.2.3. Final test accuracy Final test accuracy is reported in Table 2. As noted in our introduction to Experiment 1, we used a planned comparison (one-tailed) for evaluating final test performance of the WEPS versus the PS group (as per the a priori prediction from CLT). Final test performance was significantly greater for the WEPS than PS group, t(65) = 2.34, p = 0.01 (one-tailed), d = 0.57, replicating the modal finding in the example-based learning literature. Given that final test performance was not the focus of the present research, we did not include all the possible exploratory comparisons in the text. However, given that these comparisons may be relevant to readers interested in conducting metaanalyses, we include the outcomes from the inferential tests and effects sizes for these comparisons (for all experiments) in Table B1 (Appendix B).
2.2.4. Summary To summarize, the SRL group used worked examples on a relatively small proportion of trials, indicating that these participants did not give themselves much opportunity to learn from worked examples. Furthermore, most participants did not select a worked example on the first trial, which is when worked examples are particularly advantageous. Consistent with expectations, participants were more likely to switch to worked examples after they had incorrectly (versus correctly) solved a problem on a previous practice trial. Thus, participants showed a preference for worked examples that depended on (1) selection of problem solving on the preceding trial, and (2) incorrect solution to the preceding problem solving trial. In most other cases, problem solving was preferred.
Table 3 Conditional selection probabilities for SRL and Faded SRL groups in Experiments 1, 2, and 3. Experiment
Trial n Selection
Trial n-1 Selection and Accuracy Problem Solving (SRL Group)
1 2 3
Worked Example Worked Example Partial Worked Example Partial
Problem Solving (Faded SRL Group)
Partial Format (Faded SRL Group)
Incorrect
Correct
Incorrect
Correct
Incorrect
Correct
0.45 0.50 – 0.37 –
0.08 0.07 – 0.04 –
– 0.15 0.34 0.09 0.30
– 0.03 0.04 0.04 0.07
– 0.31 0.52 0.20 0.57
– 0.02 0.66 0.05 0.63
6
Learning and Instruction xxx (xxxx) xxx–xxx
N.L. Foster et al.
Fig. 5. The mean selection percentages of worked examples and problem solving in the SRL group across serial position during the practice phase (Experiment 2).
Panel C of Fig. 2 provides an example problem presented in the partial format. Another critical difference in procedure pertains to the order and format of items presented to the faded WEPS and faded SRL groups during the practice phase. In the faded WEPS group, the first three items for the first problem type were presented in a worked-example format, the next three in a partial format, and the following six in a problem-solving format. The same order of formats was used for the next block of 12 items for the other problem type. In contrast, on each trial in the faded SRL group, participants were given the option to select a worked example, a partial format, or problem solving.
context participants may prefer to use the partial format more than the full problem-solving format. Note, however, that the methods used by Van Merriënboer et al. (2002) were different than those used in the present studies in many ways, including the problems used and perhaps most notably that participants in the prior study did not have the option to study a worked example. Thus, whether participants will prefer using the partial format or the full problem-solving format in the present context remains an open question. As in Experiment 1, we did expect participants to prefer beginning with formats that allowed monitoring (partial and full problem solving formats) than with the worked example. And, as per the monitoring-affects-control hypothesis, participants were expected to be more likely to study a worked example after they incorrectly solved a problem than when they correctly solved one.
3.2. Results and discussion 3.2.1. Selection behavior in SRL and faded SRL groups To understand how the SRL and faded SRL groups controlled their learning during the practice phase, consider the middle panel of Fig. 3, which presents (a) the proportion of trials selected as worked examples versus problem solving in the SRL group and (b) the proportion of trials that were selected as worked examples versus partial format versus problem solving in the faded SRL group. In the SRL group, participants selected worked examples on 33% of all trials. Thus, consistent with expectations from the SRL framework and replicating outcomes for the SRL group in Experiment 1, participants selected problem solving for the majority of trials. When participants were given a partial format option in the faded SRL group, the format that received the greatest percentage of selections was still problem solving (45%). These participants selected fewer worked examples (16%) than the SRL group, instead favoring the partial option, which was selected on 40% of the trials. To evaluate the timing of format selection for both SRL groups, we calculated the mean selection proportions for the different formats across serial positions (see Figs. 5 and 6). Participants in the SRL group selected worked examples 18% of the time for the first trial. As shown in Fig. 5, this pattern continued across all serial positions except for the second trial, which replicated the general serial position findings in the SRL group from Experiment 1 (see Fig. 4). Fig. 6 illustrates the selection behavior for the faded SRL group. Across the first 11 serial positions, participants overwhelmingly showed a preference for problem solving and for partial formats compared to worked examples. The majority of participants started the practice phase with problem solving, then switched between partial formats and problem solving until serial position 11. At this point onward, participants primarily selected problem solving.
3.1. Method 3.1.1. Participants and design Two hundred twenty-two undergraduates from Kent State University participated in exchange for partial course credit. Participants were randomly assigned to five groups (PS, WEPS, SRL, faded WEPS, and faded SRL). We excluded 21 participants who scored higher than 60% on the pretest and an additional 19 who spent fewer than 10 s on more than half of the pretest, practice, or final test trials (indicating non-compliance with task instructions). The remaining sample included 182 participants for analysis, with N = 30 for the PS group, N = 38 for the WEPS group, N = 38 for the SRL group, N = 40 for the faded WEPS group, and N = 36 for the faded SRL group. 3.1.2. Materials Materials were the same as in Experiment 1 except that we added 12 more items to the practice phase (six additional relevant-order items and six additional irrelevant-order items). We also added two more items to the final test (one of each order type). 3.1.3. Procedure The procedure was similar to Experiment 1 except that participants had a maximum of 50 min to complete the practice phase and 20 min to complete the final test phase (due to the increase in the number of practice and final test trials). We also included partial formats for certain practice-phase trials in the faded WEPS and faded SRL groups. Items appearing in a partial format included the problem question along with each worked-out step except for the final solution step. 7
Learning and Instruction xxx (xxxx) xxx–xxx
N.L. Foster et al.
Fig. 6. The mean selection percentages of worked examples, problem solving, and partial formats in the faded SRL group across serial position during the practice phase (Experiment 2).
3.2.2. Pretest and practice phase accuracy Before presenting our secondary measure of final test accuracy of probability problems, we first consider test performance for pretest and practice phase. A one-way ANOVA for pretest performance (see Table 1) revealed no significant effect of group, F(4,177) = 1.31, MSE = 0.017, p = 0.27, η2 = 0.03. Practice phase accuracy is reported in Table 2. Because the practice phase data for the faded WEPS and faded SRL groups included both problem solving and partial trials, we first compare all groups on accuracy for problem solving trials. Holm-Bonferroni corrections tests indicated no significant differences between groups on problem solving (ps > 0.30). Similarly, performance on the partial formats in the faded WEPS group was not significantly different than the faded SRL group, t (70) = 1.28, p = 0.20, d = 0.30.
Despite an overall preference for problem solving in the SRL group, participants may have preferentially selected worked examples on trials following incorrect problem solving. To investigate whether selections were related to the success of prior problem solving for the SRL group, we first calculated the probability of selecting a worked example on Trial 2 after answering an incorrect problem solving format on Trial 1. Of the participants in the SRL group who selected problem solving for Trial 1 and answered incorrectly, 88% selected a worked example for Trial 2. Of the participants who selected problem solving for Trial 1 and answered correctly, 40% selected a worked example for Trial 2. Thus, participants preferred worked examples more when they missed a prior problem solving trial compared to when they got it correct. We also examined how likely they were to choose a worked example after incorrect versus correct problem solving across the entire practice phase (see Table 3). Across all 24 trials of the practice phase, the probability of selecting a worked example following incorrect problem solving was significantly greater than the probability of selecting a worked example following correct problem solving, t(36) = 9.42, p < 0.001, d = 1.59. We also conducted a parallel set of conditional analyses for the faded SRL group, which are presented in Table 3. Following incorrect problem solving across the practice phase, participants were significantly more likely to select partial formats compared to worked examples, t(29) = 2.26, p < 0.05, d = 0.65. When prior problem solving was correct, selection rates of partial format and worked examples dropped to near zero and were not significantly different from each other, t(23) = 0.42, p = 0.69, d = 0.13. Thus, outcomes from the SRL group and the faded SRL group confirm the prediction from the monitoring-affects-control hypothesis. Next, we examined selection behavior following incorrect versus correct partial formats. If a participant incorrectly completes a partial format problem, they may be most likely to select a worked example next so they can study how to compute probabilities. Contrary to the prediction from the monitoring-affects-control hypothesis (see Table 3), following incorrect partial formats, participants selected partial formats more frequently than worked examples, t(25) = 1.94, p = 0.06, d = 0.67. Following correct partial formats, selection of worked examples occurred significantly less often compared to selection of another partial format, t(27) = 8.92, p < 0.001, d = 2.47. One reason why participants may have preferred solving partially worked examples after failing to solve a partial problem (as compared to worked examples) is that they believe this format is as effective for learning as studying a worked example. If so, participants would still be able to monitor their progress yet also would expect doing so to improve their learning. This possibility is explored in Experiment 3 using a post-experimental survey.
3.2.3. Final test accuracy Final test accuracy is reported in Table 2. We first conducted the a priori planned comparisons (one-tailed) based on CLT predictions and prior outcomes. In contrast to Experiment 1, final test performance did not significantly differ between the WEPS and PS groups, t(66) = 0.34, p = 0.37, d = 0.08. Final test performance did not differ significantly between the faded WEPS and WEPS group, t(76) = 0.08, p = 0.47, d = 0.02. For interested readers, exploratory comparisons among other groups are presented in Appendix B. 3.2.4. Summary Replicating Experiment 1, the SRL group did not fully capitalize on the benefits of selecting worked examples, instead preferring problem solving. By contrast, the faded SRL group interleaved problem solving with partial formats, suggesting that when participants elected to receive assistance (which was most often on even-numbered trials early in practice), they preferred partial assistance over fully worked examples. Taken together, the SRL and faded SRL results indicate that participants underutilize worked examples and instead prefer some degree of problem solving autonomy. 4. Experiment 3 In Experiment 3, we further examined participants' control of their learning by replicating four of the five groups used in Experiment 2 (the SRL and faded SRL groups, along with their corresponding comparison groups, the WEPS and faded WEPS groups). In addition, we extended to evaluate participants’ beliefs about the effectiveness of different practice formats by including a post-experimental questionnaire. Measuring beliefs provides insight into which practice formats participants 8
Learning and Instruction xxx (xxxx) xxx–xxx
N.L. Foster et al.
Fig. 7. The mean selection percentages of worked examples and problem solving in the SRL group across serial position in the practice phase for Experiment 3.
conditions were asked the same selection-based questions, but the questions were framed in terms of participants’ selection preferences “if they had been given a choice.” The full questionnaire for each group is included in Appendix C.
preferred, which formats they believed were the most effective, and which formats the participants judged as most difficult. 4.1. Method 4.1.1. Participants and design One hundred sixty-four participants from Kent State University participated in exchange for partial course credit. Participants were randomly assigned to WEPS, SRL, faded WEPS, and faded SRL groups. We excluded 25 participants who scored higher than 60% on the pretest and an additional 3 who spent fewer than 10 s on more than half of the pretest, practice, or final test trials. The remaining sample included N = 136 participants, with N = 33 for the WEPS group, N = 36 for the SRL group, N = 36 for the faded WEPS group, and N = 31 for the faded SRL group.
4.2. Results and discussion 4.2.1. Selection behavior in SRL and faded SRL groups The right panel of Fig. 3 provides the percentage of trials selected as worked examples and problem solving for the SRL group and those selected as worked examples, partial formats, and problem solving in the faded SRL group. The selection pattern closely resembles what was observed in Experiment 2. Namely, for the SRL group, the majority of trials were selected for problem solving (72%). For the faded SRL group, selection percentages for problem solving and for partial formats were similar (47% and 40%, respectively), and the worked-example selection percentage was considerably lower (13%). Inspection of Figs. 7 and 8 reveals the time course of these selection behaviors. The preference that the SRL group showed for problem solving was much like that observed in Experiments 1 and 2; the only exception was the absence of the preference reversal on Trial 2 observed in the previous two experiments. Similarly, the preference for problem solving and partial formats evident in the faded SRL group was maintained throughout the practice phase, much like what was observed in Experiment 2; the only exception was that we observed somewhat less alternation early in practice between preference for
4.1.2. Materials and procedure The materials and procedure were identical to those from Experiment 2, with the following exception. After completing the final test, participants completed a post-experimental questionnaire. The questionnaire was self-paced with no time limit and was designed to assess participants' preferences and judgments of effectiveness for the different trial formats they encountered during the practice phase. For the SRL groups, we also asked them to report which trial format they chose to begin the practice phase, which format they selected most often, and whether they ever selected a worked example. The non-SRL
Fig. 8. The mean selection percentages of worked examples, problem solving, and partial formats in the faded SRL group across serial position in the practice phase for Experiment 3.
9
Learning and Instruction xxx (xxxx) xxx–xxx
N.L. Foster et al.
Table 4 Responses to post-experimental questionnaire about preference, effectiveness, and difficulty regarding worked example (WE), problem solving (PS), and partial formats (Experiment 3). Preference
SRL Faded SRL WEPS Faded WEPS
Effectiveness
Difficulty
WE (%)
PS (%)
Partial (%)
WE
PS
Partial
WE
PS
Partial
61 6 85 36
39 39 15 14
n/a 55 n/a 50
4.97 5.00 4.94 4.81
4.25 4.32 3.85 4.72
n/a 5.65 n/a 5.64
3.56 2.35 3.82 3.31
5.17 4.23 5.18 4.06
n/a 2.74 n/a 3.08
Note: See Appendix B for copy of post-experimental questionnaire. Effectiveness scale ranged from 1 – not very effective at all to 7 – extremely effective. Difficulty scale ranged from 1 – not very difficult at all to 7 – extremely difficult.
when necessary. Another (non-exclusive) explanation is that participants may have believed that worked examples are less effective for learning and thus should not be used during practice. Furthermore, regardless of whether participants thought some formats were more effective than others, they may have believed that some types of formats are more difficult to use, and it is this perceived difficulty that drove selection. To answer these questions, we analyzed responses to the post-experimental questionnaire across all four groups, which assessed how participants rated the different formats in terms of preference, effectiveness, and difficulty. The results appear in Table 4. Interestingly, a general pattern emerged indicating that formats rated as less difficult were also rated as more effective and were ultimately rated as the preferred format during practice. More specifically, the format that was rated as least difficult by the SRL group—namely, worked examples—tended to be rated as more effective and was preferred the most, compared to problem solving. The WEPS group showed a similar pattern of ratings. One speculation about why solving a full problem was judged as less effective follows from the aforementioned SRL framework: Participants view solving problem as a means to monitor their progress and less so as a mean to learn how to solve a problem. When given the option to select between the three formats in the faded SRL group, the pattern changed somewhat. Although participants in this group rated worked examples as least difficult to use, they still believed that partial formats were the most effective and the most preferred. The faded WEPS group also rated partial formats as most effective and most preferred. Interestingly, the faded SRL participants’ responses about format preference mirrored their actual selection behavior during practice and may explain why they chose to use this format after a failed problem solving attempt. Namely, participants may believe that solving partial problems provides a double benefit: it allows them to monitor their progress and at the same time helps them to learn how to solve the problem. Of less importance, we next analyzed details of selection behavior as reported by participants on the post-experimental questionnaire. Perhaps SRL participants reported preferring worked examples because they misremembered which format they selected most often during practice. As shown in Table 5, however, this was not the case. SRL participants accurately reported selecting problem solving most often,
partial formats versus problem solving in Experiment 3. Despite overall preferences for problem solving in the SRL group, participants may have preferentially selected worked examples on trials following incorrect problem solving. Of the participants who selected problem solving for Trial 1 and answered incorrectly, 56% selected a worked example on Trial 2. Of the participants who selected problem solving for Trial 1 and answered correctly, 33% selected a worked example on Trial 2. Across all 24 trials in the practice phase, the probability of selecting a worked example following incorrect problem solving was significantly greater compared to the probability of selecting a worked example following correct problem solving, t(29) = 8.21, p < 0.001, d = 2.41 (see Table 3). These findings largely replicate the results of SRL groups from Experiments 1 and 2 and confirm the monitoring-affects-control hypothesis. We conducted the parallel set of analyses for the faded SRL group, which are presented in Table 3. Following incorrect problem solving across the practice phase, participants were significantly more likely to select partial formats compared to worked examples, t(26) = 2.89, p = 0.008, d = 0.89. When prior problem solving was correct, partial format and worked example selection rates dropped to near zero and were not significantly different from each other, t(23) = 0.41, p = 0.69, d = 0.12. We also examined selection behavior following incorrect versus correct partial formats. Following incorrect partial formats, participants selected partial formats significantly more often than worked examples, t(21) = 3.02, p = 0.007, d = 1.12. Following correct partial formats, selection of worked examples occurred significantly less often compared to selection of another partial format, t (21) = 5.54, p < 0.001, d = 1.95. These outcomes are consistent with those from Experiment 2 and beg the question, Why did participants prefer solving a partial problem (than studying a worked example) after incorrectly solving a problem? Preliminary answers to this question are available from participants’ responses to the post-experimental questionnaire. 4.2.2. Post-experimental beliefs about example-based learning formats If worked examples have been shown to be effective for learning, why did participants select these formats so infrequently? One explanation from the SRL framework is that participants first solve problems to evaluate their progress and then study a worked example only
Table 5 Responses to post-experimental questionnaire regarding self-reported use of worked example (WE), problem solving (PS), and partial formats (Experiment 3). Which did you select first?
WE
SRL Faded SRL WEPS Faded WEPS
39 23 79 61
PS
61 58 21 14
Which did you select most?
Partial
n/a 19 n/a 25
WE
25 6 48 19
PS
47 39 6 17
Note. Values represent percentages of participants who selected each response.
10
Ever Selected This Format? Partial
n/a 48 n/a 39
Equal
28 6 45 25
WE
Partial
Yes
No
Yes
No
97 74 97 72
3 26 3 28
n/a 77 n/a 81
n/a 23 n/a 19
Learning and Instruction xxx (xxxx) xxx–xxx
N.L. Foster et al.
interested in how worked examples can impact judgment accuracy (the monitoring component of metacognitive processes) whereas we investigated how students use worked examples (and problem solving) to guide their learning (a regulatory component of metacognitive processes). Accordingly, we focus the remainder of our General Discussion on students’ regulation of their problem solving. Despite college students' relatively low level of knowledge for solving probability problems, across all studies, those who regulated their learning did not typically begin by studying worked examples. Instead, participants were more likely to select problem solving initially (see Serial Position 1, Figs. 4–8) and throughout practice with relatively limited use of worked examples early in practice following incorrect problem solving attempts. This regulatory approach also seems intuitively appealing, especially given that the normatively effective approach itself is counterintuitive. That is, consistent with the SRL framework described in the introduction, participants may first attempt to solve a problem to evaluate whether they are able to solve it, and then if need be, choose a worked example as a form of restudy. Despite the intuitive appeal of participants' self-regulated approach, they typically underutilized the counterintuitive but normatively effective WEPS schedule. In summary, these outcomes indicate that participants’ selfregulated approach to learning how to solve probability problems is not consistent with the prescriptions from the literature to study examples first and then attempt to solve a problem. In contrast, students’ self-regulation was relatively effective when their options included partially worked examples. In particular, in Experiments 2 and 3, final test performance was superior when a faded schedule was used, regardless of whether the use of partial formats was controlled by the participants (faded SRL group) or the experimenter (faded WEPS group). However, because participants in the faded SRL group chose to problem solve more often than study a partial example, they may not have fully capitalized on the benefits of partial formats. An interesting new finding that will require further investigation is that when participants can use examples in regulating their learning, they perform better when they can use partial examples than when they do not have access to them. Namely, as shown in Table 2, final performance is higher for the SRL groups who had access to partial examples (0.46 and 0.48, Experiments 2 and 3, respectively) than when they had access to fully worked examples alone (0.37 and 0.39, Experiments 2 and 3, respectively)—although these comparisons did not meet conventional statistical significance given the correction for exploratory analyses (see Table B1, Appendix B). Besides further replicating this novel effect, one intriguing avenue for future research would be to evaluate the degree to which participants implement fading effectively to reach mastery of a particular kind of problem, and whether longterm retention is enhanced when students master problem solving when they do (vs. do not) have access to partial examples. Taken together, these outcomes further support the relative efficacy of using faded examples to enhance student problem solving and also suggest students can effectively use this kind of problem solving strategy without training. Other outcomes also indicate that students were using their ongoing monitoring of their performance to regulate their problem solving. As emphasized by the monitoring-affects-control hypothesis (Nelson & Leonesio, 1988), an empirical signature of monitoring-controlled learning is that students will be more likely to attempt to relearn material on which they had answered incorrectly (vs. correctly) during a prior test. In the present context, attempting to solve a problem is analogous to taking a test, and participants' choice of formats after attempting to solve problems conformed to this empirical signature. In particular, as shown in Table 3, participants were much more likely to use a worked example after they had incorrectly solved a problem than when they had correct solved it. Additionally, when participants had
suggesting they remembered their behavior during practice. Similarly, the faded SRL group reported selecting partial formats and problem solving formats most often. In general, SRL and faded SRL participants’ questionnaire responses also mirrored their selection behavior in regards to which format they selected first during practice. Most of these participants indicated starting practice with a problem solving format. Interestingly, however, when asked to indicate which format they would start practice if they had a choice, most WEPS and faded WEPS participants selected worked examples. Additionally, WEPS participants indicated that they would probably select worked examples most often overall. Although speculative, these findings suggest that the perceived utility of worked examples may increase only when participants are exposed to early and regular use of them, like in the WEPS group. 4.2.3. Pretest and practice phase accuracy The mean pretest scores are presented in the rightmost column of Table 1. A one-way ANOVA on pretest scores revealed no significant effect of group, F(3,132) = 0.36, MSE = 0.014, p = 0.77, η2 = 0.009. Practice phase accuracy is reported in Table 2. Holm-Bonferroni corrections tests showed that problem solving was better for the faded WEPS group compared to the WEPS group (p = 0.04). Problem solving was significantly better for the faded WEPS group than the SRL group (p = 0.01). Problem solving for the faded SRL group was not significantly different from the SRL group (p = 0.50) or the faded WEPS group (p = 0.50), or the WEPS (p = 0.78). Problem solving for the WEPS group was not significantly different than the SRL groups (p = 0.78). Accuracy for partial formats problems in the faded WEPS was not significantly different from the faded SRL groups, t(60) = 0.15, p = 0.88, d = 0.04. 4.2.4. Final test accuracy Final test accuracy is reported in Table 2. As predicted, final test performance was significantly greater for the faded WEPS group than the WEPS group, t(67) = 2.38, p = 0.01, d = 0.57 (one-tailed). For interested readers, exploratory comparisons among other groups are presented in Appendix B. 4.2.5. Summary The results of Experiment 3 replicated most key findings from Experiments 1 and 2. The SRL group preferred problem solving over worked examples, and the faded SRL group selected problem solving or partial formats more often than worked examples. On the post-experimental questionnaire, the SRL group rated worked examples as less difficult, more effective, and most preferred, compared to problem solving. Further inquiry indicated that participants in the SRL group were not misremembering which formats they selected, because participants in this group reported selecting problem solving most often. By contrast, the faded SRL group indicated a selection preference for partial formats and problem solving. 5. General discussion Along with Van Merriënboer et al. (2002), the present research is among the first to examine how students regulate their use of worked examples and problem solving while learning principle-based concepts. Accordingly, these studies contribute to a growing literature examining the metacognition of problem solving (for an overview, see Davidson & Sternberg, 1998), some of which has shown that using worked examples can improve students monitoring accuracy (a) when combined with problem solving (Baars, Van Gog, De Bruin, & Paas, 2014) or (b) when worked examples are presented in a partial format (Baars, Visser, Van Gog, De Bruin, & Paas, 2013). This latter research differs from the present work in that Baars and collaborators were
11
Learning and Instruction xxx (xxxx) xxx–xxx
N.L. Foster et al.
problem solving, worked examples, and partial formats? Given that only one other investigation has been conducted on self-regulation using worked examples (Van Merriënboer et al., 2002), these (and many other) important issues remain to be addressed. Second, our measure of final test performance (i.e., ability to complete probability problems) is commonly used in the WEPS literature, but future research should consider incorporating more fine-grained measures. In particular, performance is based on students understanding of the principles as well as their ability to accurately perform the appropriate procedures to solve the problem. Given that students could execute a procedure without necessarily understanding the underlying principle, an important avenue for future research on example-based learning of principles would involve including tests that independently assess students’ understanding of the principles (conceptual knowledge) in addition to evaluating their ability to execute the relevant procedures (procedural knowledge). Finally, future research should be designed to explore the degree to which various self-regulation decisions are associated with different learning outcomes. For example, when given the opportunity to select a format for each problem, participants may show unique study decisions (e.g., alternating selection of partial formats with problem solving during practice) that favor learning of principle-based concepts compared to study decisions that are less effective for learning (e.g., selecting all problem solving during practice). Many strategy profiles may arise that could interact with other factors (e.g., working memory capacity, prior knowledge, et cetera) in ultimately explaining final test performance. Large-scale individual differences studies will be needed to address such issues and represent an important new wave of research concerning how effectively students regulate how they learn to solve problems. In summary, the present experiments support several key conclusions. First, when given the options to solve problems or study worked examples, students may underutilize worked examples both with respect to the amount and timing of their use. Second, when students are learning to solve probability problems, incorporating partial formats benefits performance, regardless of whether learning is self regulated or regulated by a computer program. Finally, students appear to make sophisticated choices by studying examples after failed problem-solving attempts and by effectively interleaving partial formats with problem solving.
access to partial formats, they relied on them more than fully worked examples after failing to correctly solve a problem, perhaps because they believed they could simultaneously improve their learning and help monitor this improvement by selecting partial formats. If so, participants’ coordinated selection of the different formats based on problem-solving outcomes (to focus on unlearned problems) reflects a sophisticated strategy aimed at mastering problem solutions. Moreover, even though participants did not initiate practice by choosing a worked example, their self regulation in general was not haphazard and appears to be metacognitively driven as described by SRL frameworks and the monitoring-affects-control hypothesis. The present outcomes also have straightforward implications for instruction. Namely, given that students are responsible for much of their learning outside of class, they may need some guidance to most effectively regulate their learning of principles. For instance, students appeared to be most successful when they had access to partial examples, but in the present case, the partial examples were readily available via the computer program. In settings outside of the laboratory, worked examples may be available, so students should be reminded that the worked examples can be used in a faded format by simply covering the final steps of the solution. Moreover, initial study of worked examples can improve the efficiency of learning, with research showing that WEPS schedules require less practice time overall than schedules beginning with problem solving (e.g., Carroll, 1994; Kalyuga Chandler, Tuovinen, & Sweller, 2001; Sweller & Cooper, 1985; Sweller et al., 1990). Thus, given that students do not consistently study worked examples first, one recommendation is to encourage them to begin with a worked example before attempting to solve a partial example (or full problem). With these implications in mind, however, we must also note two limitations to this investigation: (a) it focused solely on learning to solve probability problems and (b) the outcome measures did not separate understanding principles from accurately executing the appropriate solution algorithms. We consider each in turn. First, we suspect that students' pattern of self regulation will be influenced by multiple factors, such as their prior knowledge and the overall difficulty of the problems. For instance, although most students' performance on the pretest was low (Table 1), it is evident that many students had some exposure to solving probability problems. For less familiar problems, students may be more likely to begin by studying examples before attempting to solve problems. Thus, although the current research provides foundational outcomes, more research will be needed to answer numerous questions: Will college students coordinate their use of these formats in the same manner when attempting to solve other kinds of problems? Will younger students demonstrate sophisticated, goal-oriented problem solving when they have the opportunity to coordinate
Acknowledgment We thank Dr. Andrew Tonge (Chair, Department of Mathematical Sciences at Kent State University) for providing feedback on the probability problems used in this research.
Appendix A Table A1 Examples of probability problems. Relevant Order: When driving to work, Ms. Fast has to pass the same traffic light twice—once in the morning and once in the evening. It is green in 70% of these cases. What is the probability that she can pass through a green light in the morning and has to stop in the evening? Bowling a perfect game (a “300”) is a normal thing for Pro Bowler Janet Jones because it happens 4/5 of the time. What is the probability that, in two games, Janet rolls a 300 only on the second game? Irrelevant Order: A pack of skat cards (32 cards) contains 8 cards of ‘clubs’, ‘spades’, ‘hearts’, and ‘diamonds’, respectively. You take a card, put it back in the pack, and take another card. What is the probability that only one of these cards will be ‘hearts’? Serenity is playing dice with friends in her parents' basement. Each die has eight sides. What is the probability that when she rolls, one die rolls to a 1 and another die rolls to a 5 or a 6?
12
Learning and Instruction xxx (xxxx) xxx–xxx
N.L. Foster et al.
Appendix B Table B1 Inferential outcomes and effect sizes for exploratory comparisons on final test performance (for mean values presented in Table 2).
Experiment 1 PS vs. SRL WEPS vs. SRL Experiment 2 PS vs. SRL WEPS vs. SRL fSRL vs. SRL fWEPS vs. fSRL Experiment 3 fSRL vs. SRL fWEPS vs. fSRL
t
df
p
Cohen's d
0.54 1.93
67 66
0.59 0.12
0.13 0.47
0.04 0.37 1.48 0.95
66 74 72 74
1.00 1.00 0.56 1.00
0.01 0.08 0.34 0.22
1.40 0.23
65 65
0.34 0.81
0.34 0.06
Note. See text for inferential outcomes and effect sizes for all planned comparisons. Acronyms for groups: PS = problem solving; WEPS = work examples, problem solving; SRL = self-regulated learning. Prefix of “f” = faded version of that group. P-values are Holm-Bonferroni corrected. See text for details.
Appendix C Table C1 Instructions for post-experimental questionnaire in Experiment 3. Phase 1: (WEPS and SRL groups) 1.) For some problems during the practice phase the answer was provided for you along with all the steps required to derive that answer (i.e., the Example Problem format). On other problems, you were only given the problem question and you had to figure out how to solve it on your own (the Conventional Problem format). Which of these two formats did you prefer? 2.) On a scale of 1–7, with 1 being “Extremely Ineffective” and 7 being “Extremely Effective”, how effective do you think the Example Problem format was for learning how to solve probability problems? 3.) On a scale of 1–7, how effective do you think the Conventional Problem format was for how learning to solve probability problems? 4.) On a scale of 1–7, with 1 being “Extremely Easy” and 7 being “Extremely Difficult”, how difficult was it for you to understand how to solve the probability problems when they were presented in the Example Problem format? 5.) On a scale of 1–7, how difficult was it for you to understand how to solve the probability problems when they were presented in the Conventional Problem format? Phase 1: (Faded WEPS and Faded SRL groups) 1.) For some problems during the practice phase the answer was provided for you along with all the steps required to derive that answer (i.e., the Example Problem format). On other problems, you were only given the problem question and you had to figure out how to solve it on your own (the Conventional Problem format). Finally, some problems were provided with the steps to solve them except for the final calculation (i.e., the Partial Problem format). Which of these two (or three) formats did you prefer? 2.) On a scale of 1–7, with 1 being “Extremely Ineffective” and 7 being “Extremely Effective”, how effective do you think the Example Problem format was for learning how to solve probability problems? 3.) On a scale of 1–7, how effective do you think the Conventional Problem format was for how learning to solve probability problems? 4.) On a scale of 1–7, how effective do you think the Partial Problem format was for how learning to solve probability problems? 5.) On a scale of 1–7, with 1 being “Extremely Easy” and 7 being “Extremely Difficult”, how difficult was it for you to understand how to solve the probability problems when they were presented in the Example Problem format? 6.) On a scale of 1–7, how difficult was it for you to understand how to solve the probability problems when they were presented in the Conventional Problem format? 7.) On a scale of 1–7, how difficult was it for you to understand how to solve the probability problems when they were presented in the Partial Problem format? Phase 2: (WEPS group) 1.) During the practice phase, the computer controlled the order of the presentation of each problem and the format it was presented in. If you had been given a choice, which type of problem format—Example Problem or Conventional Problem—would you have selected for the FIRST problem of the practice phase? Why would you have chosen to begin with that format? 2.) During the practice phase, the computer controlled the order of the presentation of each problem and the format it was presented in. If you had been given a choice, which format option would you have selected the most often, the Example Problem format, the Conventional Problem format, or would you have used the two formats equally often? Why would you have done that? 3.) During the practice phase, the computer controlled the order of the presentation of each problem and the format it was presented in. If you had been given a choice, would you have ever selected an Example Problem format at any point in time during the practice phase? If so, when during the practice phase would you have selected an Example Problem format? Why? If not, why not? Phase 2: (for SRL group) 1.) For the first problem of the practice phase, did you decide to begin with an Example Problem or a Conventional Problem? Why did you choose to begin with this format?
13
Learning and Instruction xxx (xxxx) xxx–xxx
N.L. Foster et al.
2.) During the practice phase, which format option did you select the most often, the Example Problem format, the Conventional Problem format, or did you select them equally often? Why did you do that? 3.) Did you ever select an Example Problem format? If so, when during the practice phase did you tend to select the Example Problem format? Why? If not, why not? Phase 2: (Faded WEPS group) 1.) During the practice phase, the computer controlled the order of the presentation of each problem and the format it was presented in. If you had been given a choice, which type of problem format—Example Problem, Conventional Problem, or Partial Problem—would you have selected for the FIRST problem of the practice phase? Why would you have chosen to begin with that format? 2.) During the practice phase, the computer controlled the order of the presentation of each problem and the format it was presented in. If you had been given a choice, which format option would you have selected the most often, the Example Problem format, the Conventional Problem format, the Partial Problem format, or would you have used the three formats equally often? Why would you have done that? 3.) During the practice phase, the computer controlled the order of the presentation of each problem and the format it was presented in. If you had been given a choice, would you have ever selected an Example Problem format at any point in time during the practice phase? If so, when during the practice phase would you have selected an Example Problem format? Why? If not, why not? 4.) During the practice phase, the computer controlled the order of the presentation of each problem and the format it was presented in. If you had been given a choice, would you have ever selected a Partial Problem format at any point in time during the practice phase? If so, when during the practice phase would you have selected a Partial Problem format? Why? If not, why not? Phase 2: (for Faded SRL group) 1.) For the first problem of the practice phase, did you decide to begin with an Example Problem, a Conventional Problem, or a Partial Problem? Why did you choose to begin with this format? 2.) During the practice phase, which format option did you select the most often, the Example Problem format, the Conventional Problem format, the Partial Problem format, or did you select them equally often? Why did you do that? 3.) Did you ever select an Example Problem format? If so, when during the practice phase did you tend to select the Example problem format? Why? If not, why not? 4.) Did you ever select a Partial Problem format? If so, when during the practice phase did you tend to select the Partial Problem format? Why? If not, why not?
Holm, S. (1979). A simple sequential rejective method procedure. Scandinavian Journal of Statistics, 6, 65–70. Kalyuga, S. (2007). Expertise reversal Effect and its implications for learner-tailored instruction. Educational Psychology Review, 19, 509–539. Kalyuga, S., Chandler, P., Tuovinen, J., & Sweller, J. (2001). When problem solving is superior to studying worked examples. Journal of Educational Psychology, 93, 579–588. Kalyuga, S., & Sweller, J. (2004). Measuring knowledge to optimize cognitive load factors during instruction. Journal of Educational Psychology, 96, 558–568. Kissane, M., Kalyuga, S., Chandler, P., & Sweller, J. (2008). The consequences of fading instructional guidance on delayed performance: The case of financial services training. Educational Psychology, 28, 809–822. Kornell, N., & Bjork, R. A. (2007). The promise and perils of self-regulated study. Psychonomic Bulletin & Review, 14, 219–224. Leppink, J., Paas, F., Van Gog, T., Van Der Vleuten, C. P. M., & Van Merriënboer, J. J. G. (2014). Effects of pairs of problems and examples on task performance and different types of cognitive load. Learning and Instruction, 30, 32–42. Lishner, D. A. (2015). A concise set of core recommendations to promote the dependability of psychological research. Review of General Psychology, 19, 52–68. Maner, J. K. (2014). Let's put our money where our mouth is. If authors are to change their ways, reviewers (and editors) must change with them. Perspectives in Psychological Science, 9, 343–351. Metcalfe, J., & Kornell, N. (2005). A Region of Proximal Learning model of study time allocation. Journal of Memory and Language, 52, 463–477. Murphy, M. D., Schmitt, F. A., Caruso, M. J., & Sanders, R. E. (1987). Metamemory in older adults: The role of monitoring in serial recall. Psychology and Aging, 2, 331–339. Mwangi, W., & Sweller, J. (1998). Learning to solve compare word problems. The effect of example format and generating self-explanations. Cognition and Instruction, 16, 173–199. Nelson, T. O., & Leonesio, R. J. (1988). Allocation of self-paced study time and the 'laborin-vain effect'. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 476–486. Nelson, T. O., & Narens, L. (1990). Metamemory: A theoretical framework and new findings. In G. H. Bower (Vol. Ed.), The psychology of learning and motivation: Vol 26, (pp. 125–141). San Diego, CA: Academic Press. Nelson, T. O., & Narens, L. (1994). Why investigate metacognition? In J. Metcalfe, & A. P. Shimamura (Eds.). Metacognition: Knowing about knowing (pp. 1–25). Cambridge, MA: MIT Press. Paas, F., Renkl, A., & Sweller, J. (2003). Cognitive load theory and instructional design: Recent developments. Educational Psychologist, 38, 1–4. Paas, F., Van Gog, T., & Sweller, J. (2010). Cognitive load theory: New conceptualizations, specifications, and integrated research perspectives. Educational Psychology Review, 22, 115–121. Reisslein, J., Atkinson, R. K., Seeling, P., & Reisslein, P. (2006). Encountering the expertise reversal effect with a computer-based environment on electrical circuit analysis. Learning and Instruction, 16, 92–103. Renkl, A., Atkinson, R. K., & Große, C. S. (2004). How fading worked solution steps works—a cognitive load perspective. Instructional Science, 32, 59–82.
References Atkinson, R. K., Derry, S. J., Renkl, A., & Wortham, D. (2000). Learning from examples: Instructional principles from the worked examples research. Review of Educational Research, 70, 181–214. Baars, M., Van Gog, T., De Bruin, A., & Paas, F. (2014). Effects of problem solving after worked example study on primary school children's monitoring accuracy. Applied Cognitive Psychology, 28, 382–391. Baars, M., Visser, S., Van Gog, T., De Bruin, A., & Paas, F. (2013). Completion of partially worked-out examples as a generation strategy for improving monitoring accuracy. Contemporary Educational Psychology, 38, 395–406. Carroll, W. M. (1994). Using worked examples as an instructional support in the algebra classroom. Journal of Educational Psychology, 86, 360–367. Cohen, J. (1988). Statistical power analysis for the behavioural sciences. New York, NY: Academic. Cooper, G., & Sweller, J. (1987). Effects of schema acquisition and rule automation on mathematical problem-solving transfer. Journal of Educational Psychology, 79, 347–362. Cortina, J. M., & Nouri, H. (2000). Effect size for ANOVA designs. Thousand Oaks, CA: Sage. Davidson, J., & Sternberg, R. (1998). Smart problem solving: How metacognition helps. In D. J. Hacker, J. Dunlosky, & A. C. Graesser (Eds.). Metacognition in educational theory and practice (pp. 47–68). Mahwah, NJ, US: Lawrence Erlbaum Associates Publishers. Dunlosky, J., & Ariel, R. (2011). Self-regulated learning and the allocation of study time. B. Ross (Ed.). Psychology of Learning and Motivation, 54, 103–140. Durkin, K., & Rittle-Johnson, B. (2012). The effectiveness of using incorrect examples to support learning about decimal magnitude. Learning and Instruction, 22(3), 206–214. http://dx.doi.org/10.1016/j.learninstruc.2011.11.001. Faul, F., Erdfelder, E., Lang, A. G., & Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavioral Research Methods, 39, 175–191. Fyfe, E. R., DeCaro, M. S., & Rittle-Johnson, B. (2015). When feedback is cognitivelydemanding: The importance of working memory capacity. Instructional Science, 43(1), 73–91. Gaetano, J. (2013). Holm-Bonferroni sequential correction: An EXCEL calculator (1.2) [Microsoft Excel workbook]. http://dx.doi.org/10.13140/RG.2.1.3920.0481. Retrieved from https://www.researchgate.net/publication/242331583_HolmBonferroni_Sequential_Correction_An_EXCEL_Calculator_-_Ver._1.2. Greene, J. A., Dellinger, K. R., Tüysüzoğlu, B. B., & Costa, L. (2013). A two-tiered approach to analyzing self-regulated learning data to inform the design of hypermedia learning environments. In R. Azevedo, & V. Aleven (Eds.). International handbook of metacognition and learning technologies (pp. 117–128). New York, NY: Springer. Groβe, C. S., & Renkl, A. (2007). Finding and fixing errors in worked examples: Can this foster learning outcomes? Learning and Instruction, 17(6), 612–634. Hartwig, M. K., & Dunlosky, J. (2012). Study strategies of college students: Are selftesting and scheduling related to achievement? Psychonomic Bulletin & Review, 19, 126–134.
14
Learning and Instruction xxx (xxxx) xxx–xxx
N.L. Foster et al.
Science, 12, 257–285. Sweller, J. (1994). Cognitive load theory, learning difficulty, and instructional design. Learning and Instruction, 4, 295–312. Sweller, J., Chandler, P., Tierney, P., & Cooper, M. (1990). Cognitive load as a factor in the structuring of technical material. Journal of Experimental Psychology: General, 119, 176–192. Sweller, J., & Cooper, G. A. (1985). The use of worked examples as a substitute for problem solving in learning algebra. Cognition and Instruction, 2, 59–89. Van Gog, T., & Kester, L. (2012). A test of the testing effect: Acquiring problem-solving skills from worked examples. Cognitive Science, 36, 1532–1541. Van Gog, T., Kester, L., & Paas, F. (2011). Effects of worked examples, example-problem, and problem-example pairs on novices‟ learning. Contemporary Educational Psychology, 36, 212–218. Van Gog, T., Paas, F., & Van Merriënboer, J. J. G. (2006). Effects of process-oriented worked examples on troubleshooting transfer performance. Learning and Instruction, 16, 154–164. Van Merriënboer, J. J. G., Schuurman, J. G., de Croock, M. B. M., & Paas, F. G. W. C. (2002). Redirecting learners' attention during training: Effects on cognitive load, transfer test performance and training efficiency. Learning and Instruction, 12, 11–37. Ward, M., & Sweller, J. (1990). Structuring effective worked examples. Cognition and Instruction, 7, 1–39. Winne, P. H., & Hadwin, A. F. (1998). Studying as self-regulated learning. In D. J. Hacker, J. Dunlosky, & A. C. Graesser (Eds.). Metacognition in educational theory and practice (pp. 277–304). Hillsdale, NJ: Lawrence Erlbaum.
Renkl, A., Atkinson, R. K., Maier, U. H., & Staley, R. (2002). From example study to problem solving: Smooth transitions help learning. Journal of Experimental Education, 70, 293–315. Renkl, A., Berthold, K., Große, C. S., & Schwonke, R. (2013). Making better use of multiple representations: How fostering metacognition can help. In R. Azevedo, & V. Aleven (Eds.). International handbook of metacognition and learning technologies (pp. 397–408). New York, NY: Springer. Retnowati, E., Ayres, P., & Sweller, J. (2010). Worked example effects in individual and group work settings. Educational Psychology, 30, 349–367. Rourke, A., & Sweller, J. (2009). The worked-example effect using ill-defined problems: Learning to recognise designers' styles. Learning and Instruction, 19, 185–199. Schmidt, S. (2009). Shall we really do it again? The powerful concept of replication is neglected in the social sciences. Review of General Psychology, 13, 90–100. Schmidt-Weigand, F., Hanze, M., & Wodzinski, R. (2009). Complex problem solving and worked examples: The role of prompting strategic behavior and fading-in solution steps. German Journal of Educational Psychology, 23, 129–138. Schwonke, R., Renkl, A., Krieg, C., Wittwer, J., Aleven, V., & Salden, R. (2009). The worked-example effect: Not an artifact of lousy control conditions. Computers in Human Behavior, 25, 258–266. Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. New York, NY: Houghton Mifflin Company. Simons, D. J. (2014). The value of direct replication. Perspectives on Psychological Science, 9, 76–80. Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive
15