Types of reasoning required in university exams in mathematics


Journal of Mathematical Behavior 26 (2007) 348–370

Ewa Bergqvist
Department of Mathematics and Mathematical Statistics, Umeå University, SE-90187 Umeå, Sweden

Abstract

Empirical research shows that students often use reasoning founded on copying algorithms or recalling facts (imitative reasoning) when solving mathematical tasks. Research also indicates that a focus on this type of reasoning might weaken the students' understanding of the underlying mathematical concepts. It is therefore important to study the types of reasoning students have to perform in order to solve exam tasks and pass exams. The purpose of this study is to examine what types of reasoning students taking introductory calculus courses are required to perform. Tasks from 16 exams produced at four different Swedish universities were analyzed and sorted into task classes. The analysis resulted in several examples of tasks demanding different types of mathematical reasoning. The results also show that about 70% of the tasks were solvable by imitative reasoning and that 15 of the exams could be passed using only imitative reasoning.
© 2007 Elsevier Inc. All rights reserved.

Keywords: Mathematical reasoning; Algorithms; Creativity; Assessment; Tests; Calculus exams; University level mathematics

1. Introduction

The purpose of this study is to examine the reasoning that Swedish university students in mathematics have to perform in order to solve exam tasks and pass exams. The study is founded on empirical research indicating that students primarily choose¹ imitative reasoning,² i.e. reasoning founded on recalling answers or remembering algorithms, e.g. Tall (1996), Palm (2002), Lithner (2003). The use of algorithms is a vital part of learning mathematics and can provide a foundation for understanding. It can also relieve the mathematician or the student of tedious calculations that demand an unnecessarily large amount of time. However, too narrow a focus on algorithms and routine-task solving might cut students off from other parts of the mathematical universe that also need attention, e.g. problem solving and deductive reasoning. Several researchers have shown how students who work with algorithms seem to focus solely on remembering the steps, and many argue that this weakens the students' understanding of the underlying mathematics, see e.g. Leinwand (1994), McNeal (1995), Kamii and Dominick (1997). In a case study by Brayer Ebby (2005), the author followed a student during three school years, from grade two to grade four of elementary school. Brayer Ebby concludes from her findings that "for some children, learning to use the algorithm procedurally actually prevents them from learning more powerful mathematical concepts" (p. 85). Similarly, Pesek and Kirshner (2000) conclude from a study that initial rote learning of a concept may interfere with later relational learning. Lithner (2004) argues that students who spend most of their time copying solutions to mathematical routine tasks (from examples in the textbook) can only develop resources connected to surface mathematical areas.

E-mail address: [email protected].
¹ In this context the word choose does not necessarily mean that the students make a conscious and well-considered selection between methods, but just as well that they have a subconscious preference for certain types of procedures.
² The concepts of imitative and creative reasoning are thoroughly defined in Section 2.

These observations and arguments show a connection between a narrow focus on imitative reasoning and students' difficulties when learning mathematics. Most of the international results mentioned concern children in school years K-9, often in situations concerning arithmetic calculations. The situation at the university level, especially in Sweden, is not thoroughly studied, and there are still many important questions to be answered. One question related to this situation is why students focus on this type of limitedly successful reasoning. Hiebert (2003) argues that students learn what they are given the opportunity to learn and that the students' learning is connected to the activities and processes they are engaged in. It is therefore important to examine the different types of reasoning that the students perform during their studies.

Empirical studies indicate that the students encounter creative reasoning to a very low extent, e.g. in the textbook (Lithner, 2004). Teachers also often argue that relational instruction is more time-consuming than instrumental instruction (Hiebert & Carpenter, 1992; Skemp, 1978). In addition, students are not required to perform more creative types of reasoning. Flemming and Chambers (1983) analyzed 8800 teacher-made test questions and showed that 80% of the tasks were at the "knowledge-level." Similarly, Senk, Beckmann, and Thompson (1997) coded more than 100 teacher-made tests and classified items as requiring reasoning if they demanded justification, explanation, or proof. The analysis showed that, on average, 5% of the test items demanded reasoning (varying from 0 to 15%). The authors also report that most of the analyzed test items tested low level thinking, i.e. that the items either tested the students' previous knowledge, or were possible for the students to solve in one or two steps. The percentage of tasks requiring more creative types of reasoning has been examined in Swedish upper secondary school tests (Palm, Boesen, & Lithner, 2005). The results show that the students were required to perform creative reasoning in between 7% and 24% of the tasks depending on study program. All in all, this indicates that students primarily get the opportunity to learn imitative reasoning and that they are not required to perform more creative types of reasoning.

There are several reasons why exams in general, as a part of students' environment when learning mathematics, are important to study. The exams provide occasions when students, usually very actively, engage in mathematics by solving tasks, and several studies show that assessments, in general, influence the way students study (Kane, Crooks, & Cohen, 1999). The important role of the course exams at university level in Sweden is described in Section 3. The knowledge and skills that a passing student will be equipped with after the course are directly related to what is measured by the exam. This "threshold" function of the exams shows that their design has practical consequences for both teachers and students. Also, on a more abstract level, the exams assess what the teachers – professional mathematicians – see as relevant contents and competences. The design of the exams may therefore affect the students' beliefs concerning mathematics, e.g. what type of reasoning is important within mathematics.
The need for more knowledge concerning students' reasoning at university level, combined with the crucial role of the exams at this level, leads to several possible and important research questions. The aim of this study is to examine the reasoning that Swedish university students in mathematics have to perform in order to solve exam tasks and pass exams. The first research question, therefore, concerns the types of reasoning that the students have to perform when solving exam tasks. The second research question concerns the extent to which it is possible for the students to solve tasks by applying imitative reasoning, i.e. how successful can they be without using more creative types of reasoning? The research questions are examined through a classification of more than 200 introductory calculus exam tasks and an analysis of their solutions. The classification is founded on a research framework presented in the following section (Lithner, in press). The results of this study may contribute to the knowledge concerning the students' learning environments, especially concerning assessment and the connection to learning difficulties.

The study is a part of a large project that spans several years and is being carried out by the research group in mathematics education at the Department of Mathematics and Mathematical Statistics at Umeå University in Sweden. Different subprojects treat the characteristics, causes, and measures to be taken concerning learning difficulties in mathematics at different levels of education.

2. The conceptual framework

There are several theoretical comprehensive frameworks that describe mathematical competences, e.g. NCTM (2000), Niss and Jensen (2002), but not many frameworks specifically aim at characterizing mathematical reasoning. In fact, the concept of reasoning is seldom explicitly defined, although it is often used to denote some kind of high-quality thinking process. Lithner (in press) presents a conceptual framework that particularly considers and defines different types of mathematical reasoning.


Fig. 1. Overview of reasoning types in the framework.

This framework is appropriate for this study for several reasons. First, it provides a basis for the design of research studies concerning mathematical reasoning. Second, it uses a very wide definition of reasoning. Many similar frameworks primarily consider strict proofs, which would be too narrow for this study. Third, the framework is firmly founded in empirical data, and explicitly defines the used concepts at a satisfactory level.

The framework (Lithner, in press) defines reasoning as "the line of thought adopted to produce assertions and reach conclusions in task solving" (p. 3). Lithner points out that, with this definition, a line of thought is called reasoning even if it is incorrect, as long as there is some kind of sensible (to the reasoner) reason behind the thinking. Within the framework, the quality of a specific case of reasoning is described by various reasoning characteristics. Lithner remarks that reasoning can either be seen as thinking processes, as the result of such processes, or as both. Since the framework aims at providing a basis for analysis of data, reasoning is mainly regarded as the result of the involved thinking processes, i.e. the sequence of reasoning that starts in a task and ends in an answer. The framework is based on many analyses of sequences of reasoning, several of which are presented in empirical studies (Lithner, 2000, 2003, 2004). Each analysis particularly identifies strategy choices and strategy implementations, concepts used to define the different types of reasoning presented in the following subsections (and explicitly defined in Lithner, in press). During the empirical studies, two basic types of mathematical reasoning were identified and defined: creative mathematically founded reasoning and imitative reasoning. An overview of these and the additional reasoning types in the framework is presented in Fig. 1. The term memorized reasoning might be considered problematic since the concept of reasoning is, as mentioned, often used in the research literature to denote some kind of high-quality chain of thought. Since Lithner (in press) defines reasoning as any way of thinking that concerns task solving, it does not have to be based on formal deductive logic, and can denote even such simple procedures as recalling facts (i.e. memorized reasoning).

2.1. Creative mathematically founded reasoning

Creative mathematically founded reasoning is in the framework seen as a product of creative mathematical thinking. Such creative thinking processes are in this context characterized by flexibility, the admission of different approaches, and that they are not hindered by fixation (cf. Haylock, 1997; Silver, 1997). Creativity is, in Lithner (in press), not primarily associated with geniality or superior thinking, but with the creation of new and reasonably well-founded task solutions. The reasoning characteristics specific for creative mathematically founded reasoning are therefore novelty, plausibility, and mathematical foundation.


To be called creative mathematically founded reasoning, the reasoning sequence in a solution has to fulfill the following conditions. The reasoning sequence:

• is new to the reasoner (novelty),
• contains strategy choices and/or implementations supported by arguments that motivate why the conclusions are true or plausible (plausibility), and that are anchored in intrinsic mathematical properties of the components involved in the reasoning (mathematical foundation).

Creative mathematically founded reasoning is, for simplicity, usually only denoted creative reasoning and is abbreviated CR.

2.1.1. Plausibility

Inspired by Pólya (1954) and his discussion of strict reasoning vs. plausible reasoning, plausibility is, in the framework, used to describe reasoning that is supported by arguments that are not necessarily as strict as in proof. The quality of the reasoning is connected to the context in which it is produced. A lower secondary school student who argues for an equality by producing several numerical examples confirming its validity might be seen as performing quite high quality reasoning, while the same reasoning produced by a university student would be considered of poor quality.

2.1.2. Mathematical foundation

The arguments used to show that a solution is plausible can be more or less well founded. The framework defines task components to be objects (e.g. numbers, functions, and matrices), transformations (e.g. what is being done to an object), and concepts (e.g. a central mathematical idea built on a related set of objects, transformations, and their properties). A component is then seen to have a mathematical property if the property is accepted by the mathematical society as correct. The framework further distinguishes between intrinsic properties – central to the problematic situation – and surface properties – having no or little relevance – of a particular context. An example: Try to determine if an attempt to bisect an angle is successful. The visual appearance of the size of the two angles is a surface property of these two components, while the formal congruency of the triangles is an intrinsic property. The empirical studies showed that one of the main reasons for students' difficulties was their focus on surface properties (Lithner, 2003). Schoenfeld (1985) obtained similar results: novices often use 'naive empiricism' to verify geometrical constructions, i.e. a construction is good if it 'looks good' (as in the example with the bisected angle mentioned above). In the reasoning framework, a solution has mathematical foundation if the argumentation is based on intrinsic mathematical properties of the involved components.

2.1.3. An example of a task demanding CR

Whether a task demands CR of students or not is directly connected to what type of tasks the students have practiced solving. Suppose that a group of students have studied the concept of continuity in a calculus course. They have seen examples of both continuous and discontinuous functions, they have studied the textbook theory definitions, and they have been to lectures listening to more informal descriptions of the concept. The students have also solved several exercises asking them to determine whether a function is continuous or not. But if the students have never encountered a task asking the solver to construct an example of a function with certain continuity properties, such a task would demand that they use creative reasoning in order to solve it. A CR task, for these particular students, could then be: "Give an example of a function f that is right continuous, but not left continuous, at x = 3."
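One possible answer, added here only as an illustration (any function with the stated one-sided behaviour would do), is the step function

\[
f(x) =
\begin{cases}
0, & x < 3,\\
1, & x \ge 3,
\end{cases}
\]

for which \(\lim_{x \to 3^{+}} f(x) = 1 = f(3)\) while \(\lim_{x \to 3^{-}} f(x) = 0 \neq f(3)\), so f is right continuous but not left continuous at x = 3.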
2.2. Imitative reasoning

Imitative reasoning can be described as a type of reasoning built on copying task solutions, e.g. by looking at a textbook example or through remembering an algorithm or an answer. An answer is defined as "a sufficient description of the properties asked for in the task." A solution to a task is an answer together with arguments supporting the correctness of the answer. Both the answer and the solution formulations depend on the particular situation where the task is posed, e.g. in a textbook exercise, in an exam task, or in a real life situation. Several different types of imitative reasoning have been characterized during empirical studies, and they are summarized in the two main classes: memorized and algorithmic reasoning. These two main classes will be presented


here, together with two versions of algorithmic reasoning: familiar algorithmic reasoning and delimiting algorithmic reasoning. These versions are relevant because they are possible to use during the exams examined in this study.

2.2.1. Memorized reasoning

The reasoning in a task solution is denoted memorized reasoning if it fulfills the following conditions:

• The strategy choice is founded on recalling a complete answer by memory.
• The strategy implementation consists only of writing down (or saying) the answer.

Typical tasks solvable by memorized reasoning are tasks asking for facts, e.g. "What is the name of the point of intersection of the x- and y-axes in a coordinate system?" A task asking the students to prove an advanced theorem is only possible for the students to solve using memorized reasoning if they are informed in advance that the proof might be asked for during the exam. An important interpretation of the definition is that the different parts of a solution based on memorized reasoning could mistakenly be written down in the wrong order, since the different parts do not depend on each other. This is sometimes visible e.g. when students have memorized a proof of a theorem and get the order of the statements mixed up in the presentation. Even long and quite complicated proofs, like the proof of the Fundamental Theorem of Calculus (FTC), are possible to solve through memorized reasoning. In a study described in the reasoning framework, many students who managed to correctly prove the FTC during an exam were afterwards only able to explain a few of the following six equalities included in the proof:

\[
F'(x) = \lim_{h \to 0} \frac{F(x+h) - F(x)}{h}
      = \lim_{h \to 0} \frac{1}{h}\left( \int_a^{x+h} f(t)\,dt - \int_a^{x} f(t)\,dt \right)
      = \lim_{h \to 0} \frac{1}{h} \int_x^{x+h} f(t)\,dt
      = \lim_{h \to 0} \frac{1}{h}\, h\, f(c)
      = \lim_{c \to x} f(c) = f(x)
\]

The students' inability to explain these equalities, most of which are elementary in comparison to the argumentation within the proof, implies that they memorized the proof and did not understand it. Memorized reasoning is abbreviated MR.

2.2.2. Algorithmic reasoning

An algorithm, according to Lithner (in press), is a "set of rules that if followed will solve a particular task type." An example is the standard formula for solving a quadratic equation. The definition also includes clearly defined sequential procedures that are not purely calculational, e.g. using a graphing calculator to approximate the solution to an equation through zooming in on the intersection. Even though these procedures are to some extent memorized, there are several differences between algorithmic and memorized reasoning. The most apparent difference is that a student performing memorized reasoning has completely memorized the solution, while a student using algorithmic reasoning memorizes the difficult steps of a procedure and then performs the easy ones. That algorithmic reasoning is dependent on the order of the steps in the solution also separates algorithmic from memorized reasoning, where different parts could mistakenly be written down in the wrong order. The reasoning in a task solution is denoted algorithmic reasoning if it fulfils the following conditions:

• The strategy choice is founded on recalling by memory a set of rules that will guarantee that a correct solution can be reached.
• The strategy implementation consists of carrying out trivial³ (to the reasoner) calculations or actions by following the set of rules.

³ The word trivial is basically used to denote lower level mathematics, i.e. standard mathematical contents from the previous stages of the students' studies.
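For reference, the "standard formula for solving a quadratic equation" mentioned above is the familiar rule (a standard fact restated here, not quoted from the framework):

\[
ax^2 + bx + c = 0,\; a \neq 0 \quad\Longrightarrow\quad x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a},
\]

a set of rules that, if followed, solves every task of this type whether or not the solver understands why it works.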


Algorithmic reasoning is a reliable solution method in cases of routine task solving when the reasoner has encountered and used the algorithm several times and is completely sure of what to do. Still, the studies mentioned earlier indicate that students also try, unsuccessfully, to use algorithmic reasoning in problem solving situations. Since professional mathematicians frequently use algorithms, using algorithmic reasoning is not in itself a sign of lack of understanding. The use of algorithms saves time for the reasoner and minimizes the risk of miscalculations, since the strategy implementation only consists of carrying out trivial calculations. Algorithmic reasoning is, however, possible to perform without any understanding of the intrinsic mathematics. Algorithmic reasoning is abbreviated AR.

2.2.2.1. Familiar algorithmic reasoning. In familiar algorithmic reasoning (FAR), the task is identified as being of a familiar type, i.e. belonging to a familiar set of tasks that all can be solved by the same known algorithm. An uncomplicated version of FAR, described by Hegarty, Mayer, and Monk (1995), is when elementary school pupils use a "key word strategy." This happens when a number of textbook tasks contain either the word "more" or the word "less" and the student is supposed to use either the addition algorithm or the subtraction algorithm: "Tom has 4 marbles and Lisa has 3 more than Tom. How many marbles does Lisa have?" If a student's choice between the two algorithms is based solely on which word appears in a particular task, the student is using the key word strategy and FAR.

2.2.2.2. Delimiting algorithmic reasoning. A student using delimiting algorithmic reasoning (DAR) chooses the algorithm from a set of algorithms that the reasoner has delimited through the included algorithms' surface relations to the task. A detailed example is presented in Bergqvist, Lithner, and Sumpter (2003), where a student, Sally, tries to solve the task "Find the largest and smallest values of the function y = 7 + 3x − x² on the interval [−1, 5]." Sally makes several strategy choices. First, she differentiates, obtaining y′ = 3 − 2x, and solves 3 − 2x = 0. Unhappy with obtaining only one solution to the task, Sally uses the calculator's minimum-function and tries, unsuccessfully, to find the function's minimum. She moves on to the calculator's table-function but uses integer steps and cannot handle the resulting numbers. Finally, Sally chooses to solve the equation 7 + 3x − x² = 0 and is pleased to obtain two different values that answer the task. Every choice Sally makes is based on the different algorithms having surface connections with the task at hand. The algorithms are not chosen completely at random, but randomly from a delimited set of possible algorithms. She does not evaluate or analyze the outcomes, and as soon as she finds that an algorithm does not work, she chooses a new one.
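For comparison, a solution anchored in the intrinsic mathematical properties of the task (a computation added here for illustration, not taken from the cited study) is short: on a closed interval the extreme values occur at a stationary point or at an endpoint, and

\[
y' = 3 - 2x = 0 \;\Rightarrow\; x = \tfrac{3}{2}, \qquad
y\bigl(\tfrac{3}{2}\bigr) = \tfrac{37}{4}, \quad y(-1) = 3, \quad y(5) = -3,
\]

so the largest value is 37/4 and the smallest is −3, whereas Sally's final equation 7 + 3x − x² = 0 answers a different question.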
3. The Swedish system in short

Sweden has approximately 9 million inhabitants. There are a little more than 40 university colleges, of which 12 are universities (which in Sweden means that the government has granted the university the right to issue degrees at the graduate level), that provide courses in mathematics. The exams in this study each completely determined the students' grades in the courses in which they were given. The grades were failed, passed, and passed with distinction. The maximum numbers of points on the exams were 24 (10 exams), 25 (3 exams), and 30 (3 exams), and the grades passed and passed with distinction usually corresponded to approximately 50% and 75% of the maximum points, respectively. Each exam consisted of 8–10 main tasks, with several of these tasks being divided into subtasks. Students in all courses had a maximum time available of approximately 6 h. This setting naturally varies between universities, courses, and teachers, but is, in fact, quite common.

Students' exam results usually completely determine if they get credit for a course or not. The students need the credits to get their academic degrees, but the credits are also important from an economic perspective. Students need the credits to be granted continued government-funded post-secondary student aid, which is the main source of income for many of them. There is also an economic aspect from the teachers' point of view. The percentage of students that pass an exam is relevant when it comes to each department's funding, and the teachers' own future may, at least in the long run, depend on the outcomes of the exams.

4. Research questions

The background can be summarized as follows. According to previous research, students' understanding of mathematics may be affected negatively by too large a focus on imitative reasoning. Still, they seem to primarily choose to


use imitative reasoning. One possible reason is that students mainly get the opportunity to learn imitative reasoning. It is therefore important to examine the reasoning that the students have to perform in order to solve exam tasks. The following research questions are posed with typical introductory calculus students in mind, i.e. students who have participated in the seminars, have studied the theory as well as the examples in the textbook, and have solved the recommended exercises. Both questions concern exams and students from introductory calculus courses at Swedish universities.

Research question 1 (RQ1):

In what ways can students solve exam tasks using imitative and creative reasoning?

That is, what types of solutions are there to exam tasks that are possible to solve with imitative reasoning, and to exam tasks that demand creative reasoning? To determine the quality of the reasoning that the students have to produce (in order to solve exam tasks), the tasks are classified and the solutions are sorted according to e.g. length and complexity. Both the classification method and the method of analysis for the task solutions are presented in the following section.

Research question 2 (RQ2):

To what extent is it possible for the students to solve exam tasks using imitative reasoning based on surface properties of the tasks?

Is it possible for a student to focus solely on imitative reasoning and still solve a large percentage of the exam tasks, perhaps even pass the exam? The classification of exam tasks will be the basis for the quantitative analysis connected to this research question.

5. Method

5.1. Collecting data

The data were collected from all introductory calculus courses offered at four different Swedish universities during the academic year of 2003/2004. The courses were mainly designed for students aiming at having mathematics as a major, not e.g. students in teacher education programs or engineering students. Umeå University was chosen to be one of the universities in the study since the author is employed there. The other three were randomly selected by drawing lots among the remaining Swedish universities. There are large as well as small universities among those that were selected, and they are spread geographically throughout the country. The results of the study may not be possible to completely generalize to all mathematics exams at Swedish universities, but it is probable that the results are fairly representative for the introductory calculus courses during the academic year in question.

At three universities the courses corresponded to five credits, which is one eighth of a Swedish academic year. These courses focused on the concept of differentiation and related topics, but the content matter varied slightly. At the fourth university, the course corresponded to ten credits, and the course contents covered not only differentiation and related topics, but also integration and related topics.

The data used in the analysis consist mainly of three different types of written information: exams, textbooks, and material handed out by the teachers. The handouts contained, in most cases, recommended exercises and a list of the theorems and proofs that the students were supposed to learn during the course. In one course the handout also contained extra exercises and a collection of mathematical formulas. A total of 16 exams, 6 regular and 10 re-tests, consisting of 212 tasks (or subtasks), were analyzed in the study.

5.2. Method of analysis

The analysis consists of two main parts: classification of the tasks using a theoretical classification tool, and sorting of task solutions within each task class. The following subsections present the classification tool, how the classification procedure is carried out, and the method of analysis of task solutions.

5.2.1. Classification of tasks

The classification tool consists mainly of a procedure designed to determine what type of reasoning is required of the students in a course to solve an exam task (Palm et al., 2005). A task that is possible to solve with imitative reasoning for some students might require creative reasoning from others, depending on how familiar the task and its solution are to


the students. The classification tool therefore consists of a systematic step-by-step procedure to determine how familiar an exam task is to the students taking the course connected to the particular exam. The procedure is presented in the following subsection, but first, the different task classes used in the classification and the validity of the classification tool are discussed.

The classification tool is based on the research framework presented in Section 2 (Lithner, in press), and distinguishes between tasks possible to solve with imitative reasoning (IR) and tasks requiring creative reasoning (CR). Since the research questions consider the extent to which the tasks are possible to solve using IR, the tasks that required CR were sorted into two different classes, depending on how much creative reasoning they required. If a task is almost completely solvable using IR and requires CR only in a local modification of the algorithm, the task is said to require local creative reasoning (LCR). The task "Let f(x) = (x² − 1)^(2/3) and draw a graph of the function f" is a possible example of an LCR task. A student who has mastered the global algorithm connected to drawing graphs, but has not previously applied the algorithm to a function that is defined at a point where the derivative is not defined, will have to use some local creative reasoning to handle the new situation. The reasoning is based on a global algorithm and requires creative reasoning only locally. If a task has no solution that is globally based on IR, and therefore demands CR all the way through, it is said to require global creative reasoning (GCR). Small parts of the solution to a GCR task could still be based on local IR. An example of a GCR task could be: "Give an example of a function f that is right continuous, but not left continuous, at x = 3" under the circumstances described in Section 2.1.3. Since LCR tasks are possible to solve partly, or almost completely, using IR, the separation between LCR and GCR is necessary in order to capture to what extent IR is possible to use when solving the exam tasks (RQ2).
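To make the LCR example concrete (a computation added for illustration, not taken from the paper): for f(x) = (x² − 1)^(2/3),

\[
f'(x) = \frac{2}{3}\,(x^2 - 1)^{-1/3} \cdot 2x = \frac{4x}{3\,(x^2 - 1)^{1/3}},
\]

which is undefined at x = ±1 even though f(±1) = 0 is defined, so the memorized graph-sketching algorithm has to be locally modified at these two points.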
The classification tool thus separates between MR, AR, LCR, and GCR tasks, and does so by determining how familiar an exam task is to the students in the course. The familiarity of a task is connected to the students' learning experiences. Since it is not possible to determine the students' complete learning experiences – every exercise solved, every lecture attended, every piece of information offered, etc. – it is necessary to simplify the actual situation. As mentioned in Section 4, the research questions are posed with typical introductory calculus students in mind. The classification procedure is therefore mainly based on the contents of the course textbook and includes a comparison between each task (including its solution) and the textbook: the theoretical contents, the exercises recommended by the teacher, and any list of important definitions, theorems, and proofs that the teacher handed out during the course. The textbook is used as the basis for the analysis since Swedish students – both at upper secondary school and at the university – seem to spend a large part of the time they study mathematics solving textbook exercises. This is indicated by several local and unpublished surveys but also by a report published by the Swedish National Agency for Education (Johansson & Emanuelsson, 1997).

The classification procedure does not take old exams into consideration, even though they often are a source of information for students. The reason is simply that it is very difficult to find out what exams were available for a particular group of students. An inclusion of the relevant previous exams in the classification procedure would result in more tasks being judged as familiar to the students. The classification results will therefore at least not exaggerate the percentage of tasks solvable through IR (RQ2).

The validity of the classification tool is supported by a study in which the reasoning requirements established by the classification tool were compared to students' actual reasoning (Boesen, Lithner, & Palm, 2005). This was done with Swedish standardized national tests at upper secondary school, and the results showed that the students' reasoning followed the reasoning requirements established by the classification tool to a high extent. In 74% of the analyzed solution attempts, the students in fact used the type of reasoning previously determined by the classification tool. In 18% of the attempts a less creative type of reasoning was used, and in 8% of the cases a type of reasoning with more CR was used. These results indicate that the classification tool, in spite of the simplifications discussed above, points at what type of reasoning the students in fact have to perform to solve the task.

The classification procedure starts with the construction of a correct and realistic solution of the task. The task and the solution are then compared to the course textbook and any occurrences of the task are identified. An occurrence is either an example or a recommended exercise with the same solution as the task, or a part of the theory text connected to the same solution as the task, that also shares surface properties with the task to an extent that makes it possible for the students to identify the correct solution. In the case where the handout also contained extra exercises and a collection


of mathematical formulas, these are also searched for occurrences. From the definition, it follows that an occurrence is a previous encounter with the solution in a setting similar to the one in the exam task. The similarity between the task and each possible occurrence is determined by the use of task variables. The task variables used in this analysis are classificatory and describe the distinguishing features, e.g. superficial properties, of the task in a systematic way (presented in Step 1 below). The task variables are almost the same as in the original classification tool, only grouped a bit differently. The variable response format was excluded since all exam tasks in this study have the same format as the exercises in the textbooks.

After the occurrences are identified it is possible to argue whether the task and a solution are familiar to the students or not. When a task is not familiar to the students and neither MR nor AR is judged to be a realistic alternative, a realistic CR solution is considered. Consequently, a task is classified as demanding CR only if the task was not possible to solve using IR and a reasonable CR solution exists. The task is then classified as requiring local creative reasoning (LCR) or global creative reasoning (GCR) depending on whether the creative reasoning is needed only locally or throughout the solution. A task was judged to be familiar to the students if:

• it asked the student to state a fact or a theory item, e.g. a definition or a proof, that the students had been informed during the course might be asked for on the exam (solvable with MR), or
• the textbook contained at least three (3) occurrences of the task, and each of these occurrences shared enough characteristics with the task to make it possible for the students to identify the applicable algorithm (solvable with AR).

Many students probably need to encounter a type of task more than once before they are able to use familiar AR (or delimiting AR) to solve such a task during an exam. That the minimum number of necessary occurrences in the classification tool was set to three is obviously a partly arbitrary choice. The choice is, however, supported by the examinations of the tool's validity presented in the previous section.

Note that, because of the definition, an LCR task can be very similar to a task solvable with AR. Two tasks with identical wording from two different exams could in fact be classified differently. In that case, the tasks are admittedly identical, but the students taking the different exams have different learning experiences since they have participated in different courses. The students' familiarity with a task depends on how many occurrences of similar tasks and solutions they have had, and these occurrences vary between textbooks. An exam task could have an AR solution for students who used one textbook, but demand an LCR solution of students who used another. Since the exam tasks were classified based on the occurrences in the particular textbook used in the course, each task was placed in only one task class.

The classification of each task was documented in a spreadsheet and followed the four steps below.

• Step 1: Analysis of the exam task. During the first step the task variables for the task are determined. Four task variables are used to describe each task: a solution, the context, explicit information about the situation, and other key features. A solution to the task is an answer or an algorithm that solves the task, e.g. the algorithm for finding the minimum value of a second-degree function.
The context of the task is the real-life situation (if there is one). The context can sometimes help the student choose the correct solution method, even though it is a surface property of the task. The context "bank deposits" may, for example, guide the students towards algorithms concerning exponential growth. Explicit information about the situation is information concerning the mathematics in the task, e.g. the function. Suppose, for example, that during a course, rational functions are only dealt with in tasks concerning limits. Then when the students encounter an exam task with a rational function, they may use delimiting algorithmic reasoning (DAR) and only choose between algorithms concerning limits when they try to solve the task. The variable other key features is used to point at key words and phrases, syntactic features, explicitly formulated hints, and any other information relevant to the comparison with the textbook occurrences. Such key features could be the word "limit," the word "minimum," and the phrase "show that." An unusual syntactic feature concerning how the task is presented, e.g. difficult grammar or a very long text, could make it more difficult for the students to identify the correct algorithm.

• Step 2: Analysis of the textbook. During this step the textbook (and, as mentioned, on one occasion also the handouts) is searched for possible occurrences of the task. The task variables from Step 1 are used to determine if a possible occurrence, e.g. an example with the same solution as the exam task, shares surface properties with the task and its solution to such an extent that it is possible for the students to identify. Two types of data were recorded. The variable


occurrences in examples and exercises is the number of occurrences of the task in examples and exercises. In case the occurrences are not identical or very similar to the task, differences and similarities are noted. Occurrences in the theory text is the number of occurrences in the theory text. These occurrences are examined in the same way as the occurrences in examples and exercises. As determined above, a task is classified as an AR task if there were at least three occurrences in the textbook that shared enough surface properties with the task.

• Step 3: Argument and conclusion. The third step contains an argument for the requirement of reasoning type, and the conclusion of the classification for the task. The argumentation was based on the information collected in the two preceding steps and concerns the number of occurrences and their similarities with the exam task.

• Step 4: Comment. As a last step, each task presented during the qualitative analysis was commented upon. The comments concern specific phenomena connected to the task, or the type of task.

5.2.2. Analysis of task solutions

The purpose of the analysis of the task solutions is to answer the first research question (RQ1): In what ways can students solve exam tasks using imitative and creative reasoning? Within each task class, the solutions of the tasks will be compared according to length, complexity, mathematical contents, and, if possible, estimated degree of difficulty. The goal is to create task subcategories related to each task class, to enable relatively short descriptions of the different types of task solutions. These subcategories are not intended to be determined according to the research framework (Lithner, in press), but rather to reflect other properties of the tasks and their solutions. The qualitative analysis enables a description of the different ways of solving tasks open to the students.

5.3. Presentation of the results

5.3.1. Qualitative results

The qualitative result consists of descriptions of (i) the tasks included in each task class and (ii) the task solutions included in the subcategories of the task classes. Every task class and most of the subcategories are exemplified with authentic tasks and the information from the classification of that task. The presentation for each task follows the outline below.

1. Analysis of the exam task
   a. A solution
   b. The context
   c. Explicit information about the situation
   d. Other key features
2. Analysis of the textbook occurrences
   a. Occurrences in examples and exercises
   b. Occurrences in the theory text
3. Argumentation and conclusion
   a. Argument for the requirement of reasoning
   b. Conclusion on reasoning demand
4. Comment

5.3.2. Quantitative results

The quantitative analysis aims at answering the second research question (RQ2): To what extent is it possible for the students to solve exam tasks using imitative reasoning based on surface properties of the tasks? In order to answer this question, the total percentages of tasks in the different task classes are first presented. The variation of the percentage of tasks from different task classes over the different exams is also examined. This is done by presenting, for each class, the largest percentage of tasks from that class in a single exam. The same is done for the smallest percentage and the mean percentage. The possibility for a student to pass an exam, or to pass with distinction, using only imitative reasoning is also examined.
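The decision logic of the classification tool described in Section 5.2.1 can be summarized schematically as follows. The sketch is added here only as an illustration of the rules stated above; the data structure and attribute names are hypothetical, and the real classification is an argued, qualitative judgement rather than a mechanical test.

from dataclasses import dataclass

@dataclass
class TaskInfo:
    # Hypothetical container for the information gathered in Steps 1-2.
    announced_theory_item: bool     # fact/definition/proof announced as possible exam content
    similar_occurrences: int        # textbook occurrences sharing enough surface properties
    one_unfamiliar_step_only: bool  # familiar global algorithm needing only a local modification

def classify_task(task: TaskInfo) -> str:
    # Memorized reasoning: the complete answer can simply be recalled.
    if task.announced_theory_item:
        return "MR"
    # Algorithmic reasoning: at least three sufficiently similar textbook occurrences.
    if task.similar_occurrences >= 3:
        return "AR"
    # Creative reasoning is required: local if only one step is unfamiliar, otherwise global.
    return "LCR" if task.one_unfamiliar_step_only else "GCR"

# Example: no announced theory item, one occurrence, one unfamiliar step -> "LCR".
print(classify_task(TaskInfo(False, 1, True)))

The three-occurrence threshold and the MR/AR/LCR/GCR labels follow the descriptions in Sections 2 and 5.2.1.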


6. Results

6.1. Qualitative results—in what ways?

In this section we focus on the first research question (RQ1): In what ways can students solve exam tasks using imitative and creative reasoning? Most of the tasks were easily classified using the procedure defined in the classification tool since they were either almost identical to the occurrences in the textbook or had no occurrences at all. The classification was discussed with another researcher well versed in the reasoning framework and the classification tool. In a few difficult cases of classification the discussion was extensive.

6.1.1. Tasks solvable using memorized reasoning

As described in Section 5.2.1, a task was classified as a memorized reasoning task, an MR task, if it asked the student to state a fact or a theory item. A condition for this classification was that the students had been informed during the course that this task could be included in the exam. When the tasks had been classified, the solutions were compared as described in Section 5.2.2. The comparison and sorting resulted in three different subcategories of MR tasks: tasks with the solutions being definitions, theorems, and proofs. The tasks in the first two subcategories were not similar in wording to any tasks from other classes. The tasks in the third subcategory, tasks requesting the students to prove a theorem, were formulated in ways similar to some tasks classified as algorithmic reasoning (AR) as well as to some global creative reasoning (GCR) tasks. It was, however, obvious from the teacher-made handout whether a task could have been anticipated or not by the students. An anticipated task is possible to solve by MR and is thus classified as such. In this study the students received such handouts during all courses except one. In the course without handouts, the students were informed orally during class what theorems and proofs were essential. Below follows an example of a task from the first subcategory of MR tasks.

Example of the classification of an MR task: Definitions

1. Analysis of the exam task: "What does it mean for a function to be differentiable at a point x₀?"
   a. A solution
      Definition: A function f is said to be differentiable at a point x₀ if the limit
      \[ f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h} \]
      exists.
   b. The context
      None
   c. Explicit information about the situation
      None
   d. Other key features
      Use of the word "differentiable"
2. Analysis of the textbook occurrences
   a. Occurrences in examples and exercises
      Five examples use the definition to calculate derivatives. In two of the exercises, the student is supposed to differentiate functions using the definition.
   b. Occurrences in the theory text
      The definition is explicitly stated in the textbook used during the course.
3. Argumentation and conclusion
   a. Argument for the requirement of reasoning
      The definition is a part of what the students were expected to learn. The act of stating, and even applying, the definition also occurs seven times in the textbook examples and/or exercises. The task is familiar to the student and can be solved by memorizing the answer.
   b. Conclusion on reasoning demand
      Memorized reasoning (MR)


4. Comment
   There are a lot of definitions in the textbook and the students were usually informed that any of them might be asked for during the exam. The definition presented in this specific example is very central to the theory of the course, and also occurs in several examples and exercises. If a student decides to use memorizing of definitions as a strategy, it is reasonable to believe that this definition would be one of the first he/she would choose to learn.

6.1.2. Tasks solvable using algorithmic reasoning

The analysis of the solutions to the broad variety of tasks in this task class resulted in four major subcategories: tasks with solutions consisting of basic algorithms, complex algorithms, choice-dependent algorithms, and proving algorithms. Each subcategory is described and exemplified below.

The most obvious subcategory is the group of tasks that can be solved by using what is here called basic algorithms. The basic algorithms are very direct, uncomplicated, and usually short algorithms. They are also often used as subalgorithms of longer solutions of other tasks. Some typical members of this subcategory are the differentiation algorithm, the algorithm for solving a second-degree equation, and the algorithms for solving differential equations of standard types. The classification of the tasks in this subcategory was mostly straightforward and unambiguous.

Example of the classification of an AR task: Basic Algorithms

1. Analysis of the exam task
   "Differentiate f(x) = (x² + x)/(x² − 1)."
   a. A solution
      Since the function is a rational function, use the quotient rule and differentiate:
      \[
      f'(x) = \frac{(2x+1)(x^2-1) - (x^2+x)\,2x}{(x^2-1)^2}
            = \frac{2x^3 - 2x + x^2 - 1 - 2x^3 - 2x^2}{(x^2-1)^2}
            = \frac{-x^2 - 2x - 1}{(x^2-1)^2}
            = -\frac{(x+1)^2}{(x^2-1)^2}
            = -\frac{1}{(x-1)^2}
      \]
   b. The context
      None
   c. Explicit information about the situation
      The function is f(x) = (x² + x)/(x² − 1).
   d. Other key features
      None
2. Analysis of the textbook occurrences
   a. Occurrences in examples and exercises
      Three examples and five exercises in the textbook apply the rule to similar functions. In at least three of those, the functions are very similar to the one in the considered task.
   b. Occurrences in the theory text
      No explicit occurrences.
3. Argumentation and conclusion
   a. Argument for the requirement of reasoning
      There are several occurrences of the algorithm in the textbook.
   b. Conclusion on reasoning demand
      Algorithmic reasoning (AR)
4. Comment
   Basic algorithms often handle calculations that are subalgorithms in more extensive tasks. An example is the more complex task of drawing the graph of a function, where differentiating the function is a subtask solved by a basic algorithm. Most tasks solvable by basic algorithms are therefore even more familiar than they seem at first, since they quite often occur as subtasks in later chapters of the textbook.

The tasks solvable with complex algorithms have solutions that are longer and more complicated than the basic algorithms. Each solution is based on a global algorithm familiar to the students, but also contains one or several


subalgorithms. The separate subalgorithms are trivial for the students, or familiar in themselves. A subalgorithm to a complex algorithm is typically a basic algorithm. Some of the tasks solvable with complex algorithms can be difficult to immediately separate from tasks solvable with local creative reasoning (LCR), as discussed in Section 5.2.1. Using the classification tool it was, however, quite obvious how to separate between the LCR tasks and the AR tasks solvable with complex algorithms: if all subalgorithms were familiar to the students, the task was AR, otherwise LCR. A typical example of a task solvable with a complex algorithm is a task asking the student to draw the graph of a function. In that specific example, one subalgorithm is the differentiation algorithm. Other examples are the algorithm for finding the equation of the tangent to a curve defined by an implicit equation and the algorithm for evaluating an integral having a rational function as integrand.

Example of the classification of an AR task: Complex Algorithms

1. Analysis of the exam task
   "Determine the domain, possible asymptotes, local and global extreme values, and draw the graph of the function f(x) = x/(x² − 1)."
   a. A solution
      Differentiate the function twice. Determine where the function and its derivatives are defined. The domain of the function is where it is defined. Determine where the function and the function's derivatives are zero. Make a chart of the values of the function and its derivatives (positive or negative, increasing or decreasing, concave up or concave down). Determine the function limits for the points where the function, or any of its derivatives, is not defined. Determine the function limits as x approaches positive and negative infinity. Determine if there are any oblique asymptotes. Draw the graph.
   b. The context
      None
   c. Explicit information about the situation
      The function is f(x) = x/(x² − 1).
   d. Other key features
      None
2. Analysis of the textbook occurrences
   a. Occurrences in examples and exercises
      The algorithm that solves this task can be used to solve four examples and 17 exercises in the textbook.
   b. Occurrences in the theory text
      The algorithm is explicitly stated on a page in the textbook, in the chapter called "Sketching the graph of a function."
3. Argumentation and conclusion
   a. Argument for the requirement of reasoning
      The global algorithm is familiar since it occurs in so many examples and exercises. Each substep is either familiar (e.g. differentiating the function) or trivial (e.g. the function has no oblique asymptotes), which results in the whole algorithm being familiar, even though it is complex.
   b. Conclusion on reasoning demand
      Algorithmic reasoning (AR)
4. Comment
   Since this task consists of several different assignments that are all stated separately, the algorithm in this specific case need not be totally global in the sense that the student must have memorized all of it in advance. Each step can be considered to be a separate algorithm, except for the drawing of the graph. This implies that the student does not need to remember the full global algorithm to solve the task, which simplifies the procedure further. For most AR tasks the global algorithm is not declared in the task and it is crucial that the students have had the opportunity to memorize it.
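For illustration, the familiar substeps of the algorithm produce, for this particular function, results such as the following (computations added here, not quoted from the exam's solution):

\[
f'(x) = \frac{(x^2 - 1) - x \cdot 2x}{(x^2 - 1)^2} = -\frac{x^2 + 1}{(x^2 - 1)^2} < 0,
\]

so f is decreasing on each interval of its domain and has no local extreme values; since f is unbounded near x = ±1, it has no global extreme values either. The lines x = −1 and x = 1 are vertical asymptotes, y = 0 is a horizontal asymptote in both directions, and there are no oblique asymptotes.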
Choice-dependent algorithms can be seen as complex algorithms, but of a specific type. They are specific in that some of the steps in the global algorithm consist of a choice concerning how to carry on. Each choice between different subalgorithms is possible to make algorithmically, based on surface properties of the tasks. The choices do not require creative reasoning to be carried out. The different subalgorithms that the students choose between are familiar to them,


as is the situation where the choice is made, but the order in which to apply the subalgorithms is not known to them in advance. It is important to note that a task that in theory is possible to solve with this type of reasoning might not be in practice. This happens when the number of subalgorithms and choices is too large. Then it is not realistic to think that the students may choose the correct subalgorithms and implement them in a reasonable amount of time. In that case, a realistic reasoning type is instead based on the intrinsic mathematical properties of the task, which rules out algorithmic reasoning. The students' possibilities to choose suitable subalgorithms based on the tasks' surface properties were considered during the analysis. The number of possible subalgorithms was also determined and taken into account. This minimizes the risk of misclassification of this kind of task.

An example of a choice-dependent algorithm is the algorithm that is used to determine limits of different algebraic expressions. To evaluate the limit it is necessary for the student to determine what kind of expression it is, what method to choose, and maybe, if the choice turned out to be the wrong one, to choose again.

Example of the classification of an AR task: Choice-dependent Algorithms

1. Analysis of the exam task
   "Determine an antiderivative of ln(1 + √(1 + x))."
   a. A solution
      Choose the method of substitution to simplify the argument of ln. The resulting expression is a product, so perform partial integration and polynomial division to simplify even more. The result is possible to integrate as it is, so integrate it. Replace the substituted variables afterwards.
   b. The context
      None
   c. Explicit information about the situation
      The function is ln(1 + √(1 + x)).
   d. Other key features
      None
2. Analysis of the textbook occurrences
   a. Occurrences in examples and exercises
      The algorithm for the method of substitution occurs in six examples and seven exercises, and among those occurrences there are at least three that consider the square root (with very similar substitutions). Partial integration occurs in three examples and three exercises. Three of these occurrences treat the natural logarithm. Two examples and two exercises use division of polynomials in the same situation as this (rational function).
   b. Occurrences in the theory text
      The only explicit algorithmic hint in the theory text is a suggestion to substitute with an appropriate function (the one used in the solution of this example is mentioned) when the integrand contains a square root.
3. Argumentation and conclusion
   a. Argument for the requirement of reasoning
      The global algorithm, including the different choices, and the subalgorithms are familiar to the student.
   b. Conclusion on reasoning demand
      Algorithmic reasoning (AR)
4. Comment
   For this task to be familiar to the students, it is important that each subalgorithm is familiar to them. In this case it means, e.g., that the student must be familiar with the method of substitution, but also with using that method when the integrand contains a square root expression. The student must have a reasonable chance to recognize the situation and apply the appropriate algorithm, without having to consider the intrinsic properties of the mathematical components. This is especially important when it comes to choice-dependent algorithms, where the choice of each subalgorithm might be crucial.
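For reference, the solution outlined in (a) above can be written out as follows; the computation is a reconstruction added for illustration and may differ in form from the textbook's. With the substitution u = √(1 + x), so that x = u² − 1 and dx = 2u du,

\[
\int \ln\bigl(1 + \sqrt{1+x}\bigr)\,dx
  = \int 2u \ln(1+u)\,du
  = u^2 \ln(1+u) - \int \frac{u^2}{1+u}\,du
  = u^2 \ln(1+u) - \frac{u^2}{2} + u - \ln(1+u) + C,
\]

and substituting back u = √(1 + x) gives x ln(1 + √(1 + x)) − (1 + x)/2 + √(1 + x) + C.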
The last of the four subcategories consists of tasks solvable with algorithms that prove a statement, here called proving algorithms. Since the tasks request that the students show or prove something, the tasks are similar in wording to some types of MR and CR tasks. The AR tasks are, however, solvable by algorithms that the student has been given several opportunities to practice during the course. A typical example is a task that asks the student to show that a given function is continuous at a specific point using the formal definition of limit. As long as the appropriate algorithm has occurred in the textbook at least three times, the task is classified as AR. This classification is correct under the condition that the task does not differ from the occurrences in any crucial way. A crucial difference could be that the solution requires the student to add a specific adjustment to the familiar algorithm or to make a particular assumption. In the specific example mentioned this could be the assumption that the given function has an upper bound in an interval (if this was not included in at least three occurrences). The similarities between tasks solvable with proving algorithms and some MR and CR tasks are discussed in Section 6.1.5.

Example of the classification of an AR task: Proving Algorithms

1. Analysis of the exam task
   "Show that d/dx √(x² + 4) = x/√(x² + 4) using the definition of the derivative."
   a. A solution: Formulate the limit that corresponds to the derivative of the given function using the definition of the derivative. Simplify if possible. In this specific case, the function contains a square root expression. When calculating the limit, it is therefore necessary to multiply both the numerator and the denominator of the fraction by the conjugate expression of the numerator. After that, it is only necessary to simplify the expression and replace h with 0, and so calculate the limit.
   b. The context: None
   c. Explicit information about the situation: The definition of the derivative is supposed to be used in the solution
   d. Other key features: None
2. Analysis of the textbook occurrences
   a. Occurrences in examples and exercises: The global algorithm, i.e. the basic steps in applying the definition of the derivative to differentiate a function, occurs in four examples and four exercises. The specific handling of the square root by multiplying with the conjugate of the numerator occurs in one of the examples and two of the exercises. It also occurs in connection to limits in one example and one exercise. These limits do not concern derivatives, but handling limits is a part of determining the derivative of a function using the definition of the derivative.
   b. Occurrences in the theory text: The definition is explicitly stated in the book, but no algorithm for using it to calculate derivatives is given.
3. Argumentation and conclusion
   a. Argument for the requirement of reasoning: The global algorithm occurs quite frequently in the book and is also very central to the theory of the course. A student has several opportunities to learn the algorithm, including the handling of the square root sign (which is a familiar basic algorithm in itself).
   b. Conclusion on reasoning demand: Algorithmic reasoning (AR)
4. Comment
   Task formulations using the terms "show" or "prove" are quite common in university courses in mathematics. The reasoning needed to solve the tasks is often memorized reasoning (as seen in the analysis of the MR tasks), algorithmic reasoning (as seen here), or global creative reasoning (as seen in the analysis of the GCR tasks).

6.1.3. Local creative reasoning

All tasks requiring local creative reasoning (LCR) were solvable with familiar global algorithms in which all steps except one were familiar to the students. It was not possible to further sort the tasks into different subcategories based on their solutions using the method of analysis described in Section 5.2.2.

Example of the classification of an LCR task


1. Analysis of the exam task
   "A 2 m long steel wire is cut into two pieces. One piece is used to form a circle, and the other is used to form a square. Determine the largest possible value of the combined area of the square and the circle."
   a. A solution: Determine a formula for the function that is to be optimized. Rearrange the formula through manipulation of the relations given in the task, so that there is only one independent variable. Simplify the formula, if possible, and differentiate it. Set the derivative equal to zero and solve the equation to find possible local extreme values. Determine the value of the function at the endpoints of the domain of the function. Choose the value that optimizes the function according to the question in the task.
   b. The context: Cutting a steel wire to make geometric figures
   c. Explicit information about the situation: None
   d. Other key features: The formulation "largest possible value"
2. Analysis of the textbook occurrences
   a. Occurrences in examples and exercises: The global algorithm occurs in five examples and four recommended exercises. Neither the context nor any similar mathematical modelling exists in any of the examples or exercises. All of them use words related to extreme values, e.g. "largest," "smallest," or "longest."
   b. Occurrences in the theory text: The algorithm is explicitly stated in the book as a "Procedure for solving extreme value problems."
3. Argumentation and conclusion
   a. Argument for the requirement of reasoning: The global algorithm is familiar to the students. It occurs more than three times in the textbook, and the key formulation "largest possible value" makes it possible for the students to identify the correct algorithm. The algorithmic steps that include algebraic manipulations (differentiating, solving an equation, and so on) are trivial or solvable with subalgorithms familiar to the students. The only steps that are not are the first two steps of modelling: formulating the function from the described situation and reformulating it to depend on one variable only. The context with the steel wire is new to the students – it does not occur in the textbook – and it is not possible to determine that the students can use AR to carry out these steps. However, the part that demands CR is, even though crucial, not the predominant part of a correct solution and is therefore judged to be local.
   b. Conclusion on reasoning demand: Local creative reasoning (LCR)
4. Comment
   Note that even though the part that demands CR is critical for the solution, since we consider a correct, and therefore complete, solution, the CR part is not dominant. This motivates classifying the reasoning as local and not global CR.

6.1.4. Global creative reasoning

The typical task solvable with global creative reasoning (GCR) is totally new to the student, but does not necessarily have a complex solution or look difficult to the teacher. There is no familiar algorithm that the students can use to solve the task, although a solution can be quite straightforward if it is founded on the intrinsic mathematical properties of the task components. The analysis of the GCR tasks' solutions resulted in three task subcategories: tasks with solutions consisting of a construction of an example; the proof of something new; and modelling. The tasks with solutions in the first subcategory all request that the students construct an example with given properties, e.g. a function with certain continuity properties. It is possible to learn how to construct particular types of examples using imitative reasoning if there is a sufficient number of occurrences in the textbook. However, no examples, theory text, or recommended exercises in any of the analyzed textbooks requested the production of examples of any kind, so all the analyzed tasks with this wording were classified as GCR tasks. An authentic example of such


a task is "Give an example of a function that is left continuous but not continuous at x = 1. Draw the graph of the function." The recommended exercises most similar to this task requested the students to determine whether a given function has some continuity property or not. The solutions in the second subcategory, the proof of something new, are very similar to the MR task solutions in the subcategory proofs and the AR task solutions in the subcategory proving algorithms. In this case, however, the statement or theorem that the students are supposed to prove is new to them and they have not had the opportunity to practice any applicable algorithm. The third subcategory consists of solutions using mathematical modelling. The solutions do not necessarily include mathematical calculations that are completely unfamiliar to the students. In fact, several, or even all, subalgorithms might be completely familiar according to the classification tool. The part of the solution that is the actual modelling is, however, unfamiliar and not trivial, and there is no familiar global algorithm applicable. Below are two examples of classifications of tasks with solutions in the subcategories the proof of something new and modelling, respectively.

Example of the classification of a GCR task: The proof of something new

1. Analysis of the exam task
   "Show that the equation x^n + px + q = 0 does not have more than three different real roots for n odd, and no more than two different real roots for n even."
   a. A solution: Since the function f(x) = x^n + px + q, on the left side of the equation, is continuous and differentiable (polynomial) on R, the Mean-Value Theorem guarantees that between two different real roots of the equation, there is a local extreme value of the function. The derivative of f, f′(x) = nx^(n−1) + p, has at most one real zero, x = (−p/n)^(1/(n−1)), when n is even. When n is odd there are at most two zeros, x = ±(−p/n)^(1/(n−1)), and no zero at all if p > 0. This implies that there can be at most two, for n even, or three, for n odd, solutions to the equation.
   b. The context: None
   c. Explicit information about the situation: The equation is x^n + px + q = 0
   d. Other key features: The formulation "Show that (. . .) does not have more than three different real roots" (my italics) implies that the problem is a problem of existence
2. Analysis of the textbook occurrences
   a. Occurrences in examples and exercises: One example in the textbook mentions that two different real roots imply at least one local extreme point in between the roots. In one of the exercises in the textbook, the students are asked to verify that an equation has at least one root in each of three different intervals.
   b. Occurrences in the theory text: The textbook notes that the Mean-Value Theorem is an existence theorem that treats the occurrence of e.g. roots and not their exact location.
3. Argumentation and conclusion
   a. Argument for the requirement of reasoning: There are several clues in the task to help the students with the solution. The phrasing in the task (see other key features above) might lead the students toward using the Mean-Value Theorem. There is an example in the textbook that constructs the main argument, and there is also one exercise, which the students probably


have solved (with or without help), that focuses on that example. There are, however, no occurrences of any global algorithm that the students can use to prove the statement, and the task is therefore not familiar. Since the statement is not a theorem to which the students may have memorized a proof, global creative reasoning is required to solve this task.
   b. Conclusion on reasoning demand: Global creative reasoning (GCR)
4. Comment
   There are probably very few tasks of this type that can be solved with LCR, unless they contain a hint, e.g. "use the definition of derivative to," that immediately guides the student towards the correct global (and familiar) algorithm.

Example of the classification of a GCR task: Modelling

1. Analysis of the exam task
   "If an island without predators has enough food for only L rabbits, it is possible to describe how the quantity of rabbits, y, changes by using the logistic equation y′ = ky(1 − y/L), k, L > 0. At what number of rabbits does the population grow at the highest speed?"
   a. A solution: Since the question concerns when something "grows at the highest speed," it is reasonable to use the second derivative with respect to time. Since the logistic equation already defines the value of the derivative, differentiate both sides of the equation once. This results in y″ = ky′(1 − (2/L)y). Setting the second derivative equal to zero, we see by solving the equation that the fastest growth occurs either when y′ = 0 or when y = L/2. If y′ = 0, then y = 0 or y = L. Checking the value of y′ for these three different numbers of rabbits – the last two coinciding with the endpoints of the domain of y – we see that y = L/2 gives us a maximum.
   b. The context: Rabbits on a small island with no predators
   c. Explicit information about the situation: The relation y′ = ky(1 − y/L), k, L > 0
   d. Other key features: The wording "grow at the highest speed"
2. Analysis of the textbook occurrences
   a. Occurrences in examples and exercises: None
   b. Occurrences in the theory text: The textbook uses the same scenario (rabbits on an island without predators) in connection to logistic growth, but no algorithm for any similar situation is presented.
3. Argumentation and conclusion
   a. Argument for the requirement of reasoning: Even though the scenario can be directly linked to logistic growth, the students taking this particular course have had no opportunity to practice any algorithm that can be used to solve this task. The equation being a differential equation does not hint directly at using the regular extreme value problem algorithm, and even if that algorithm is chosen, it is necessary to adjust it to the situation with methods based on the intrinsic properties of the mathematical components of the task.
   b. Conclusion on reasoning demand: Global creative reasoning (GCR)
4. Comment
   The different parts of this solution, differentiating, solving the equation, testing the different candidates for the solution, and so on, were in fact familiar to the students. The type of task could have been a typical AR task if the students had had the opportunity to practice a number of similar exercises, especially if the exercises all had the same context, but that was not the case during this course. The modelling that the students have to do in order to solve this task does not concern creating a mathematical scenario responding to the real world context (the scenario is already a part of what is given in the task). The act of modelling is instead carried out in the


interpretation of the question, when identifying how a solution might be connected to the presented mathematical scenario.
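
As an editorial illustration (not part of the original analysis), the computation outlined in the solution above can be written out as follows, using only the relation given in the task:

\begin{align*}
y' &= ky\Bigl(1-\frac{y}{L}\Bigr), \qquad k, L > 0, \\
y'' &= ky'\Bigl(1-\frac{2y}{L}\Bigr) = 0
  \quad\Longleftrightarrow\quad y' = 0 \ \text{(i.e. $y=0$ or $y=L$)} \quad\text{or}\quad y = \frac{L}{2}, \\
y'\Big|_{y=0} &= y'\Big|_{y=L} = 0, \qquad y'\Big|_{y=L/2} = \frac{kL}{4} > 0,
\end{align*}

so the growth rate y′ is largest when the population is y = L/2 rabbits.
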

6.1.5. Summary of the qualitative results

The first research question (RQ1) asked: In what ways can students solve exam tasks using imitative and creative reasoning? The results will be presented for each of the task classes.

The tasks that are possible to solve with memorized reasoning, the MR tasks, form a quite homogeneous class, and are easy to recognize. Each task belongs to one of three subcategories: tasks with the solutions being definitions; theorems; or proofs. By memorizing definitions, theorems, and proofs that are part of the course, the students may solve all different types of MR tasks in the exams. Most of the MR tasks are not even realistically possible for the students to solve using some kind of creative reasoning (CR). There may be exceptions for some of the easier proofs where local creative reasoning could be used to complete e.g. forgotten parts. Since the second research question considers to what extent it is possible to solve the exam tasks using imitative reasoning (IR), these tasks are all classified as MR.

The tasks possible to solve using algorithmic reasoning, the AR tasks, are more heterogeneous than the MR tasks. It is possible to distinguish four main subcategories of solutions to AR tasks. These four subcategories contain tasks with solutions consisting of basic algorithms, complex algorithms, choice-dependent algorithms, and proving algorithms. Basic algorithms are fairly straightforward and quite often short. The tasks solvable with more complicated or longer global algorithms are sorted into the subcategory of complex algorithms. It is quite common that a substep of a complex algorithm consists of a basic algorithm. The choice-dependent algorithms are similar to the complex algorithms but one or several of the steps consist of important choices between different strategies and subalgorithms. Some of the tasks solvable by quite simple algorithms start with, for example, the expression "show that." This could indicate that the students should use CR to prove a new theorem, or MR to reproduce a memorized proof. Instead, in this case, they can use AR to prove a relatively standardized statement by using an algorithm that they have encountered several times in the textbook. Tasks solvable using these algorithms are sorted into their own subcategory, proving algorithms. A student that learns to use the basic algorithms that are part of the course also has access to several of the substeps of the complex and choice-dependent algorithms. The great variety of solutions does, however, imply that plenty of time and a lot of practicing are important factors in mastering the AR tasks. This is the case especially if the intrinsic mathematical properties of the tasks are basically ignored and the appropriate algorithm is identified using only surface properties of the task.

All tasks demanding local creative reasoning, the LCR tasks, turned out to have solutions based on familiar global algorithms that the students had to adjust locally. By learning to apply the algorithms presented in the examples and in the teacher-recommended exercises, the students may solve not only the AR tasks, but also large parts of many LCR tasks.

The analysis finally shows that the tasks demanding global creative reasoning, the GCR tasks, are of three very different types. The three subcategories are tasks with solutions consisting of a construction of an example, the proof of something new, and modelling.
Among the analyzed GCR tasks, the tasks with the shortest and least complex solutions were the tasks requiring a construction of an example. The tasks requesting the proof of something new seem to be more complex and time consuming to solve. The modelling tasks often have solutions that might easily be produced by an algorithm if that algorithm were familiar to the students. In order to solve the GCR tasks, and to complete the LCR tasks (the part that is not possible to solve using AR), the students probably need to use creative thinking processes.

An interesting result of the qualitative analysis is that tasks asking the students to prove something exist in subcategories of three different classes: MR; AR; and GCR. The MR tasks of this type are connected to memorized proofs of theorems central to the course content matter. The students are informed in advance that they might be asked to prove these theorems during the exam. The AR tasks are usually connected to important definitions, and the students are given the chance to practice specific techniques for proving some types of statements. A typical example is to prove that a given function has or does not have a certain property. The GCR tasks of this type instead request that the students construct short and uncomplicated – compared to most of the proofs in the MR task class – proofs of statements of a type that they have not practiced proving. Not all exams contain all three, or even any, of these types of tasks, but many exams do contain one or more.


Table 1
Number of tasks and percentage of tasks in different classes

Task class                        Number of tasks    Percentage of tasks (%)
Memorized reasoning                            36                         17
Algorithmic reasoning                         110                         52
Local creative reasoning                       47                         22
Global creative reasoning                      19                          9
Imitative reasoning (MR + AR)                 146                         69
Creative reasoning (LCR + GCR)                 66                         31
Total                                         212                        100
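
The derived rows and percentages in Table 1 follow directly from the per-class task counts. As a small illustrative check (added editorially, not part of the original study), the figures can be reproduced with the following Python sketch; the counts are simply transcribed from the table:

    # Reproduce the derived rows and percentages of Table 1 from the per-class counts
    # (illustrative check only; counts transcribed from the table above).
    counts = {
        "Memorized reasoning": 36,
        "Algorithmic reasoning": 110,
        "Local creative reasoning": 47,
        "Global creative reasoning": 19,
    }
    total = sum(counts.values())                                                  # 212
    imitative = counts["Memorized reasoning"] + counts["Algorithmic reasoning"]   # 146
    creative = total - imitative                                                  # 66
    rows = {**counts,
            "Imitative reasoning (MR + AR)": imitative,
            "Creative reasoning (LCR + GCR)": creative,
            "Total": total}
    for label, n in rows.items():
        print(f"{label}: {n} tasks ({round(100 * n / total)}%)")

Running the sketch reproduces the 69%/31% split between imitative and creative reasoning discussed below.
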

6.2. Quantitative results—to what extent?

Now we turn to the second research question (RQ2): To what extent is it possible to solve exam tasks using imitative reasoning based on surface properties of the tasks? In order to answer this question, the number and percentage of tasks in the different task classes are presented in Table 1. The proportions between task classes turned out to be almost the same whether the calculations were based on task points or on number of tasks. When the percentage for each task class was calculated separately for each exam, the mean values over exams were also almost identical to the numbers presented in Table 1. Similarly, when the percentage for each task class was calculated separately for each university (all tasks from that university) and the mean was calculated over the universities, the proportions varied only slightly.

The analysis shows that it is possible to solve 146 of the 212 tasks using imitative reasoning (IR). This means that about 30% of the tasks require creative reasoning (CR) of some kind while about 70% do not. Also, in a little more than two thirds of the tasks requiring some kind of CR, it is needed locally and not globally. The mean proportion of tasks requiring CR varies from 28% to 38% when we compare universities, while the variation over exams is a lot bigger. In Table 2 the mean, minimum, and maximum percentages of task points for the different exams are displayed.

Table 2
Percentage of task points for different classes

Task class                        Mean (%)    Minimum (%)    Maximum (%)
Memorized reasoning                     17              0             34
Algorithmic reasoning                   52             18             73
Local creative reasoning                22             10             43
Global creative reasoning                9              0             32
Imitative reasoning (MR + AR)           69             40             90
Creative reasoning (LCR + GCR)          31             10             60

It is apparent from Table 2 that the proportions vary over exams to a large extent. It is also problematic to compare the different exams in the study when it comes to their degree of difficulty. The number of tasks of different classes varies from exam to exam, and so does the mathematical content. Even the contents of the courses differ to some extent. It turns out, however, that all exams except one were possible to pass without the use of creative reasoning of any kind. In one fourth of the cases it was also possible to pass with distinction without using any CR (neither local nor global). It was possible to pass with distinction in all cases but one (15 of 16) without the use of global creative reasoning, i.e. using only IR and LCR.

6.2.1. Summary of the quantitative results

The quantitative analysis of the classification shows that more than 65% of the tasks are solvable by imitative reasoning (IR), and that all exams except one are possible to pass without using creative reasoning (CR). It is thus quite possible for a Swedish student taking an introductory calculus course to pass the exam using only IR based on the tasks' surface properties. The qualitative analysis shows that the task solutions demanding local CR (LCR) mainly


consist of a familiar global algorithm that has to be adjusted in one small step. The fact that the global algorithms in these tasks are familiar gives the students a reasonable chance to solve at least parts of the given tasks using IR. It is in fact possible for the students to completely (MR and AR tasks), or to a large extent (LCR tasks), solve 91% of the tasks without using CR, since only 9% of the tasks demand global CR. In the same way, by using IR and in some cases adding a small measure of LCR, it is possible for them to pass all exams except one with distinction. This result is in agreement with earlier results concerning demands on reasoning type in calculus textbooks (Lithner, 2004) and in Swedish upper secondary school tests (Palm et al., 2005).

7. Discussion

The purpose of this study was to examine the reasoning that Swedish university students in mathematics have to perform in order to solve exam tasks and pass exams. This was done by the classification of over 200 tasks from 16 introductory calculus course exams. The analysis identifies several different types of solutions of tasks from all classes, and also shows that the studied exams to a large extent consist of tasks possible for the students to solve with imitative reasoning (IR). The results also showed that almost all the exams were possible to pass using this kind of reasoning. Lithner (2004) has shown that in some cases it is possible for students to solve up to 70% of calculus textbook exercises using IR. The students usually spend a huge part of their study time working with the textbook exercises, and the author therefore argues that the design of the exercises affects the students' resources, heuristics, control, and even belief systems.4 The students probably do not spend as much time with the exam of the course as they do with the textbook exercises, but the exams play an important role in the students' practical and economic situation. As mentioned in the introduction, assessment in general affects how students study (Kane et al., 1999), and since the students' skill at solving exam tasks at university level is so important to them, the designs of the exams probably influence the students' skills and attitudes. The students could, for example, be given the impression that mathematics mainly consists of imitative methods. Another consequence of the present situation could be that students with the habit of primarily using IR still pass the exams and can continue their studies with limited mathematical resources and abilities. Even though imitative reasoning is an obvious part of mathematics, just as memorizing vocabulary is a part of learning a new language, it is not the only part. The students could be given more opportunities to learn creative reasoning and thereby become familiar with the situation of solving unfamiliar tasks. Then it would be more reasonable to demand creative reasoning to a higher extent in exams, and the students could hopefully add creative thinking processes to their mathematical repertoire.

7.1. Construction of CR tasks

One way of giving the students the opportunity to learn creative reasoning (CR) is simply to let them practice more at solving CR tasks. Some suggestions on how to construct CR tasks can be found in the qualitative analysis of the task solutions. All local creative reasoning (LCR) tasks turned out to have solutions based on familiar global algorithms that the students had to adjust locally.
This means that one way of giving the students the opportunity to practice CR could be to slightly vary the setting in tasks that are familiar to the students, and perhaps discuss the new solutions. Among the analyzed global creative reasoning (GCR) tasks, the tasks with the shortest and least complex solutions were the tasks requiring the construction of an example. These tasks are also quite easy to formulate, and may indicate whether the students have developed an understanding of a concept or not. The tasks requesting the proof of something new seem to be more complex and time consuming to solve, but might be reasonably easy to construct. A possibility is, for example, to choose minor lemmas or advanced exercises from other textbooks. The modelling tasks often have solutions that might easily be produced by an algorithm if that algorithm were familiar to the students. Again, inspiration from other textbooks can be an accessible path when constructing such tasks, especially since different algorithms might be practiced in different books.

4 The terminology is taken from Schoenfeld's framework (Schoenfeld, 1985).


8. Further research

The analysis showed how the students can use imitative reasoning (IR) to solve exam tasks. It also showed that IR is an accessible path to passing exams in 15 of 16 cases. Even though it is possible to pass most of the analyzed exams using IR, it is also obvious from the analysis that there is a lot of material to handle. The different lists of definitions, theorems, and proofs vary in length, but it is quite common that they contain 10–15 theory items, and sometimes each item is several lines long. Similarly there is, as seen in the qualitative analysis, a great number of different types of algorithms with varying components to master.

There might be many different reasons that the students choose IR instead of different kinds of creative reasoning (CR). They might make this choice because they think that IR is easier to master, because the textbook focuses on IR, or because of unspoken existing or imagined expectations from teachers or fellow students. There might of course be other reasons as well, e.g. old habit from earlier school years. Regardless of the students' reasons, the great variety of algorithms, as well as the quite often very long lists of theory items to memorize, means that using primarily IR is not necessarily the same thing as taking a shortcut, at least not in university level mathematics. Further research on the students' opinions on different types of reasoning and what they choose to practice is important and interesting.

A lot of the tasks in the studied exams have complicated solutions and are founded on quite intricate mathematical theories. However, the classification in this study shows that a lot of those tasks are still solvable by imitative reasoning (IR) that is not founded on the intrinsic mathematical properties of the tasks. There is a possibility that the teachers' opinions of what is being tested are based on a false impression of what is required to solve these IR tasks, and in that case there may be a gap between what the teachers believe is tested and what is actually tested. Examining the teachers' opinions on what knowledge and competences they aim to test may also shed light on why the exams are designed the way they are, with respect to the required reasoning.

References

Bergqvist, T., Lithner, J., & Sumpter, L. (2003). Reasoning characteristics in upper secondary school students' task solving. Department of Mathematics and Mathematical Statistics, Umeå University, Research Reports in Mathematics Education (1).

Boesen, J., Lithner, J., & Palm, T. (2005). The relation between test task requirements and the reasoning used by students. Department of Mathematics and Mathematical Statistics, Umeå University, Research Reports in Mathematics Education (4).

Brayer Ebby, C. (2005). The powers and pitfalls of algorithmic knowledge: a case study. The Journal of Mathematical Behavior, 24, 73–87.

Flemming, M., & Chambers, B. (1983). Teacher-made tests: Windows on the classroom. In W. Hathaway (Ed.), Testing in the schools: New directions for testing and measurement. San Francisco: Jossey-Bass.

Haylock, D. (1997). Recognising mathematical creativity in schoolchildren. Zentralblatt für Didaktik der Mathematik, 29(3), 68–74.

Hegarty, M., Mayer, R., & Monk, C. (1995). Comprehension of arithmetic word problems: A comparison of successful and unsuccessful problem solvers. Journal of Educational Psychology, 87(1), 18–32.

Hiebert, J. (2003). What research says about the NCTM standards. In J. Kilpatrick, G. Martin, & D. Schifter (Eds.), A research companion to principles and standards for school mathematics (pp. 5–26). Reston, VA: National Council of Teachers of Mathematics.

Hiebert, J., & Carpenter, T. (1992). Learning and teaching with understanding. In D. Grouws (Ed.), Handbook for research on mathematics teaching and learning (pp. 65–97). New York: MacMillan.

Johansson, B., & Emanuelsson, J. (1997). Evaluation in science and mathematics; elementary school teachers narrate (In Swedish: Utvärdering i naturkunskap och matematik: lärare i grundskolan berättar). The Swedish National Agency for Education. Stockholm: Liber Distribution.

Kamii, C., & Dominick, A. (1997). To teach or not to teach algorithms. Journal of Mathematical Behavior, 16, 51–61.

Kane, M., Crooks, T., & Cohen, A. (1999). Validating measures of performance. Educational Measurement: Issues and Practice, 18(2), 5–17.

Leinwand, S. (1994). It's time to abandon computational algorithms. Education Week, 9, 36.

Lithner, J. (2000). Mathematical reasoning in task solving. Educational Studies in Mathematics, 41, 165–190.

Lithner, J. (2003). Students' mathematical reasoning in university textbook exercises. Educational Studies in Mathematics, 52, 29–55.

Lithner, J. (2004). Mathematical reasoning in calculus textbook exercises. Journal of Mathematical Behavior, 23, 405–427.

Lithner, J. (in press). A research framework for creative and imitative reasoning. Educational Studies in Mathematics.

McNeal, B. (1995). Learning not to think in a textbook-based mathematics class. Journal of Mathematical Behavior, 14, 18–32.

NCTM. (2000). Principles and standards for school mathematics. Reston, VA: National Council of Teachers of Mathematics.

Niss, M., & Jensen, T. (Eds.). (2002). Competencies and the learning of mathematics (In Danish: Kompetencer og matematiklæring; Uddannelsestyrelsens temahæfteserie nr. 18). Undervisningsministeriet.

Palm, T. (2002). The realism of mathematical school tasks—Features and consequences. Unpublished doctoral dissertation, Umeå University.

Palm, T., Boesen, J., & Lithner, J. (2005). The requirements of mathematical reasoning in upper secondary level assessments. Department of Mathematics and Mathematical Statistics, Umeå University, Research Reports in Mathematics Education (5).

Pesek, D., & Kirshner, D. (2000). Interference of instrumental instruction in subsequent relational learning. Journal for Research in Mathematics Education, 31, 524–540.


Pólya, G. (1954). Mathematics and plausible reasoning. Princeton, NJ: Princeton University Press.

Schoenfeld, A. H. (1985). Mathematical problem solving. Orlando, FL: Academic Press.

Senk, S. L., Beckmann, C. E., & Thompson, D. R. (1997). Assessment and grading in high school mathematics classrooms. Journal for Research in Mathematics Education, 28, 187–215.

Silver, E. (1997). Fostering creativity through instruction rich in mathematical problem solving and problem posing. Zentralblatt für Didaktik der Mathematik, 29(3), 75–80.

Skemp, R. (1978). Relational understanding and instrumental understanding. Arithmetic Teacher, 26(3), 9–15.

Tall, D. (1996). Functions and calculus. In A. Bishop, K. Clements, C. Keitel, J. Kilpatrick, & C. Laborde (Eds.), International handbook of mathematics education (pp. 289–325). Dordrecht: Kluwer.