Journal of Symbolic Computation 61–62 (2014) 100–115
A tool for evaluating solution economy of algebraic transformations ✩
Rein Prank
Liivi Str 2, Institute of Computer Science, University of Tartu, 50409 Tartu, Estonia
Article info
Article history: Received 9 April 2013; Accepted 14 June 2013; Available online 18 October 2013
Keywords: Algebraic transformations; Disjunctive normal form; Exercise environment; Solution steps; Solution economy

Abstract
In this paper we consider student solutions to tasks on the conversion of propositional formulas to disjunctive and conjunctive normal forms. In our department, students solve such exercises using a computerized environment that requires correction of every direct mistake but does not evaluate suitability of the steps. The paper describes implementation of an additional tool for analyzing these steps. This tool compares the students’ steps with an “official algorithm” and possible simplification operations and checks for 20 deviations from the algorithm. The tool is applied to solutions from two student sessions and the paper analyzes the data on algorithmic mistakes. © 2013 Elsevier B.V. All rights reserved.
1. Introduction
In this paper we consider solutions to algebraic tasks where the student transforms an expression, step by step, to some required form. Our concrete topic will be transformation of formulas of propositional logic to normal form but in its essence the topic of the paper is elementary algebra. Many algorithmic tasks from many school algebra and university calculus topics (operations with fractions, operations with polynomials, differentiation, and integration, but also solution of equations and equation systems) follow the same solution pattern and the same problems arise. When a teacher checks a student’s written solutions to an expression transformation task, three questions are the most important:
✩
Current research is supported by Targeted Financing grant SF0180008s12 of the Estonian Ministry of Education. E-mail address:
[email protected]. URL: http://vvv.cs.ut.ee/~prank/.
0747-7171/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.jsc.2013.10.014
(1) Are the steps “correct”?
(2) Are the steps reasonable?
(3) Has the required final state been reached?

The first question is mostly meant to check whether the result of each conversion is equivalent to the previous expression. The second question should help to clarify whether the steps follow the algorithm of the actual task type (if such exists) or otherwise “bring us closer to the answer.” In case of an unfinished solution, the third question can be modified to: “What stage is reached in the solution?”

Algebraic transformations often contain many lines of symbols and each line contains many details. Checking them is very labor intensive, and computers seem to be better suited than human teachers for checking at least the most formal aspects. In addition, a computer can require correction of mistakes immediately and this forces the student to correct his/her misunderstandings of the subject before making the same mistake next time.

Progress in the creation of computerized exercise environments for algebraic transformations has been slow. There are currently many small software pieces in existence for narrow subject scopes (mostly for elementary technical exercises). Three bigger programs – MathXpert (MathXpert, no date; Beeson, 1998), APLUSIX (Aplusix, no date; Nicaud et al., 2004) and T-algebra (T-algebra, 2007; Issakova et al., 2006; Prank et al., 2007) – cover significant parts of school or university algebra and calculus. The solution environment that is the object of this article is designed for expression transformation exercises in propositional and predicate logic (Prank and Vaiksaar, 2003).

Solution step dialogs of existing environments allow us to speak about input-based and rule-based systems for algebraic conversions. This division is similar to the White Box and Black Box stages of teaching in the paper of B. Buchberger (1990). In input-based systems the result of every step is entered by the student.
For example, in APLUSIX the student can create a copy of the previous expression/equation/system and then edit it until the desired result of the step is reached. In some other systems the user marks a subexpression and enters an expression that will replace the marked part. In input-based systems algorithmic decisions and technical work are both performed by the student.

For a step in the MathXpert rule-based system, the student marks a subexpression of the existing expression and selects a conversion rule from the menu. The program performs the conversion automatically (if the rule is applicable to the selected subexpression). Using Computer Algebra System commands for performing conversion steps can also be considered as work in a rule-based system. In rule-based systems the student makes only the algorithmic decisions. The technical correctness of steps is the program’s responsibility.

The main benefit of using an input-based dialog is automation of checks required for answering Question 1 above and the possibility for quick feedback. Programs like APLUSIX or T-algebra and many others check equivalence with the previous line immediately as an expression is entered. The student should correct the errors before the next step. An input-based dialog is quite natural for initial training of new mathematical techniques such as arithmetic operations in elementary grades, for collection of similar terms in middle grades, or for differentiation of expressions in secondary school. The input-based working mode also seems to be unavoidable in testing and assessment of technical skills. The existing exercise environments that check equivalence cover rational expressions or propositional logic topics, but not trigonometry or predicate logic. To answer Question 1, a program should be able to verify equivalence of expressions that are allowed in the actual task type. In most implementations free input allows the user to choose an arbitrary length for conversion steps.
This can be beneficial when students have different skill levels and some of them are able to create longer steps. Long steps, however, make it harder to recognize the reasons for nonequivalence (to give helpful feedback on Question 1) and to computerize the checks required to answer Question 2. If a step is made in the rule-based mode then correctness is guaranteed and Question 1 becomes irrelevant. A rule-based dialog enables considerably quicker conversions to be made than direct input. It allows the user to ignore low-level details and to concentrate on learning the textbook algorithm (for example, for linear equations or systems) or on developing the learner’s own manner of solution by solving more original tasks. However, it also makes it possible for the students to find the solution comparatively quickly through trial and error (instead of thinking about how to solve the task) or
to compose (quickly) very long solutions containing many aimless steps. The checks for Question 3 do not depend on the working mode for making the steps. The requirements for the final answer are usually formulated in formal syntactical terms that can be easily verified by a program. Usually the solution window of exercise environments contains a special button for presenting the result of a finished step as the final answer. After this button is pressed the program performs the checks to answer Question 3. Computerized environments for transformations do not usually monitor whether steps of a solution are expedient or not (with respect to the actual task type). There is one commonly known algebra environment, APLUSIX, where the authors try to give some feedback about the student’s progress in solving the task. APLUSIX uses for this purpose some indicators of general properties of expressions that have an important role for many problem types (Nicaud et al., 2004): factored, expanded, reduced, sorted. The program displays (on the window status bar) the ratio of what part of the goal has already been reached and what part remains. For some task types the authors of APLUSIX have also implemented a measure of the integral property solved. The publications do not describe in detail how these ratios are calculated, but their changes give the student some feedback about the progress. It is desirable to have similar features in other programs and also to have programs that compile and express more explicit judgments about the suitability of the steps. This paper describes our attempt to construct a program that evaluates the expediency of steps in solutions to algorithmically nontrivial tasks. Our starting point is a solution environment for algebraic transformations in propositional logic. This main program guarantees that solution steps preserve equivalence with the initial formula; however, it does not evaluate whether a step is reasonable or not. 
Our project creates an additional tool for analyzing expediency of steps in recorded solutions. The tool annotates solution steps and seeks 20 types of deviations from the algorithm. It collects statistics for each solution, the entire solution file, and a group of students. We discuss our analyzer’s design, present results of an analysis of solutions received from two groups of students (162 and 43 students) and discuss how such data on mistakes can be used. To the knowledge of the author, this work is the first attempt to computerize an explicitly formulated analysis of solution step expediency and also the first tool for collection of corresponding data.

Section 2 of the paper provides a general description of the environment where our students solve the tasks. Section 3 characterizes the educational situation that caused us to launch the analyzer project. Sections 4–6 describe the implementation details for conversion rules of the main program, the normal form algorithm as we teach it to our students, and the information that is recorded in solution files. Section 7 describes the design of our additional tool for analyzing the solutions in files, which starts with a provisional list of deviations from the algorithm and describes the error types that were added after investigation of the analysis files from the first version. Section 8 analyzes the results of application of the final version to solutions of two student groups. Section 9 investigates the correlations between the number of steps and numbers of different types of mistakes. Section 10 summarizes the results of the project and discusses some further work.

2. History and general features of our formula transformation environment
Students in our department have solved most of the exercises for the third-term course, Introduction to Mathematical Logic, on computers since 1991 (Prank, 1991, 2006).
Our program package contains exercise environments for truth-table exercises, propositional and predicate formula transformation, evaluation of predicate formulas on finite models, formal proofs in propositional calculus and predicate calculus, and for Turing Machines. While working with our programs, the student enters the solution step by step. The program checks the correctness of each step and completion of the task. Our first Formula Manipulation Assistant was implemented in MS DOS in 1989–1991 by H. ViiraTamm for expression of propositional formulas using {&, ¬}, {∨, ¬} or {⊃, ¬} and disjunctive normal form (Prank and Viira, 1991). (Note that in our course we use &, ⊃ and ∼ for conjunction, implication and biconditional.) Each conversion step consisted of two substeps. At the first substep the student marked a subformula to be changed. For the second substep the program had different modes. In Input mode the student entered a subformula that replaced the marked part. In Rule mode the student selected a conversion rule from the menu and the program applied it.
Fig. 1. Solution window of the main program. Last three rules can be applied to conjunctions and disjunctions. The student has performed three steps and marked a subformula for moving the negation inside.
At the first substep the program checked that the marked part was a syntactically correct proper subformula (including checking the order of operations). At the second substep in Input mode the program checked syntactical correctness of entered subformula and equivalence issues (the entered formula should be equivalent with the marked part and enclosed in brackets if necessary). In Rule mode the program checked whether the selected rule is applicable to the marked part. In case of an error the program issued a corresponding message and required correction. The program counted errors in syntax, order of operations, equivalence, misapplications of rules, and presentation of formula as the final answer. After a few years of using the program our use of modes stabilized. Exercises on expression of formulas using given connectives were solved in Input mode and exercises on normal form in Rule mode. In the first case students should learn concrete equivalencies, and in the second case the conversion algorithm. During our first software project, our approach to the design of exercise software was iterative. We started with minimal input from the students’ side and with minimal intervention from the programs’ side. If necessary, we added details to the input and checking procedures. In case of algebraic conversions we started in 1989 with an environment where the student simply entered the next line (with a possibility to copy parts of the previous formula) and the program checked the equivalence with the previous line. We saw that with such an interface the students who misinterpreted the order of operations did not understand the reasons for messages about nonequivalence. Adding the marking substep made the conversion mechanism explicit and after that we had no reason for further changes. Algebraic formula manipulation in propositional logic was a comparatively easy part of the course of logic. 
Exercises on expression through given connectives and on normal form have textbook algorithms and only the students who had missed exercise labs or homework sometimes had difficulties with tests. In 2003 the program was rewritten in Java but we did not change the principles of the solution step dialog and did not add more support for students. We only extended the application area by task types for conjunctive normal form and for predicate logic (Prank and Vaiksaar, 2003). Fig. 1 demonstrates a disjunctive normal form exercise in Rule mode. The upper panel contains counters of direct errors, formulation of the task (‘Transform to FDNF’) and instruction (‘Mark
a subformula and apply appropriate conversion’). The counters of direct errors show one error for marking a syntactically incorrect word and two errors for selection of inapplicable rules. The right column contains counters for errors in understanding the order of operations and in giving answers. The solution window already contains five solution steps:
(1) elimination of (negated) implication (rule 12);
(2) elimination of biconditional (rule 15);
(3) removing brackets (rule 1);
(4) absorption of conjunction (rule 25);
(5) multiplication (distributive law 21).
For the next step, the student has marked the disjunction in brackets.

3. Problems with first-term students
The current research was initiated when we used our formula transformation environment with another category of students and experienced serious difficulties. Two changes occurred after 2003: the number of students admitted to computer science increased year by year and the knowledge and skill level of weaker students is now lower than before; and secondly, some years ago we started teaching the introductory part of propositional logic in the first-term course, Elements of Discrete Mathematics, to allow better preparation for database and programming courses. We then saw that apart from students who solved our computerized exercises very quickly, there were others who were in real trouble. The character of their troubles was different for different types of exercises.

In the first two computer labs (truth-table exercises and expression of formulas using given connectives) the difficulties were caused mainly by the novelty of the material. The students had to execute new operations that had new “multiplication tables,” take into account their order of priority and use about 20 different new equivalencies in conversion steps. Correspondingly, the students made frequent direct errors: the wrong truth-value, wrong order of operations, or nonequivalence with the previous line. However, the solution principles of the task types were very straightforward and the error messages were understandable. With gradually increasing understanding of the necessary facts the messages disappeared and the students solved the exercises. Of course, there were also students who regularly used lecture notes and did not strive to memorize the equivalencies. The observable outcome of such behavior was that lab exercises and homework were solved but the students were unable to solve a corresponding task during the test.
Nevertheless, we did not consider such cases as signs of serious deficiencies in the work of instructors or in our exercise environments. The case of normal form exercises is more complex. For computer science students the normal form exercises are a good example of processing the data by means of an algorithm containing several stages of different character. The student should successively express implication and biconditional through negation, conjunction and disjunction, use De Morgan’s laws for pushing negations down to variables, expand the formula using distributive law, leave out false conjunctions and redundant copies of variables, etc., until obtaining the normal form. In normal form exercises it is crucial first to determine what stage of the algorithm should be executed and only after that to apply appropriate conversion. Execution of the right steps in the wrong order can increase the length of solution by a considerable degree; however, students of the computer age prefer pushing buttons to thinking: The solution files of many students appeared to be 3–4 or more times longer than necessary. Upon first reaction, we excluded the conjunctive normal form from the syllabus of the first-term course because the choice between two reciprocal uses of distributive law obviously confused weaker students. We also tried to speak more about the need to follow the algorithm, but this did not solve the whole problem. In our final test, one of the tasks was always transformation of a formula to full DNF. The formula was generated randomly and contained 3 variables and one conjunction, one
disjunction, one implication and one biconditional. In autumn term 2011 the two longest (successful!) transformations had lengths of 269 and 263 steps. Many transformations were between 50 and 80 steps, instead of the normal 15–25 steps at most.

If we want to assess students’ learning of an algorithm properly then we must be able to understand individual solutions and their conformity to the algorithm. Grading solutions to the test proved that we are not able to analyze such volumes of computer-created material by hand. If we want to improve instruction or our exercise environment then we must be able to find out what the students’ most typical algorithmic faults are and what causes such remarkable lengthening of solutions. To achieve this, the author of this paper decided to program an additional software tool. Before description of the tool, we will describe the necessary properties of the main program and also the object of teaching – the normal form algorithm.

4. Some implementation details of conversion rules
The current version of our formula transformation environment was written in Java by V. Vaiksaar as his bachelor thesis in 2003 (Prank and Vaiksaar, 2003). He used our earlier DOS program as a prototype, with many design principles derived from our earlier project (Prank and Viira, 1991). The student creates the solutions step by step. Each step consists of two substeps:
(1) The student marks a subformula to be changed;
(2) The student enters the formula for replacing the marked part (in Input mode) or selects from the menu a conversion rule to be applied to the marked part (in Rule mode).
For each task the working mode (Rule or Input) is fixed in the task file and the student cannot change it. We describe here only the details of the Rule mode because this paper analyzes solutions of normal form tasks that were solved in Rule mode.
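The two-substep dialog in Rule mode can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: formulas are represented as nested tuples, the marked part is given as a path of child indices rather than a mouse selection, and the names `rule_12`, `rule_13` and `apply_at` are hypothetical. It is not the representation used by the Java program.

```python
# Hypothetical formula representation (not the main program's):
# ('var', 'A'), ('not', f), ('and', f, g), ('or', f, g), ('imp', f, g), ('iff', f, g)

def rule_12(f):
    """Rule 12, left to right: not(X imp Y) -> X and not Y. None if not applicable."""
    if f[0] == 'not' and f[1][0] == 'imp':
        x, y = f[1][1], f[1][2]
        return ('and', x, ('not', y))
    return None

def rule_13(f):
    """Rule 13, left to right: X imp Y -> not X or Y. None if not applicable."""
    if f[0] == 'imp':
        return ('or', ('not', f[1]), f[2])
    return None

def apply_at(formula, path, rule):
    """Substep 1: follow `path` (child indices) to the marked subformula.
    Substep 2: apply `rule` there. Returns the new whole formula,
    or None when the rule is not applicable to the marked part."""
    if not path:
        return rule(formula)
    if formula[0] == 'var':          # a variable has no subformulas to descend into
        return None
    i = path[0]
    new_child = apply_at(formula[i], path[1:], rule)
    if new_child is None:
        return None
    return formula[:i] + (new_child,) + formula[i + 1:]

A, B, C = ('var', 'A'), ('var', 'B'), ('var', 'C')
f = ('and', ('not', ('imp', A, B)), C)      # not(A imp B) and C

step_ok = apply_at(f, (1,), rule_12)        # marked part: not(A imp B)
step_bad = apply_at(f, (1,), rule_13)       # same marking, inapplicable rule
```

In this sketch a `None` result plays the role of the "misapplication of rule" message: the marked part is a negation, so rule 13 is rejected, while rule 12 rewrites it to `A and not B` inside the unchanged conjunction.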
At the first substep the program checks whether the marked part itself is a syntactically correct formula and then whether it is a proper subformula of the whole formula. These two cases are considered separately because the respective mistakes have differing reasons. Marking a syntactically incorrect part of an expression is not a very serious error – usually it is a result of careless work with the mouse or careless counting of parentheses in a complex formula. In our final tests we usually assign zero penalties for syntax errors. Marking a string of symbols that is itself a formula but not a subformula of the whole expression, however, is usually a misinterpretation of the order of operations and we take such errors into account in grading. For tasks with propositional formulas the main program offers the following menu of conversion rules:
1. (X) ⇄ X
2. ¬¬X ⇄ X
3. X&Y ⇄ ¬(¬X ∨ ¬Y)
4. ¬(X&Y) ⇄ ¬X ∨ ¬Y
5. X ∨ Y ⇄ ¬(¬X&¬Y)
6. ¬(X ∨ Y) ⇄ ¬X&¬Y
7. X&Y ⇄ ¬(X ⊃ ¬Y)
8. ¬(X&Y) ⇄ X ⊃ ¬Y
9. X ∨ Y ⇄ ¬X ⊃ Y
10. ¬(X ∨ Y) ⇄ ¬(¬X ⊃ Y)
11. X ⊃ Y ⇄ ¬(X&¬Y)
12. ¬(X ⊃ Y) ⇄ X&¬Y
13. X ⊃ Y ⇄ ¬X ∨ Y
14. ¬(X ⊃ Y) ⇄ ¬(¬X ∨ Y)
15. X ∼ Y ⇄ X&Y ∨ ¬X&¬Y
16. ¬(X ∼ Y) ⇄ X&¬Y ∨ ¬X&Y
17. X ∼ Y ⇄ (X ⊃ Y)&(Y ⊃ X)
18. ¬(X ∼ Y) ⇄ ¬(X ⊃ Y) ∨ ¬(Y ⊃ X)
19. X ⇄ X&Y ∨ X&¬Y
20. X ⇄ (X ∨ Y)&(X ∨ ¬Y)
21. X&(Y ∨ Z) → X&Y ∨ X&Z
22. X ∨ Y&Z → (X ∨ Y)&(X ∨ Z)
23. ¬X&X ∨ Y → Y
24. X ∨ X&Y → X
25. (¬X ∨ X)&Y → Y
26. X&(X ∨ Y) → X
27. X ⊕ Y → Y ⊕ X
28. X ⊕ X → X
29. X ⊕ (Y ⊕ Z) → (X ⊕ Y) ⊕ Z

Rules 1–20 can be applied in both directions but rules 21–29 only from left to right. Rules 27–29 can be applied to conjunctions and disjunctions. Conversions do not work in “pure rewrite rule” style. We have tried to enable conversion steps that are similar to paper and pencil transformations; therefore the equivalencies behind rules 1–29 are implemented in quite generalized manner (which is similar to usual work with sums and products in algebra). We now describe the most important features added to avoid an excessive number of formal steps. In our text, if the number of a rule is shown as negative, it denotes that the rule is applied from right to left.

(1) Conjunctions and disjunctions are treated as connectives that can have two or more arguments. Rules 3–10, 21–22, 27 and 29 (and rules −3 to −6, −11 to −18) can be applied to conjunctions and disjunctions with an arbitrary number of members. It is also permissible for the marked part to be part of a longer conjunction or disjunction. If a rule converts a disjunction or conjunction to implication or biconditional (rules 7–10, −11 to −14, −15 to −18) and the marked conjunction or disjunction has more than two members, then the user is asked which conjunction/disjunction sign should be treated as the main connective of the marked part. Rule 1 allows canceling and adding brackets around an arbitrary part of a conjunction or disjunction.
(2) The distributive rule 21 can be applied to any conjunction where at least one member is a disjunction (having an arbitrary number of members). However, at one step the distributive law is applied only to one disjunction. If the marked part is a conjunction that contains more than one disjunction then the user is asked which disjunction should be used for expanding. Rule 22 works in a dual way.
(3) Rules 19–20 allow for adding to X one or more variables that occur in the initial formula of the actual task but do not occur in X (X can be an arbitrary formula).
(4) Rules 23–26 and 28 allow for excluding some members from a disjunction or conjunction. The marked disjunction/conjunction should have two or more members. The program asks the student to mark the members that should be excluded. At least one member should remain unmarked. For rules 23 and 25 the members marked for exclusion should contain two contrary literals (but can contain more members). For rules 24, 26 and 28 the marked members should be in certain relation with some unmarked member (for 24, conjunction of some unmarked member with some other subformula, etc.). Each rule allows exclusion of members only for the reason that is specific to this rule (only contradictory conjunctions, only conjunctions containing some unmarked member, etc.).
(5) Rule 27 allows for reordering members of arbitrary conjunction or disjunction. The user is asked to select the first, second, etc., member of the resulting formula.

To take advantage of features 1–5 the students should know that they exist; however, many students of the computer age prefer to push buttons without reading manuals or listening to lectures and instructions. Such students convert the formula (A ∨ B)&C first to C&(A ∨ B) and then apply rule 21 because the label on the button is X&(Y ∨ Z) → X&Y ∨ X&Z but not (Y ∨ Z)&X → Y&X ∨ Z&X. In historical perspective this does not mean that educational use of rule-based environments is impossible because the full content of rules cannot be written to the button. We hope that after some (dozens of?) years the choice and behavior of algebraic rules will stabilize and the buttons will have universally recognizable icons.

5. Our version of the full DNF algorithm
In our course we teach the following six-stage version of algorithm for conversion of formulas to full disjunctive normal form:
(1) Eliminate implications and biconditionals from the formula;
(2) Move negations inside until the only negations stand immediately before variables;
(3) Use distributive law to expand the conjunctions of disjunctions;
(4) Exclude contradictory conjunctions (that contain the same variable with and without negation) and redundant copies of literals;
(5) Add missing variables to conjunctions;
(6) Put the variables in conjunctions in alphabetic order and exclude redundant copies of conjunctions.

For each stage of the algorithm, the lecture also points out the equivalencies that enable one to accomplish the stage. For example, for stage 1 such equivalencies are

X ⊃ Y ≡ ¬X ∨ Y and X ∼ Y ≡ X&Y ∨ ¬X&¬Y,

and for stage 2

¬(X&Y) ≡ ¬X ∨ ¬Y, ¬(X ∨ Y) ≡ ¬X&¬Y and ¬¬X ≡ X.
On the one hand, this provides proof of feasibility of the algorithm, and on the other it prepares the students for finding the rules for execution of the stage. We also explain that the menu of our exercise environment contains more than only these main equivalencies and in some circumstances use of some parallel rules can be an advantage. For example, if the biconditional stands under negation, then we can eliminate it using rule 16 and get ¬( X ∼ Y ) ≡ X&¬Y ∨ ¬ X&Y , instead of getting ¬( X&Y ∨ ¬ X&¬Y ) by rule 15. When we present the algorithm, we underline that it is essential to execute the stages in the prescribed order. Our favorite example here is the case when the distributive law (stage 3) is applied to a subformula that stands under negation, and after moving the negation inside we are forced to apply distributive law again in the opposite direction. We also emphasize that execution of any expression transformation algorithm in mathematics assumes continuous making of intermediate simplification steps, although this is not written in an explicit way in descriptions of algorithms. 
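The remark about rule 16 versus rule 15 can be checked on a small example. The sketch below is illustrative only: the tuple representation of formulas and the function names `eliminate`, `push_negations`, `evaluate` and `equivalent` are assumptions, not part of the exercise environment. It implements stages 1 and 2 with the main equivalencies quoted above and compares formulas by a brute-force truth table.

```python
from itertools import product

# Hypothetical representation: ('var', 'A'), ('not', f), ('and', f, g),
# ('or', f, g), ('imp', f, g), ('iff', f, g)

def eliminate(f):
    """Stage 1: remove implications and biconditionals (equivalencies of rules 13 and 15)."""
    op = f[0]
    if op == 'var':
        return f
    if op == 'not':
        return ('not', eliminate(f[1]))
    x, y = eliminate(f[1]), eliminate(f[2])
    if op == 'imp':
        return ('or', ('not', x), y)
    if op == 'iff':
        return ('or', ('and', x, y), ('and', ('not', x), ('not', y)))
    return (op, x, y)

def push_negations(f):
    """Stage 2: move negations inside, outermost first (rules 4, 6 and 2).
    Assumes stage 1 has already been done, so only var/not/and/or occur."""
    op = f[0]
    if op == 'var':
        return f
    if op == 'not':
        g = f[1]
        if g[0] == 'var':
            return f
        if g[0] == 'not':
            return push_negations(g[1])
        dual = 'or' if g[0] == 'and' else 'and'
        return (dual, push_negations(('not', g[1])), push_negations(('not', g[2])))
    return (op, push_negations(f[1]), push_negations(f[2]))

def evaluate(f, env):
    """Truth value of f under the assignment env (dict: variable name -> bool)."""
    op = f[0]
    if op == 'var':
        return env[f[1]]
    if op == 'not':
        return not evaluate(f[1], env)
    a, b = evaluate(f[1], env), evaluate(f[2], env)
    if op == 'and':
        return a and b
    if op == 'or':
        return a or b
    if op == 'imp':
        return (not a) or b
    return a == b                      # 'iff'

def variables(f):
    if f[0] == 'var':
        return {f[1]}
    return set().union(*(variables(g) for g in f[1:]))

def equivalent(f, g):
    """Brute-force truth-table comparison over all variables of f and g."""
    names = sorted(variables(f) | variables(g))
    return all(
        evaluate(f, dict(zip(names, vals))) == evaluate(g, dict(zip(names, vals)))
        for vals in product([False, True], repeat=len(names))
    )

A, B = ('var', 'A'), ('var', 'B')
negated_iff = ('not', ('iff', A, B))

via_rule_15 = push_negations(eliminate(negated_iff))          # stages 1 and 2
via_rule_16 = ('or', ('and', A, ('not', B)), ('and', ('not', A), B))
```

Under these assumptions, the route through rule 15 yields (¬A ∨ ¬B)&(A ∨ B), which still has to be expanded by the distributive law in stage 3, whereas rule 16 gives a disjunction of conjunctions directly. Both results are equivalent to the original formula, so the difference lies only in the number of steps that remain.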
After demonstration of introductory examples and some independent work, we discuss more detailed aspects of execution of the algorithm that are not specified in the brief six-stage formulation:
(1) Stage 1 – Equivalencies for elimination of biconditional create two copies of the operands of this operation, and therefore it is better to eliminate the implication first (if it stands inside the biconditional);
(2) Stage 2 – If there are embedded negations then it is better to process the outermost negation first (it eliminates the negations in its argument);
(3) Stage 3 – If a disjunction that will be multiplied using the distributive law contains tautologically false members then eliminate them before multiplication;
(4) Stage 4 – It is reasonable to eliminate tautologically false conjunctions without preceding elimination of repeated literals in them;
(5) Stage 5 – Rule 19 permits adding more than one variable if necessary.
We also discuss with the students the measures taken in the implementation of rules to make the solution steps similar to working with paper and pencil (issues 1–5 in Section 4).

6. Information in solution files
In this section we describe the input of our analysis tool. The main program is the product of a bachelor thesis. Although the student in question was an extraordinarily strong programmer and a good mathematician, the time constraints of the bachelor thesis put limitations on what was implemented and what was not. When a student starts solving tasks from a task file a new solution file is generated by the main program. The fresh solution file contains data from the task file (tasks, solution modes, allowed numbers of errors) and further the program adds data about the student’s attempts to solve tasks. The allowed number of errors means that if the student makes more errors then the program does not
Fig. 2. Table of results and direct errors. The name of the file and name of the student are partially erased.
count the task as solved and the student should solve the task once more. When compiling the task file the teacher specifies for each task whether the task has a fixed initial formula or the initial formula is generated randomly (using a prescribed number of variables and logical connectives). In our course one task file contains 15–30 tasks for one instructor-guided lab session (90 minutes) and for the following homework. Solution files are usually created at the beginning of a lab session and the students append them until all required tasks are solved. The students should submit their solution files through Moodle before the next lab session. In a test situation the file should be submitted before the end of test. Our additional restrictions for tests were that the student was allowed to solve each task only once and it was prohibited to create a new solution file; however, many students did not take the first restriction into account. The analysis tool analyzes all solutions in the student file, but we manually excluded any superfluous lines from the statistics table and left only the most serious attempt for each task. The solution file contains two sets of data about solutions: (1) A table of direct errors containing one row for each task: solved or not; numbers of attempts: all attempts, successful attempts, attempts where the number of mistakes exceeded the limit, interrupted attempts; numbers of errors: syntax, order of operations, misapplications of the rule/nonequivalence, presenting an unfinished solution; date and time of last attempt and last successful attempt. (2) Records of all solution attempts. Recording of information about errors is exercise-oriented: The program completely saves the solutions for all attempts but errors are saved only in a table where one row corresponds to one task. 
If the student has solved a task successfully then this row contains data about the first successfully completed attempt; if not, the row contains information about the last attempt. The main program enables one to view the table with the task solution data and to open a new window with all solutions of a selected task. Fig. 2 presents the upper left corner of the table of results and errors. The solutions are recorded as chains of formulas without indicating the marked part and without reference to the applied rule. Recording solutions without step attributes is without doubt a great deficiency of the program. At the moment the lines were saved the program knew the markings and the applied rules; however, when looking at the solution review window, the instructor (or the analysis tool) is in the same situation as when grading a solution on paper. Fig. 3 presents a screenshot of the solution review window. To evaluate the expediency of solution steps, it is necessary first to decipher the origin of every step (conversion rule and its operands) and to compare the step with the solution algorithm and possible simplification activities. We can now imagine a group of 30 students presenting solution files
Fig. 3. Solution review window with a completed solution (11 steps) for a task with 2 variables.
with solutions for 20 tasks in a week. Some solutions are short, but the average length is over 10 steps and poor solutions can contain 70 steps. At the same time, it is especially the poor solutions with their unwise steps that call for the instructor’s comments.

7. Designing the analysis tool

The first part of the programming work was a module for deciphering the student’s steps. The deciphering module first finds the matching parts at the beginning and at the end of the initial and resulting formulas of the step. The string between them in the initial formula is taken as a first approximation of the marked part. After that the analyzer finds the minimal subformula that contains the changed part. Using this subformula as a candidate for the marked part makes it possible to determine the applied conversion rule, except in cases where the step was made by abbreviation rules 23...26 or 28. However, in our context it is not necessary to determine which of the abbreviation rules was used; it is quite natural to accept all uses of them as reasonable steps. The tool also accepts any use of rules 1 and 2 in any situation.

To compare the student’s step with the algorithm, the analysis tool also determines which operation should be applied to the initial formula of the step according to the textbook algorithm. If the conversion actually applied does not correspond to the algorithm then the analysis tool assigns the step to one of the predefined classes of mistakes. The first list of error classes was compiled as an expert opinion about student errors observed in practice, hypothetical misapplications of conversion rules, applications of steps belonging to other task types, and deviations from the algorithm. This gave 15 classes (listed in the order of the stages of the DNF algorithm):

(1) Conjunction or disjunction expressed through implication (rules 6...12 applied in the wrong direction);
(2) New biconditional added (rules 15...18 in the wrong direction);
(3) Biconditional expressed through implication instead of conjunction and disjunction (rules 17...18 used);
(4) Negation moved inside before finishing stage 1;
(5) Negation moved out of brackets (rules 3...6 in the wrong direction);
(6) Processing of negation that stands under another negation;
(7) Distributivity used too early (before the end of stage 2);
(8) Distributivity applied in the wrong direction (rule 22 instead of 21);
(9) Members reordered too early;
(10) Reordering of members of a conjunction that should be removed;
(11) Reordering in a conjunction that contains redundant members;
Fig. 4. Analyzer output regarding the solution presented in Fig. 3. Black rectangles are bounds of the changed part of the formula. Steps 2, 3 and 8 are annotated by error messages.
(12) Reordering of members of a disjunction (as for conjunctive NF);
(13) Variables added too early;
(14) Variables added to a subformula that is not a conjunction of literals;
(15) Variables added using rule 20 instead of 19 (as for conjunctive NF).
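The deciphering step described at the beginning of this section can be illustrated with a short sketch. It is written in Python rather than the analyzer's Free Pascal and is not the author's code. The formula notation is an assumption made for the example: single-letter variables, no spaces, "¬" for negation, and the binary connectives "&", "∨", "⊃" and "~" (biconditional) with an assumed precedence. The function names `changed_span`, `subformula_spans` and `minimal_subformula` are hypothetical.

```python
# Assumed notation: single-letter variables, no spaces, "¬" negation,
# binary connectives listed from weakest to strongest binding (an assumption).
BINARY = ["~", "⊃", "∨", "&"]


def changed_span(before: str, after: str) -> tuple[int, int, int]:
    """Strip the common prefix and suffix of two successive formulas.

    Returns (start, end_before, end_after), meaning that
    before[start:end_before] was replaced by after[start:end_after].
    """
    limit = min(len(before), len(after))
    p = 0
    while p < limit and before[p] == after[p]:
        p += 1
    s = 0
    # The suffix must not overlap the prefix that was already matched.
    while s < limit - p and before[-1 - s] == after[-1 - s]:
        s += 1
    return p, len(before) - s, len(after) - s


def subformula_spans(f: str) -> list[tuple[int, int]]:
    """Return the (start, end) index spans of all subformulas of f."""
    spans: list[tuple[int, int]] = []

    def parse_unary(i: int) -> int:
        start = i
        if i >= len(f):
            raise ValueError("unexpected end of formula")
        if f[i] == "¬":
            end = parse_unary(i + 1)
        elif f[i] == "(":
            end = parse_binary(0, i + 1)
            if end >= len(f) or f[end] != ")":
                raise ValueError("missing closing bracket")
            end += 1
        elif f[i].isalpha():
            end = i + 1
        else:
            raise ValueError(f"unexpected character {f[i]!r}")
        spans.append((start, end))
        return end

    def parse_binary(level: int, i: int) -> int:
        if level == len(BINARY):
            return parse_unary(i)
        start = i
        i = parse_binary(level + 1, i)
        while i < len(f) and f[i] == BINARY[level]:
            i = parse_binary(level + 1, i + 1)
            spans.append((start, i))
        return i

    if parse_binary(0, 0) != len(f):
        raise ValueError("trailing characters in formula")
    return spans


def minimal_subformula(f: str, start: int, end: int) -> str:
    """Smallest subformula of f that contains the index range [start, end)."""
    containing = [(a, b) for a, b in subformula_spans(f) if a <= start and end <= b]
    a, b = min(containing, key=lambda span: span[1] - span[0])
    return f[a:b]


before, after = "¬(A&B)∨C", "(¬A∨¬B)∨C"
start, end_before, _ = changed_span(before, after)
print(before[start:end_before])                       # first approximation of the marked part
print(minimal_subformula(before, start, end_before))  # candidate for the marked part
```

In this invented example the first approximation is the unbalanced string "¬(A&", and widening it to the minimal enclosing subformula gives "¬(A&B)", from which the applied rule (here a De Morgan step) could then be looked up.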
The current version of the analyzer is written in Free Pascal and has a very basic user interface. It runs in a text window and displays the following information about the next step after each of the user’s keystrokes:
• Number of the step, number of the stage in DNF algorithm, some further clue about the rule;
• Specification of the applied rule and OK if the step was acceptable;
• Error message in the form of a hint if the step was not acceptable;
• The initial and resulting formula where the changed part and the resulting part are highlighted.
The same information is written to a text file so that it can be read in less dynamic conditions. After each solution attempt the analyzer displays statistics about this solution and, at the end of the solution file, the summarized statistics. Fig. 4 presents the lines of the analysis file that correspond to the solution presented in Fig. 3. The analyzer also permits processing a set of solution files (for example, the files of all participants of one lab session). In addition to the individual analysis files, we also get a text file with data of the group. Each line in this file contains data about one solution attempt: the name of the file, task type, number of steps, number of steps taken back, what state was reached, overall number of mistakes, number of
Table 1
Statistics received from version 1 of the analyzer. Diagnosed inexpedient steps in the FDNF task for the final test in the autumn term 2011 (162 solutions, 7270 steps).

Deviation                                               Mistakes  Students
1. & or ∨ converted to implication                            22        11
2. New biconditional added                                     1         1
3. Biconditional expressed through implication                 8         8
4. Negation moved inside before finishing stage 1             61        36
5. Negation moved outside of brackets                        142        43
6. Inner negation processed first                            174        58
7. Distributivity used too early                             153        53
8. Distributivity applied in wrong direction                  21        12
9. Members reordered too early                               330        64
10. Members of a false conjunction reordered                  82        27
11. Reordering applied to redundant members                   68        32
12. Reordering applied to members of disjunction              30        13
13. Variables added too early                                147        43
14. Variables added to a subformula of unsuitable form         5         3
15. Variables added using rule 20 instead of 19                5         4
mistakes by types. After copying the data from this file to a spreadsheet we can classify and sort the solutions by various attributes and calculate statistics of solution lengths, numbers of mistakes and other quantities. For example, after separating finished and unfinished solutions we discovered that every test participant who completed stage 3 of the algorithm also completed the entire FDNF task.

The author was not confident that the above-described design of the analysis tool, which is quite formal and syntax-driven, would catch mistakes and produce intelligible error messages/hints. To evaluate the first version, the analyzer was used for scanning solution files from the final test for Elements of Discrete Mathematics in the autumn term of 2011. Together with other tasks (truth-table tasks in another exercise environment, some written tasks) the test contained two expression manipulation tasks: expression of a formula using connectives {&, ¬}, {∨, ¬} or {⊃, ¬} (different student groups had different sets) and conversion to full DNF. For both conversion tasks the initial formula was generated randomly and contained 3 propositional variables, one occurrence of each of the four binary connectives, and 2 negations. There were 185 test participants and 162 of them submitted a solution file. We received 132 completed and 30 partial solutions to the DNF task. The analyzer was used for checking the solutions to the DNF task.

The first version of the analyzer diagnosed a surprisingly large number of deviations from the algorithm: 1249 inexpedient steps of the 7270 steps made in the students’ solutions. However, a human analysis of the output files of the analyzer showed that virtually all annotated steps were indeed at least not the best choices for conversion. Table 1 displays the classes of diagnosed deviations together with the numbers of deviations and the numbers of students who made each mistake.
The most frequent deviation from the official algorithm was the use of a conversion rule when a previous stage of the algorithm had not yet been completed (lines 4, 7, 9, 13 of Table 1). This can happen for different reasons. In some cases the formula really contains distinct parts that can be processed in any order. It is possible to improve the diagnostics and avoid displaying error messages in (many of) these cases. The author is not sure that this is the right way to go. If we study several chains of conversions of the same student then we see that some students simply do not know the algorithm (even during the final test). They select from the menu arbitrary rules that happen to be applicable, including rules that make no sense in the DNF algorithm. A good example of this is moving the negation out of brackets. Other students seem to ignore the prescribed order because they think that the order is not important. In both cases a more finely tuned analyzer would issue messages about some deviations from the algorithm and “forgive” others. This would be didactically worse than the existing rough version of the analyzer. If we consider incorporating the analyzer (and its messages) in the main program then it seems appropriate to display the message about a deviation from the algorithm every time the deviation exists. However, the program could give the student an opportunity to ignore the message or to take the step back.
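The "rough" policy discussed above, where a step is annotated whenever an earlier stage is unfinished anywhere in the formula, can be sketched as follows. This is only an illustration in Python and not the analyzer's actual logic. It relies on what the deviation list implies (stage 1 removes implications and biconditionals, stage 2 moves negations inside, distributivity belongs to stage 3) and on an assumed notation: "⊃" and "~" for implication and biconditional, "¬" for negation, single-letter variables. The function names are hypothetical and stages after 3 are not distinguished.

```python
def first_unfinished_stage(formula: str) -> int:
    """Return the earliest stage (1, 2 or 3) that still has work to do.

    Stage 1 is unfinished while an implication or biconditional remains.
    Stage 2 is unfinished while some negation does not stand directly
    before a variable. Everything later is lumped together as stage 3.
    """
    if any(symbol in formula for symbol in "⊃~"):
        return 1
    for i, symbol in enumerate(formula):
        if symbol == "¬" and not formula[i + 1 : i + 2].isalpha():
            return 2
    return 3


def too_early(formula: str, rule_stage: int) -> bool:
    """True if a rule of the given stage is applied before earlier stages are done."""
    return rule_stage > first_unfinished_stage(formula)


# A negation is moved inside (stage 2) while an implication is still present:
print(too_early("¬(A⊃B)∨C", rule_stage=2))  # corresponds to deviation 4
# Distributivity (stage 3) is used while a negation still stands before brackets:
print(too_early("¬(A&B)∨C", rule_stage=3))  # corresponds to deviation 7
```

Because the check looks at the whole formula rather than at the marked part, it also fires when the unfinished part is independent of the part being processed, which is exactly the behaviour the paragraph above argues for keeping.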
Table 2
Diagnosed algorithmic deviations in a test of first-term students and in a session of third-term students.

Quantity/mistake                                    DME    IML   FDNF   FCNF     NF
Number of students                                  162     43     43     43     43
Tasks                                                 1     26     12     10      4
Steps                                              7270  20718   7922   7675   5121
Steps taken back                                    933   1536    537    628    371
Errors                                             1481   3239   1227   1119    893
1. & or ∨ converted to implication                   23     45     12     28      5
2. New biconditional added                            1      7      2      4      1
3. Biconditional converted to implication             8     34     26      0      8
4. Negation moved inside brackets at stage 1         56    280     77    100    103
5. Negation moved outside of brackets               157    199    110     72     17
6. Inner negation processed first                   219    204    134     41     29
7. Distributivity used too early                     84    169     92     24     53
8. Distributivity applied in wrong direction         17    245     28    112    105
9. Members reordered too early                      327    807    239    357    211
10. Members of a FALSE conj reordered                81     45     31     10      4
11. Reordering of redundant members                  68     32     18      2     12
12. Members of disj reordered in DNF/               197    258    174     51     33
    members of conj reordered in CNF
13. Variables added too early                       147    319    117    202      0
14. Variables added to arbitrary subformula           5     11      9      2      0
15. Variables added to disj in DNF/conj in CNF        5     64     22     42      0
16. Only a part of conj/disj reordered               45     91     69     15      7
17. Only one variable added instead of two            3     28     15     13      0
18. Unnecessary brackets/negations added             33     44     19     14     11
19. Addition of variables not required                0    208      0      0    208
20. Formula is already in the required form           5    149     33     30     86
Observation of the analyzer’s output and analysis files convinced us that our provisional design of the analysis is usable. We saw that the analyzer’s messages could improve the performance of the main program. Even the first version gave us an idea of the number of mistakes and an indication of more widespread mistakes. The search for mistakes and unwise steps that were not diagnosed by the first version gave us additional error categories 16–18 and 20. Line 19 is added for tasks on finding non-full disjunctive or conjunctive normal form (without the requirement of adding variables to conjunctions/disjunctions).

(16) All variables of conjunction can be sorted in one step.
(17) All missing variables can be added in one step (in stage 5).
(18) Unnecessary brackets or negations added.
(19) Addition of variables is not necessary in this task type.
(20) Formula is already in the required form.
Some programming errors were also corrected in the second version. In particular, diagnostics of deviation 12 was changed.

8. Comparison of first-term and third-term students

In this section we present Table 2 with the results of applying the second version of the analysis tool to two sets of data. The first set contains the same solutions for the test from first-term students (column DME). The second set contains files from a second-week lab session and the following homework for a third-term course, Introduction to Mathematical Logic, in the autumn term of 2011 (column IML). The task file contained 12 tasks of transformation to full disjunctive NF (column FDNF), 10 tasks for full conjunctive normal form (FCNF) and 2 + 2 tasks on disjunctive and conjunctive normal form (NF, without the requirement to add variables to conjunctions and disjunctions).
In analyzing the data of younger and older students we can draw several conclusions:
• Numbers of algorithmic faults from both categories of students are high. The analysis of inexpedient steps in solutions confirms that the alarm caused by the large size of solution files was well founded.
• Already during the exercise lab the quality of solutions from third-term students is better than the quality of solutions from first-term students after training for FDNF tasks. Third-term students made 0.16 mistakes per step and first-term students 0.20 mistakes per step. The numbers of steps taken back are in nearly the same ratio.
• Working with both DNF and CNF task types confused the third-term students and resulted in an increase of mistakes 8 and 15.
• The large numbers in the FCNF column (357 on line 9 and 202 on line 13) are also caused by reordering or adding members in conjunctions that are members of disjunctions, i.e., the students had in mind the DNF pattern. The “too early” message would be only formally correct (reordering or addition of a new variable took place before application of the distributive law). However, the real reason seems to be confusion of the CNF and DNF patterns.
• It seems that most students did not read the texts of the tasks and always solved the full DNF/CNF task type. Almost 300 errors on lines 19 and 20 would be prevented if the solution environment gave a message when the student solved the first non-full NF task.
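The mistakes-per-step figures quoted above follow directly from the Steps and Errors rows of Table 2. The short Python sketch below redoes that arithmetic. The dictionary layout and the function name `mistakes_per_step` are choices made for this illustration only.

```python
# (steps, errors) per column, copied from the Steps and Errors rows of Table 2.
TABLE2 = {
    "DME": (7270, 1481),
    "IML": (20718, 3239),
    "FDNF": (7922, 1227),
    "FCNF": (7675, 1119),
    "NF": (5121, 893),
}


def mistakes_per_step(column: str) -> float:
    """Diagnosed mistakes divided by the number of steps in one column."""
    steps, errors = TABLE2[column]
    return errors / steps


for column in TABLE2:
    print(f"{column}: {mistakes_per_step(column):.2f} mistakes per step")

# Consistency check: the IML column is the sum of its three task-type columns.
parts = ("FDNF", "FCNF", "NF")
assert sum(TABLE2[p][0] for p in parts) == TABLE2["IML"][0]
assert sum(TABLE2[p][1] for p in parts) == TABLE2["IML"][1]
```

For the two groups compared in the text this gives 1481/7270 ≈ 0.20 for the first-term students (DME) and 3239/20718 ≈ 0.16 for the third-term students (IML).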
9. Number of steps and numbers of mistakes

It is quite natural to ask what algorithmic deviations are typical for the students who compose long solutions. For this question we calculated correlation coefficients between the numbers of steps and numbers of diagnosed errors of particular error types. It is easy to guess that the number of steps in solutions stands in strong correlation with the number of steps taken back and the overall number of mistakes. These two quantities simply measure how well the student knows what he/she should do. Even so, we had no expectations about the particular deviation classes. The earlier impression was that different students could have rather variable “favorite” ways of erring.

Table 3 is based on the same data discussed above from the final test of first-term students (DME) and practical session of third-term students (IML). The first and third columns contain summarized numbers of steps and numbers of different types of mistakes. The second and fourth columns are correlation coefficients between numbers of steps made by the students and numbers of concrete type of mistake. In the case of the test, the students solved only one task where the initial formula was randomly generated. This randomness corrupts the statistics because for different initial formulas the minimal length of solution differs by about two times and the possibility of making specific mistakes depends on the initial formula (for example, the positions of the sign of biconditional and negations in the formula). Therefore, we primarily discuss the numbers in the last column here.

(1) The highest coefficient of 0.72 is associated with moving negation outside of brackets. The student is moving it directly in the opposite direction compared with the algorithm. This mistake demonstrates that the student does not understand the ideas behind the algorithm and there is a high probability of other missteps as well.
(2) Despite the hope that the “too early” messages could appear without a real mistake (Section 7), we see a coefficient of 0.65 for early reordering and 0.57 for early adding of variables. The importance of these mistakes is also visible in column 2. The natural implication is that displaying the corresponding error messages should be made obligatory when we incorporate the analyzer in the main program.
(3) Application of the distributive law in the wrong direction has a coefficient of 0.58. This mistake again means poor understanding of the goal (DNF or CNF). The best way to cope with the results of this mistake is just to take the step back. Other ways are more costly (in terms of steps).
(4) There is one more mistake with a high coefficient: processing of the innermost negation before the outermost (0.54). This seems to be merely a technical detail regarding optimal usage of the algorithm; however, the statistics indicate that this is a mistake that characterizes poor transformations and should consequently be annotated. This is confirmed by the number 0.55 in the second column.

If we try to find a common denominator for issues 1–4 then it is probably the importance of understanding the mathematical side of the algorithm. It seems that concrete details of using the rules of the exercise environment (like lines 16–17 in the table) are much less important.

Table 3
Correlation between the number of steps and the numbers of error types.

Quantity/mistake                                       DME  Correl     IML  Correl
Number of students                                     162              43
Tasks                                                    1              26
Steps                                                 7270    1.00   20718    1.00
Steps taken back                                       933    0.69    1536    0.55
Errors                                                1481    0.66    3239    0.78
1. & or ∨ converted to implication                      23    0.17      45    0.26
2. New biconditional added                               1    0.06       7   −0.04
3. Biconditional converted to implication                8   −0.04      34       0
4. Negation brought into brackets at stage 1            56    0.20     280    0.29
5. Negation moved outside of brackets                  157    0.11     199    0.72
6. Inner negation processed first                      219    0.55     204    0.54
7. Distributivity used too early                        84    0.27     169    0.30
8. Distributivity applied in the wrong direction        17    0.11     245    0.58
9. Members reordered too early                         327    0.55     807    0.65
10. Members of a FALSE conjunction reordered            81    0.05      45    0.16
11. Reordering applied to redundant members             68    0.07      32    0.21
12. Members of disj reordered in DNF/                  197    0.20     258    0.25
    members of conj reordered in CNF
13. Variables added too early                          147    0.38     319    0.57
14. Variables added to arbitrary subformula              5    0.03      11   −0.04
15. Variables added to disj in DNF/conj in CNF           5   −0.01      64    0.32
16. Only a part of conj/disj reordered                  45    0.06      91   −0.13
17. Only one variable added instead of two               3   −0.08      28    0.05
18. Unnecessary brackets or negations added             33    0.06      44    0.42
19. Addition of variables not required                   0             208    0.42
20. Formula is already in the required form              5    0.12     149    0.17

10. Conclusions and ideas for further work

This paper described a project in which we designed, implemented and tested on real data an additional tool for annotating the solutions to normal form exercises in propositional logic. We can draw the following conclusions:
• Our rather formal approach to seeking and classifying errors proved to be useful and resulted in a program that helps instructors to check students’ solutions.
• We now know that the large sizes of solution files are really caused by large numbers of duly recognizable mistakes.
• We collected quantitative data about the appearance of different types of algorithmic mistakes in student solutions. We now know which mistakes appear frequently and which mistakes have a strong correlation with the length of solutions. This helps to improve our instruction.
• We have explained that in our course the main equivalencies are taught through exercises on expression using given connectives, while the normal form algorithm is taught through exercises on normal form. Learning of equivalencies has been supported by feedback for the past 20 years. In addition, we now have a tool that composes feedback about conformity with the normal form algorithm.
• We now have a tool that helps to measure the impact of future changes in our instruction.
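The correlation figures referred to in these conclusions (Table 3) can be recomputed for new student groups from the per-student numbers in the analyzer's group data file. The following Python sketch assumes the ordinary Pearson coefficient, since the text does not name the variant that was used. The function name `pearson` and the sample numbers are invented for the illustration and are not data from the study.

```python
from math import sqrt
from typing import Optional, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> Optional[float]:
    """Pearson correlation of two equally long samples.

    Returns None when one of the samples is constant, because the
    coefficient is undefined in that case (for example, when a mistake
    type never occurs in the group).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


# Hypothetical per-student data: number of steps and number of diagnosed
# deviations of one type.
steps = [25, 40, 33, 70, 52]
deviations = [1, 3, 2, 9, 4]
print(pearson(steps, deviations))

# A mistake type that nobody made has no defined correlation with the steps.
print(pearson(steps, [0, 0, 0, 0, 0]))
```

Returning None for a constant sample is a design choice of this sketch: it keeps a row with zero occurrences visibly different from a row whose coefficient happens to be 0.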
It is clear that the numbers of mistakes, correlations and other characteristics depend on the concrete teaching and on the students. We will repeat the calculations on the data from the coming years and also examine our (unfortunately incomplete) data from the past. We hope to have the main program upgraded by the autumn term of 2013. In 2012 we still used the exercise environment as it was created in 2003; however, we made the analysis tool available to students and required that they submit a solution file with a small number of algorithmic mistakes. We hope that this is a signal for the students to improve their skills. Our project for rule-based conversions has also led to some progress towards evaluation of expediency in input-based modes. As described in Section 7 the solution files do not contain information about the applied conversion rules. The analyzer restores the rules using only the two successive formulas. What happens if the solution is composed in an input-based environment? We tested it with files containing solutions to tasks on expression of formulas via negation and one binary connective. Our students solve these tasks in the input-based mode. The most frequent inefficient step here is using a third intermediate connective because the student does not know the direct conversion rule. The result was better than expected: even without any correction of the recognition of rules, almost all steps were understood correctly. Although students have the possibility to make arbitrary steps, the real data show that practically all steps are applications of only one conversion rule from the textbook. There is one widespread exception: additional removal of double negations from the result of the step. Fortunately, adding the capability to handle this exception does not require considerable programming effort. This study dealt with expression manipulation exercises in propositional logic.
The author believes that an expediency analyzer can also be created for many task types in elementary algebra.