Computer-assisted learning and groupwork: The design of an evaluation



Computers Educ. Vol. 17, No. 1, pp. 41-47, 1991. Printed in Great Britain. All rights reserved. 0360-1315/91 $3.00 + 0.00. Copyright © 1991 Pergamon Press plc

ERICA MCATEER,1* TONY ANDERSON,1 MARGARET ORR,2 AYAL DEMISSIE1 and EVANS WOHBREM1

1Department of Psychology, University of Strathclyde, Alexander Turnbull Building, 155 George Street, Glasgow G1 1RD and 2Govan High School, 12 Ardnish Street, Glasgow G52 4NB, Scotland

(Received 13 February 1991; accepted 14 February 1991)

Abstract. This paper describes an evaluation methodology which directly addresses the issue of process. The principal focus of the study is the effect of software variables on the patterns of interaction within pairs of users. The paper therefore explores methodological issues concerning how to describe and characterise interaction, and the various design choices faced in a study involving such analyses. Several frequently-encountered dilemmas are raised, a major one being the scope of an evaluation. This dilemma arises in several forms, for example: the trading of depth (in the fine detail of a coding scheme for describing subject and machine behaviour at the microanalytic level) against breadth (in terms of the maximum numbers of subjects that can be run in a study involving such detailed analyses); the sampling of cross-sectional as opposed to longitudinal data; and the fact that particular design choices, whilst facilitating the addressing of specific research questions, inevitably constrain one’s ability to address other, equally pressing, issues.

INTRODUCTION

There are many possible issues that an evaluation could be expected to address, including:

(a) The quality of individual CAL programs as a means of delivering teaching/learning material compared to the general standard of available CAL software. Within this, issues concerning human-computer interaction are often a crucial focus of evaluation.
(b) The quality of computer based teaching compared with traditional methods. Here the evaluative locus broadens to the learning situation itself.
(c) The role of teaching software within studies of educational practice, where comparisons of teaching programs and comparisons of teaching strategies may receive equal focus within a research program.

Given the wide variety of possible purposes underlying an evaluation, there is an equally diverse range of evaluation techniques within which the emphasis shifts between task outcomes and task processes.

Whilst the computer is often seen as an ideal vehicle for individualising instruction, it transpires that in practice, much use of computers is with groups of pupils rather than individuals, due to the relative scarcity of computers in schools [1,2]. This is particularly true in our experience of certain subject departments: English language, for example, and even mathematics [3]. If one acknowledges the force of Ridgway’s argument [4] that evaluation should concern representative uses of CAL, it seems reasonable to address important educational issues relating to group performance or task collaboration within the context of computer supported learning. Crook [1] speculated that characteristic patterns of interacting with computers may serve to organise distinctive patterns of interacting around computers, voicing a need for research that “pays attention to task structures and the way in which they promote different styles of interaction”.
This research project attempts to address this question, seeking those elements of interaction that promote effective collaboration in terms of task performance, although postponing issues concerning the relationship between performance and learning. This case study will focus primarily on the methodological considerations raised in designing a study to address the issue of the effect of different types of CAL on patterns of interaction amongst their users. It is not primarily about the results obtained from the study. We would argue that, since groupwork around the computer is such a common pattern of usage, the development of an appropriate research technique for measuring and characterising the effect of software on the processes of group interaction around the computer is a vital precursor to the making of judgements about software quality that the term “evaluation” implies.

The issues which we wish to address in the study clearly cannot be approached from an analysis of task achievement measures alone; a methodology involving an analysis of the communicative interactions of the participants within the task situation, via some coded representation of those interactions within the situational context, is necessary. This method can be fraught with difficulty, in particular that of trading depth against clarity. If the interactional elements are coded in fine detail, it can be difficult to “see the wood for the trees”: establishing which of the wide variety of possible features of interaction are actually functional for task performance is very difficult. If, on the other hand, initial transcription and coding relates only to broad behavioural categories (an example within this context would be indicating which partnership strategies are in use: turntaking, specific role allocation, etc.), then valuable information is obscured. Great care is therefore necessary in the design of the measures in such a study.

Like many of the contributors to the workshop, we have a strong interest in studying computer-assisted learning in a normal context of use. In this particular case, this means studying pairs of users (pairing being a popular grouping practice for microcomputer use in schools). The pupil subjects in our study were secondary schoolchildren with learning difficulties, a group for whom computer supported teaching is thought to be particularly efficacious [5], and for whom issues relating to effects of group interaction upon task performance are crucial.

*Author for correspondence.
One major focus for the study is the comparison of the patterns of interaction occurring around two different types of software: “open”, where the means of achieving the task goal are under the subjects’ control; and “closed”, where the program requires specific initiation or response moves from subjects (that is, not only is the task goal specified, but there is only one route to its achievement, which the system controls). In addition, we also wished to compare the interactions among dyads of pupils with those of dyads comprising a teacher and a pupil. Obviously, a one-to-one partnership between pupil and teacher is impractical under normal classroom conditions. However, it is exactly what can obtain within Learning Support departments. On the other hand, a strategy of mixed ability pairings generally favoured by teachers [4] cannot easily be adopted for pupils with learning difficulties without unnecessary disruption of normal classwork for more able pupils; if peer pairing is used, the participants are likely to be at a similar (and low) level of skill. The overall design of the project therefore involves manipulating the nature of the software used and also the nature of the user partnership (teacher-pupil vs pupil-pupil).

The broad aims of the project are:

(a) To provide a fuller understanding of task performance processes taking place among groups of pupils at the microcomputer interface with a view towards advancing the general theory of constructive interaction [6] and ascertaining the efficacy of computer assisted learning in promoting this.
(b) To investigate differences in styles of interaction between teacher-pupil and pupil-pupil dyads when working with different types of software (open vs closed), and to elucidate the relative contributions of teacher guidance and of peer collaboration in facilitating successful task outcomes.
(c) To identify factors contributing towards any possible differential effects of complex games/simulations as opposed to open-ended microcomputer software with regard to pupil learning, using teaching programs from different topic domains.

METHODOLOGY

The design of this study involved teacher-pupil and pupil-pupil dyads working on open and closed (as defined above) English language and maths teaching software, using two examples of each of the four types of program.
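The factorial structure just described can be made concrete with a small enumeration. The following is a minimal sketch and not part of the original study: the names `TOPICS`, `SOFTWARE_TYPES`, `DYAD_TYPES`, `program_slots` and `design_cells` are illustrative assumptions, and programs are numbered rather than named because the text does not identify them.

```python
from itertools import product

# Factor levels as described in the text; the identifiers are illustrative.
TOPICS = ("English", "maths")
SOFTWARE_TYPES = ("open", "closed")
DYAD_TYPES = ("teacher-pupil", "pupil-pupil")
EXAMPLES_PER_TYPE = 2  # "two examples of each of the four types of program"


def program_slots():
    """List one (topic, software type, example number) slot per program."""
    return [
        (topic, software, example)
        for topic, software in product(TOPICS, SOFTWARE_TYPES)
        for example in range(1, EXAMPLES_PER_TYPE + 1)
    ]


def design_cells():
    """Cross the dyad-type factor with the topic and software-type factors."""
    return list(product(DYAD_TYPES, TOPICS, SOFTWARE_TYPES))


if __name__ == "__main__":
    for slot in program_slots():
        print(slot)
    print(len(program_slots()), "programs;", len(design_cells()), "design cells")
```

Listing the slots in this way makes it easy to check that four program types with two examples each give the eight pieces of software used in the study.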


Ten teachers and 30 pupils acted as subjects in the experiment, giving 10 teacher-pupil pairs and 10 pupil-pupil pairs. Five of the teachers were English teachers, and five maths teachers. All subjects in the study came from four schools within the Glasgow education division. Only same-gender pairings were used, with an equal number of male and female dyads in each group.

Each dyad was given a series of three tasks on each of eight pieces of software categorised along an open-closed continuum; the experiment was run in two sessions, one involving the use of four English language tasks, the other involving four maths tasks. The tasks set for subjects were devised in conjunction with the learning support teachers at the schools involved, and were designed to be within the subjects’ potential capability, while presenting three increasing levels of difficulty. For example, one of the items of software used was a word-processing system, and three tasks of increasing difficulty were devised in conjunction with a teacher. The first was to write a letter to their registration teacher, explaining a recent absence from school. The second was to write to the researcher who had supervised their practice sessions to tell her about a television programme they had both liked. The final task was to write a letter to a friend in hospital describing a recent school outing. Similarly graded tasks were devised for all items of software, both maths and English, with the aid of teachers. The closed software tasks were, of course, defined by the nature of the teaching content; the use of a set sequence of tasks with open software ensured that all subjects were attempting tasks of equivalent difficulty, and made scoring and comparing the relevant data easier.

Throughout the experimental sessions, the subjects’ interactions both with each other and with the software were videotaped. The first video record, of the subjects themselves, allowed transcription of gesture and keyboard use; the second recorded the ongoing screen events as the tasks progressed. Audio recording of dialogue was common to both tapes, to allow real time sequencing of interactional events.

Design

The independent variables in the study are topic domain and task software type as within-subjects factors, and dyad type, teacher expertise and partnership gender as between-subjects factors. The dependent variables are task progress and task achievement, measured by independent assessment of finished work on open tasks and task completion weighted by error rates on closed; software assessment by all participants via rating scales requested immediately following each session; and interaction protocols derived from the audio visual records of each experimental session.

THE DATA

Each session in the experiment lasted an hour, giving 15 min use of each program. Although this is a short period, it provides copious quantities of data when the analysis is microanalytic in nature.

One major issue is simply how to analyse the interactions among members of groups of pupils. Aside from the difficulties of interpreting what the meaning or purpose of particular utterances or actions by the participants is (a subtle and difficult problem; see [7]), the sheer complexity of the data alone is overwhelming: the most striking feature of audio-visual records of group interaction is their exceptionally rich content. This makes the very selection of variables for scoring and comparing across conditions extremely difficult: one cannot simply select whichever variables seem promising at first sight, since the large number of possible comparisons makes random differences quite likely [8]. This problem is exacerbated if the investigator wishes to code gestural behaviours in addition to the verbal dialogue among the subjects (as is the case in the present study).

Furthermore, the computer must be considered a participant in the interaction: not only because of its inherent interactivity (issuing prompts, providing various levels of feedback to the pupils’ responses, and so on), but also because the subjects occasionally interact with each other via the computer screen as well as through speech and gesture: suggestions or instructions from one participant to another can be agreed or countered through the keyboard. Also, obviously, the current state of the ongoing task is indicated by the screen, which therefore focuses the whole interactional event. For all these reasons, coding of the machine’s activities is necessary to even begin to do justice to the complexity of this learning situation.

We found, in line with previous research, e.g. [9], that participants’ speech and gestural behaviour was almost exclusively task orientated. Since this was, of course, also true of the “actions” of the teaching program, as expressed onscreen, it was readily possible to code the machine’s actions in the same terms as some of those of the human participants.

THE CODING SCHEME

The critical issue for an analysis which can account for the communicative triad of user partnership and system within each task situation is the derivation of a coding scheme which adequately expresses all the session records, under their different manipulations of partner type, software and task. In our system, the basic unit of analysis is the move, comprising anything from (for example) pressing return to get up the next piece of information from the screen, to reading out or summarising the whole of a passage of text already created by the partnership. A full list of move codes is given in Table 1. Dialogue moves can be grouped into superordinate categories, e.g. “tell” could incorporate such moves as command, suggest, explain, direct, describe etc., while “elicit” could incorporate query, ask, rhetorical question, etc. Doing so simply involves incorporating another letter and using it as the flag for frequency counting.

For gestural coding, we concentrated only on gross gestures: turning to partner, pointing at the screen, etc. There was, in any case, very little evidence of the use of facial expression or gaze: most of the time both partners faced the screen. (This is, of course, a feature of human communication within such a setting and the points at which subjects do turn to each other, allowing full use of the normal communication channels of conversation, are themselves likely to be informative.)

It is very difficult indeed to code behaviour into socio-linguistic categories from transcripts. This is particularly so where there are codings of data from a number of different modalities (subjects’ dialogue, gestures, keyboard actions, and ongoing screen events) dealt with as separate entities without the cues from context that are present in the real-time interaction of the whole event.
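The grouping of dialogue moves into superordinate categories by means of an extra flag letter can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the flag letters `T` and `E` and the function names are hypothetical, and the three-letter move codes are taken from Table 1 (with `que` standing for a query, as it appears in Table 3).

```python
from collections import Counter

# Membership follows the examples given in the text; "etc." in the source
# means these lists are not exhaustive.
SUPERORDINATE = {
    "tell": ("com", "sug", "exp", "dir", "des"),
    "elicit": ("que", "ask", "rhe"),
}
# Hypothetical flag letters; the source does not say which letters were used.
FLAGS = {"tell": "T", "elicit": "E"}


def flag_move(move_code):
    """Prefix a move code with its category flag, if it belongs to one."""
    for category, members in SUPERORDINATE.items():
        if move_code in members:
            return FLAGS[category] + move_code
    return move_code  # moves outside both categories stay unflagged


def count_by_flag(move_codes):
    """Count moves per flag letter, ignoring unflagged moves."""
    flagged = (flag_move(code) for code in move_codes)
    return Counter(item[0] for item in flagged if item[0].isupper())


if __name__ == "__main__":
    print(count_by_flag(["com", "ask", "sug", "inf", "rhe"]))
```

Because the flag is simply the first character of the flagged code, the same counting routine can serve both the broad categories and the individual moves.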
Initially the coding was achieved via the full transcription and subsequent coding of a sample of the data; these procedures were both time consuming and labour intensive, but once the coding structures were established and research staff had been trained, the intermediate transcription stage was dropped and protocols were derived directly from videotapes.

Having devised a coding system which could express interactive moves at an appropriate level of detail, we needed to describe which agents (i.e. either of the human participants or the computer) made which moves, assigning moves to agents in such a way that the channel of each unit is tagged as dialogue, screen event or gesture. We also needed to describe the sequential pattern of moves between partners: how many turns, whether the type of move differed between agents, whether the pattern changed over time within a task, etc. Finally, we had to represent the focus of each move. Most of the dialogue (as others have found in this sort of situation) was concerned almost totally with the task itself. We differentiated between “task content” units, which related to the actual exercise set, and those which related to the pragmatics of achieving the task in that particular environment (e.g. “Do you use the arrows to move it?”), as this distinction is important when making assessments in the context of computer based learning. We also wanted to distinguish non task related dialogue units which are, apparently, nothing to do with the ongoing situation, e.g. “who have you got first period after lunch?”.

Each move, therefore, is prefixed by a number indicating its focus (1 = task, 2 = pragmatics, 3 = aside), which is in turn prefixed by a letter indicating the agent of the move: L(eft) or R(ight) for dyad members (or T and P for teacher and pupil). The whole code string is headed by a letter indicating channel: V(DU), D(ialogue) or G(esture). The resulting “interaction protocols” express sequences of situational interactions within their distinct task x software x dyad type settings. Table 2 defines the structure of the code strings, and Table 3 gives an example of a segment of protocol, with translation.

Table 1. Move category codes. In our coding scheme, the move is the basic unit of analysis, and there are different types of move (dialogue moves, VDU moves, and gestural moves). The computer’s “actions” are coded in some of the same terms as the dialogue moves of the human interactants. The full list of categories of moves with their corresponding codes in our coding scheme is given below

Dialogue and VDU
com          command                 sug    suggest
exp          explain                 dir    direct
des          describe                qux*   query
ask          seek information        ans    respond with info.
inf          inform-task content     del    delete
ins          instruct                rhe    rhetorical question
cpo/cne/cnu  comment                 ann    announce action
acq          acquiesce               agr    agree
cnf          confirm                 cou    counter
ctr          contradict              cox*   correct
sum          summarise               rpt    repeat (own act)
ech          echo (other’s act)      fdb    feedback
red          read aloud              pro    prompt
cha          change content          erx*   error
rxx*         respond                 ass    assess
rev          revise                  mod    model

Gesture
ps           point at screen         pk     point at keyboard
sk           seek keyboard control   to     turn to other

*x = type

ANALYSIS

Having established the coding system and applied it to the data in order to derive a set of protocols expressing the interactions between partners and microcomputer for each of the sessions, the issue becomes one of counting the events and making comparisons in terms of the experimental manipulations. We have developed a simple event frequency counter which enables us to set up a query file (a list of the questions we want to ask); the event analyser can then be used to count the relevant data in all protocols (or any listed subset, if desired). The system is surprisingly flexible, given its simplicity.
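An event frequency counter of the kind described above might look something like the sketch below. It is not the authors' program: the use of regular expressions as queries, the function name `count_events`, the actor letters `S` (system) and `E` (experimenter), and both sample protocols are assumptions made for illustration. Each code unit is taken to be a channel letter, an actor letter, an optional focus digit and a lower-case move code, as the text describes.

```python
import re
from collections import Counter

# One code unit: channel letter, actor letter, optional focus digit, move code.
# The actor letters S and E are assumed here for system and experimenter.
UNIT = re.compile(r"[DVG][LRTPSE][123]?[a-z]{2,3}")


def count_events(protocols, queries, subset=None):
    """Count, per protocol, the units that fully match each named query."""
    labels = list(protocols) if subset is None else list(subset)
    compiled = {name: re.compile(pattern) for name, pattern in queries.items()}
    results = {}
    for label in labels:
        units = UNIT.findall(protocols[label])
        results[label] = Counter({
            name: sum(1 for unit in units if rx.fullmatch(unit))
            for name, rx in compiled.items()
        })
    return results


# Hypothetical protocols and queries, for illustration only.
PROTOCOLS = {
    "PP_demo": "VS1pro > DR1inf > DL1cou > DR1sug > VL1cou",
    "TP_demo": "VS1pro > DT1pro > DP1inf - GPps > DT1agr",
}
QUERIES = {
    "dialogue moves": r"D.*",
    "gesture moves": r"G.*",
    "counter moves, any channel": r".*cou",
}

RESULTS = count_events(PROTOCOLS, QUERIES)

if __name__ == "__main__":
    for label, counts in RESULTS.items():
        print(label, dict(counts))
```

Passing `subset=["TP_demo"]` restricts the count to the listed protocols, mirroring the option of analysing any listed subset of sessions.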
We can run an analysis of interaction styles between groups, by counting and comparing total moves (with subtotals by type); number of turns; moves per actor per channel; number of sequences (which in our context means number of task stages/subgoals achieved); how many actor moves within turn, and how frequent such series were for each actor within a session. We can make either a broad or a fine analysis by focussing on the initial elements of the strings only, or comparing frequencies of actual moves. By looking at interactional differences between types of dyad, under different conditions of task and topic, against task performance scores, we can ask whether there are specific strategies, peculiar to partnership type, which foster successful performance. We can also assess the effectiveness of the various types of software, within their subject headings, for promoting educationally beneficial interactions within partnerships.

The major advantages of this technique are: it provides equal representation of all communicatory elements in the task situation: the active participants, human and system, as well as the media through which they communicate: dialogue, gesture or screen. This reduces complexity both for coding procedures and for interpreting the protocols.

Table 2. Breakdown of protocol code structures

Within a particular setting (that is, task/software/dyad type distinctions), we have sequences of situational interactions including moves within turns related to task goals where the focus is on one or another aspect of the task. The units of code indicate relevant elements of these interactions, as follows.

First letter: the “channel” or “medium” of the interaction: dialogue, the speech of the participants; VDU, simply the screen (the channel within which the system and the subjects interact); gesture, the non-verbal behaviour of the subjects where this is seen to be functional to the interaction, whether overtly communicative or not.
Second letter: the “actor” in the interaction, e.g. Lefthand/Righthand partner (or Teacher/Pupil), System or Experimenter.
Number: the focus of attention of the interaction: either the task content itself (1); the pragmatics of proceeding with the task (2); or social comments or asides made in the task situation (3).
Final string (lower case letters): the codes for the linguistic expressions: three letters following channel code D or V and actor. For the nonlinguistic expressions: two letters following G and actor.

Connections between moves are expressed by relational signs:
-    sequential move within turn
>    turn change
=    simultaneous moves
()   break within sequence
/    sequence change within setting
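The code structure set out in Table 2 lends itself to a small parser. The sketch below is one possible reading, not the authors' software: the names `Move`, `parse_unit` and `describe` are hypothetical, the single letters `S` and `E` for System and Experimenter are assumptions (only `S` appears in Table 3), and the focus digit is treated as optional because the gesture code `GRto` in Table 3 carries none.

```python
import re
from typing import NamedTuple, Optional

CHANNELS = {"D": "dialogue", "V": "VDU", "G": "gesture"}
ACTORS = {
    "L": "left partner", "R": "right partner",
    "T": "teacher", "P": "pupil",
    "S": "system", "E": "experimenter",  # E is an assumed letter
}
FOCI = {1: "task content", 2: "pragmatics", 3: "aside"}

_UNIT = re.compile(r"([DVG])([LRTPSE])([123])?([a-z]{2,3})")


class Move(NamedTuple):
    channel: str
    actor: str
    focus: Optional[int]
    move: str


def parse_unit(code):
    """Split one code unit into channel, actor, focus and move code."""
    match = _UNIT.fullmatch(code)
    if match is None:
        raise ValueError(f"not a valid code unit: {code!r}")
    channel, actor, focus, move = match.groups()
    # Table 2: gesture moves use two letters, dialogue and VDU moves three.
    expected = 2 if channel == "G" else 3
    if len(move) != expected:
        raise ValueError(f"{code!r}: expected a {expected}-letter move code")
    return Move(channel, actor, int(focus) if focus else None, move)


def describe(unit):
    """Give a readable gloss of a parsed unit."""
    focus = FOCI.get(unit.focus, "focus not coded")
    return f"{CHANNELS[unit.channel]} | {ACTORS[unit.actor]} | {focus} | {unit.move}"


if __name__ == "__main__":
    for code in ("VS1pro", "GRto", "DL2que"):
        print(code, "->", describe(parse_unit(code)))
```

Keeping the parsed fields separate makes it straightforward to count by channel, by actor or by focus without re-reading the raw strings.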

Table 3. Section of coding expressing interaction between a pupil-pupil dyad with a language open-task program. See Table 2 for a full explanation of the protocol codes

PP8mic
VS1pro > GRto - DR1red = DL1red > DR1inf > DL1cou > DR1sug > VL1cou - DL1ask > DR1rep - DR1inf > DL2que > DR2acq - VR1ech - DR1que > DL1inf > VR1cor

The system asks for the next part of a story: “What did he/she look like?”
The right hand pupil turns to her partner.
They both read the prompt aloud.
The right hand pupil provides information: “tall with fair hair”
The left hand pupil counters this: “no, darkish-fair”
The right hand pupil makes a suggestion: “dirty-blonde?”
The left hand pupil types in a different adjective: “mouse coloured”
The left hand pupil asks for more information: “what else is she like?”
The right hand pupil repeats earlier information and adds to it: “tall she has hazel eyes”
The left hand pupil asks a procedural question: “do you want to do this bit?”
The right hand pupil responds then begins to type.
She queries the spelling of a word: “is it s?”
The left hand pupil responds: “zed”
The right hand pupil makes the correction.
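To show how the relational signs structure a protocol, the sketch below splits the Table 3 segment into turns and tallies moves per actor. It rests on assumptions that should be stated plainly: the segment is transcribed from the scanned table, reading the characters that appear as "l" and "Z" as the focus digits 1 and 2 and the stray "z" as a turn change; the function names are hypothetical; and the break sign "()" and the sequence sign "/" are not handled because the segment does not use them.

```python
import re
from collections import Counter

# Table 3 segment, transcribed under the reading assumptions stated above.
SEGMENT = (
    "VS1pro > GRto - DR1red = DL1red > DR1inf > DL1cou > DR1sug > "
    "VL1cou - DL1ask > DR1rep - DR1inf > DL2que > DR2acq - VR1ech - "
    "DR1que > DL1inf > VR1cor"
)


def split_turns(protocol):
    """Split on '>' (turn change), then on '-' or '=' within each turn."""
    turns = []
    for chunk in protocol.split(">"):
        moves = [m.strip() for m in re.split(r"[-=]", chunk) if m.strip()]
        if moves:
            turns.append(moves)
    return turns


def moves_per_actor(turns):
    """Tally moves by actor, taken from the second character of each unit."""
    return Counter(move[1] for turn in turns for move in turn)


TURNS = split_turns(SEGMENT)

if __name__ == "__main__":
    print(len(TURNS), "turns")
    print([len(turn) for turn in TURNS])
    print(dict(moves_per_actor(TURNS)))
```

Simultaneous moves joined by "=" are kept inside the turn in which they occur, so the joint reading of the prompt counts as part of a single turn here; a different convention would change the per-turn totals but not the per-actor ones.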

The flexibility of analysis permitted by the event frequency counting facility, despite its functional simplicity, allows various comparisons to be drawn using the data, e.g. English vs maths software, open vs closed tasks, guidance vs cooperative collaboration, male vs female pairings. The hierarchical nature of the instrument allows a broad focus, a narrow focus, or both in the analysis. Data analysis can address such overall cooperative issues as role taking (“you type, I’ll talk” or “you do this one and I’ll do the next”) or specific guidance strategies adopted by teachers (prompting, for example, or rhetorical questioning) and seek their effect upon task performance variables.

DISCUSSION

The design of the experiment represents an attempt to examine the processes of interaction around the computer, as a function of the type of dyad of subjects involved, the type of software (open or closed) and the topic area (English or maths). In addition to using custom-designed pre- and post-test measures of learning and measures of progress in the ongoing tasks, a major focus for analysis is on the set of measures which reflect the interactions that took place (both between the subjects themselves and also between the subjects and the computer), and the relationship between the latter data and the task progress scores. This set of measures will provide us with a detailed cross-sectional picture of the fine-grained interactions generated by a relatively small number of subjects using a small number of different CAL programs.

An alternative approach would have been to undertake a more longitudinal study by applying the same basic methods to the same number of dyads using perhaps only two programs on several occasions over an extended time period; this would perhaps permit a much clearer grasp of dyadic interactions around the computer in relation to learning outcomes as opposed to task progress scores (after all, it is relatively unlikely that substantial learning gains will accrue as a function of 15 min use of a program). The resource limitations inherent in research inevitably result in a depth/breadth tradeoff: the more deeply the subjects’ activities/learning are studied, the fewer the subjects that can be studied.


Lawler’s study [10] of his young daughter learning to use Logo represents one extreme of the “deep” approach: one subject is studied in intimate detail over a prolonged period, providing rich and insightful data. But inductive generalisations are notoriously difficult, and when the sample is of one subject only, the problems are severe. The other extreme on the breadth/depth dimension is to measure only small amounts of data (perhaps only pre- and post-test scores) from a much larger sample, making inductive generalisation much less of a problem, but losing much informative data, particularly as regards the processes of learning.

Our attempt to steer a middle course between these two extremes allows us to make quantitative comparisons of interaction processes for the basic manipulations of dyad type, software type and topic, but reduces finer comparisons (e.g. dyad type and open vs closed tasks within one topic domain) to very small groups. Adding a further comparison (e.g. gender, or teacher expertise) reduces any description of findings to the category of case study. Avoiding this disadvantage forces us either to ignore interesting questions that can be put to the data we hold, or to increase subject numbers, and thus transcription and protocol analysis load, to a cost level beyond the funding constraints of the research.

The project we have described seeks to combine experimental rigour with a qualitative, observational approach, using both nomothetic and idiographic methods [11] to investigate computer assisted learning within a learning support context. Several evaluation strategies are involved: pre- and post-testing, a number of experimental manipulations and a complex record of the processes of interaction within the learning environment. One of the major achievements of the project to date has been the development of the methodology for describing and analysing the complex multi-way interactions among the users and between the users and the machine.

The study itself investigates a relatively coarse-grained software variable, namely whether the software is more “open” or “closed”, and uses only a small number of example programs of each type (and even that cuts across topic areas by having two of each type related to English and the other two related to Mathematics). One potential difficulty is that the variance in the interaction data associated with the different programs may be sufficiently large as to swamp any effects between the broad software types, and thus obscure one of the effects that we wish to examine. That, however, remains to be seen. Nevertheless, the method we have developed for coding and analysing interactions around the computer could readily be applied to the study of altogether more fine-grained software variables, opening up a fascinating range of possibilities for the future.

Acknowledgement. The research described in this paper is funded by ESRC Award R000231481.

REFERENCES

1. Crook C., Computers in the classroom: defining a social context. In Computers, Cognition and Development (Edited by Rutkowska J. C. and Crook C.), pp. 35-53. Wiley, Chichester (1987).
2. Light P., Foot T., Colbourn C. and McClelland I., Collaborative interactions at the microcomputer keyboard. Educ. Psychol. 7, 13-21 (1987).
3. McAteer E., Demissie A. and Anderson A., Microcomputer use in Glasgow schools: a survey. Manuscript in preparation.
4. Ridgway J., Research needs in CAL. J. Comp. Assist. Learn. 2, 131-140 (1986).
5. Kulik J. A., Bangert R. L. and Williams G. W., Effect of computer-based teaching on secondary students. J. Educ. Psychol. 77, 668-677 (1983).
6. Miyake N., Constructive interaction and the iterative process of understanding. Cog. Sci. 10, 151-177 (1986).
7. Draper S. W. and Anderson A., The significance of dialogue in learning and observing learning. Computers Educ. 17, 93-107 (1991).
8. Collis G. M., Classes of dialogue theory for the learning process: a commentary. Computers Educ. 17, 25-27 (1991).
9. Fish M. C. and Feldman S. C., Teacher and student verbal behavior in microcomputer classes: an observational study. J. Class. Interact. 23, 15-21 (1987).
10. Lawler R. W., Computer Experience and Cognitive Development: A Child’s Learning in a Computer Culture. Wiley, Chichester (1985).
11. Kemmis S., Atkin R. and Wright E., The evaluation of student learning. In Educational Computing (Edited by Scanlon E. and O’Shea T.). Wiley, Chichester (1987).