
Studies in Educational Evaluation, Vol. 23, No. 4, pp. 349-371, 1997. © 1997 Elsevier Science Ltd. Printed in Great Britain. All rights reserved. 0191-491X/97 $17.00 + 0.00


PII: S0191-491X(97)00022-9

"BUT IS IT FAIR?" : AN EXPLORATORY STUDY OF STUDENT P E R C E P T I O N S OF THE C O N S E Q U E N T I A L VALIDITY OF A S S E S S M E N T Kay Sambell, Liz McDowell and Sally Brown University of Northumbria, Centre for Advances in Higher Education, Newcastle upon Tyne, UK

Introduction

In the "Impact of Assessment" project, we have been studying the effects of alternative ways of assessing student learning through a number of case studies of assessment in practice. Amongst the criteria used to evaluate methods of assessing student learning, validity and reliability are normally considered to be of key importance, though there are many others, such as feasibility, acceptability and intelligibility. The concept of validity has been extended in recent years (Messick, 1989, 1995) to include aspects such as the effects of assessment or testing on the teaching and learning context and the social consequences of the use of assessment information. In this article we focus on such aspects of validity by drawing on qualitative data to illuminate the impacts of assessment practices on student perceptions of learning and on their learning behaviour.

Assessment and Validity

Most detailed discussions of assessment draw in some way upon the notion of validity (Brown & Knight, 1994; Erwin, 1991; Heywood, 1989; Linn, 1989; Rowntree, 1987). At a general level the definition of validity is unproblematic: at root it is universally used to convey the primary meaning of "measuring what you set out to measure". This definition is deceptively simple. As Hammersley (1987) recognises, when we look in more detail at discussions of validity, we do not find a clear set of definitions but a confusing diversity of ideas. More than twenty years ago, Rippere (1974) observed, in the context of educational testing at university level, that it is generally safer to talk of the existence of multiple validities, rather than assuming that validity is a single, unitary quality. Furthermore, the nature of this quality is unlikely to be a matter of such common agreement, or so self-evident, that terms need not be defined.

One of the dimensions of validity which has come much more to the fore in recent years is the extent to which testing and assessment have positive or negative impacts on learning and teaching processes and learning achievements. In the North American context, the work of Frederiksen (1984) was particularly influential. He claimed that many large-scale testing programmes led to a damaging narrowing in what was taught, due to a concentration on meeting test requirements. He suggested that "An important task for educators and psychologists is to develop instruments that will better reflect the whole domain of educational goals and to find ways to use them in improving the educational process" (p. 201). He therefore indicated that testing can influence both the content and processes of learning and teaching. Frederiksen and Collins (1989) introduced the term systemic validity, which requires consideration of "evolutions in the form and content of instruction and students' learning engendered by use of the test" (p. 28). Although originating in work that criticised the effects of testing, the notion of systemic validity reminds us that the effects of testing can be diverse, including both positive and negative effects. Some authors have hoped that assessment can play a beneficial and positive role in learning, so that it would no longer be regarded as at all blameworthy to "teach to the test" (Wiggins, 1989). A currently widespread view amongst assessment specialists is that "appropriately used educational tests are potent tools which enhance the instructional process" (Dochy, Moerkerke, & Martens, 1996, p. 324). Much of the work undertaken in North American and some European contexts (Moerkerke, 1996) has focused on examining assessment which is conventional in those contexts, chiefly multiple choice tests, and then on alternatives.

Other traditions of research into student learning, particularly research deriving from the phenomenographic approach (Marton, 1981), have also emphasised the influence of assessment on learning (Marton, Hounsell, & Entwistle, 1984). Entwistle and Entwistle (1991) investigated students' experiences of unseen essay-question exams and found worrying "the way in which the examination distorted the efforts of the students to achieve personal understanding and, secondly, the limited extent to which some types of examination question actually tapped conceptual understanding" (p. 224). According to Boud (1990), in many cases "assessment tasks are set which encourage a narrow, instrumental approach to learning that emphasises the reproduction of what is presented, at the expense of critical thinking, deep understanding and independent activity" (p. 104). The findings of a number of studies have been drawn together to demonstrate that assessment which is perceived as threatening, and which provokes anxiety, may push students towards a surface approach to learning which does not lead to a development of depth in understanding (Gibbs, 1992; Ramsden, 1992).
These insights have led to the introduction of new terminology to describe the impacts of assessment, encapsulating ideas similar to those termed systemic validity by Frederiksen and Collins (1989). Biggs (1996) directs attention to what he calls the backwash effects of assessment on learning. Boud (1995) uses the term consequential validity to refer to this broader effect of assessment on learning and other educational matters.


Alternatives in Assessment

The repertoire of assessment methods in use in Higher Education has expanded considerably in recent years. Birenbaum (1996) lists the most common terms which are used under the broad notion of alternative assessment. These include: "performance assessment, authentic assessment, direct assessment, constructive assessment, incidental assessment, informal assessment, balanced assessment, curriculum-embedded assessment, curriculum-based assessment" (p. 3). These terms denote forms of assessment which differ from the conventional, such as multiple choice testing and, in the context of the UK and some other higher educational systems such as those in Australia and Hong Kong, essay-question exams and continuous assessment via essays and scientific reports.

Alternative means of assessment include new forms of examinations such as open-book exams (Feller, 1994; Krarup, Naeraa, & Olsen, 1974; Theophilides, 1996) and take-away exams (Weber, McBee, & Krebs, 1983). Other alternative forms of assessment include: projects and investigations (Allison & Benson, 1983; Hirst & Shiu, 1995; Winn, 1995); varied writing assignments (Sarig, 1996; Young & Fulwiler, 1986); oral assessment (Hammar, Forsberg, & Loftas, 1995; Hughes & Large, 1993); realistic or problem-solving tasks (Hammar et al., 1995; Segers, 1996); simulations (Smit & Van der Molen, 1996); portfolios (Birenbaum, 1996; Larsen, 1991; Valeri-Gold, Olson, & Deming, 1991); profiles (Assiter, Fenwick, & Nixon, 1992; Assiter & Shaw, 1993; Broadfoot, 1990); group assignments (Arnold, O'Conell, & Meudell, 1994; Thorley & Gregory, 1994; Winstanley, 1994); and self, peer and co-assessment (Falchikov & Boud, 1989; Oscarson, 1989; Shechtman & Godfried, 1993; Stefani, 1994). There is a growing number of texts giving overviews of a full range of alternative assessment methods (see Birenbaum & Dochy, 1996; Brown, Bull, & Pendlebury, 1997; Cross & Angelo, 1988; Hounsell, McCulloch, & Scott, 1996).

A number of explanations may be offered for this recent growth in interest and in the practice of alternative assessment. Some developments have stemmed from influential criticisms of the impacts of conventional testing, as we have already indicated. More broadly, voices from outside academia have raised questions about assessment and stimulated change in practices. This can be illustrated by the so-called graduateness debate in the UK (Higher Education Quality Council, 1995), which opened up for public discussion the kinds of qualities and abilities which university graduates ought to possess and promoted scrutiny of the effectiveness of assessment in ensuring these outcome standards. In the USA, Banta, Lund, Black, and Oblander (1996) note the increasing pressures coming from students, parents, trustees, employers, and local and national government for Higher Education to provide credible assessment evidence demonstrating what graduates know and can do.

Birenbaum (1996) provides a cogent explanation of the underlying factors at work, emphasising the requirements placed upon educational systems in current economic and social contexts to develop "... an adaptable, thinking, autonomous person, who is a self-regulated learner, capable of communicating and cooperating with others" (p. 4). She claims that this requires a move from approaches based on behaviourist theories of learning to those based on cognitive theories and, in relation to assessing learning, a parallel shift from the traditional testing culture to an assessment culture. The assessment culture, embodied in current uses of alternative assessment, favours: the integration of assessment, teaching and learning; the involvement of students as active and informed participants; assessment tasks which are authentic, meaningful and engaging; assessments which mirror realistic contexts, in contrast with the artificial time constraints and limited access to support available in conventional exams; a focus on both the process and products of learning; and moves away from single test-scores towards a descriptive assessment based on a range of abilities and outcomes.

The more widespread use of alternative assessment has generated new debates about how validity might appropriately be judged. Moss (1992) suggested that the use of performance tests would generate new criteria in relation to validity. Kane (1992) proposed a new approach to assessing validity in relation to alternative assessment, based on gathering evidence to support convincing and coherent arguments for validity. In some cases the need to consider the consequences of assessment has been re-emphasised. Snow (1993), for example, stressed the need to consider the evocation amongst learners of "motivational structures, effort investments, expectations for success, feelings of self-efficacy, or worries" (p. 46). Nitko (1989) included the effects of assessment on student motivation for learning and the provision of helpful feedback to guide learning as two of the key features of high quality assessment. Similarly, Perkins and Blythe (1994) considered that, for assessment to foster the development of understanding, there should be a sharing of the criteria for assessment and the provision of feedback, and that students should engage in reflection on their learning.

The notion of consequential validity, that is the effects of assessment on learning and teaching, applies as strongly to alternative assessment as to conventional assessment. It needs perhaps to apply even more strongly, since alternative assessment may appear, almost by definition, to have higher levels of consequential validity through its claims to incorporate meaningful, engaging and authentic tasks and to involve students actively in the assessment process. The investigation of student perspectives must play a major part in illuminating these issues. This is the focus of the research described here, which is a qualitative study of students' interpretations, perceptions and behaviours when experiencing forms of alternative assessment.

Method

The Impact of Assessment project was initiated in 1994 to investigate the impacts of alternative assessment methods on student learning, and has been particularly concerned with exploring students' perceptions of instances of assessment in practice. The project has employed case study methodology (Kenny & Grotelueschen, 1984) because it afforded as much space as possible to participants' perceptions and judgments in the description and construction of understanding, and allowed us to comprehend the complexity surrounding each particular assessment context by focussing in depth and from a holistic perspective (Simons, 1996). The data-collection period, starting in March 1994, has spanned two and a half academic years, gathering data from thirteen case studies of alternative assessment methods in practice. The project selected case studies with a view to covering as wide a range of types of assessment and subject areas as possible. Table 1 indicates the kinds of assessment which were covered.


Table 1: Description of Case Studies

| Case study | Subject | Nature of task | Assessment products | Produced by | Grading/marking by |
|---|---|---|---|---|---|
| 1 | Social Science | Statistical problems | Problem solutions | Individuals | Students, using a staff marking scheme, with sample checking by staff |
| 2 | Built Environment | Simulation of professional practice task | Report | Individuals | Students, using a staff marking scheme, with sample checking |
| 3 | Business Studies | Group case study on an organisational problem | a) Report; b) Role-play; c) Report on group processes | Groups | Staff |
| 4 | Social Sciences | Group presentation of statistical data in the form of a poster | Poster | Group | Students and staff, using criteria derived from prior discussion |
| 5 | IT/Computing | Open book exam | Essay-type exam answers | Individuals | Staff |
| 6 | IT/Business Studies | Group project to develop an IT solution to a business problem | a) Demonstration of product; b) Report | Groups | Formally by staff; informally by students and staff during a preparation programme, using criteria produced by staff |
| 7 | Languages | Research on a topic chosen from a list composed by staff | Oral presentation | Individuals | Staff and students, using criteria derived during a preparatory programme for students |
| 8 | Social Sciences | Essay and peer assessment of essays | a) Essay; b) Peer assessment marks and comments | Individuals | Staff |
| 9 | Design/History | Research project | a) Oral presentation; b) Written report | Individuals | Staff and students, using staff-produced criteria |
| 10 | Psychology | Multiple-choice test | Test answers | Individuals | Staff |
| 11 | Professional practice studies | A series of reflective tasks | a) Portfolio of written responses to tasks; b) Oral presentation | Individuals | Staff, self and peers, using criteria provided by staff |
| 12 | Engineering | Design and build group project | a) Continuous assessment of product development file; b) Product designed | Groups | Staff |
| 13 | Design | Participation in individual tutorials to discuss skill development | Student profile of skills | Individuals | No formal grading |

Whilst it is not part of the case study approach to attempt to control for the many potentially relevant contextual variables, it is worth indicating some relevant variations in the case study contexts. In some cases (1, 7, 9, 11, 13), the assessment approach was being used for the first time in that context. Otherwise, the assessment studied was a development of an approach that had been in use for some time. In case studies 3 and 6 (and also 2 and 8, with the exception of the peer assessment element), the assessment method was considered to be fairly typical for that course. In all other cases, the assessment method was more unusual and recognised as such by staff and students. Other variables, including the staff aims in implementing the assessment, how they conducted the process, how well students were informed and involved, and practical and organisational issues, were investigated as part of the case study approach.

Pilot Work

The final research design was devised after four months of pilot work in which three cases were identified and a series of semi-structured interview schedules and observation schedules were trialled. This phase was designed to allow the relevant themes and issues to emerge from the outset of the investigation, being followed up through designing, interviewing and transcribing, to ensure that ensuing interviews would be conducted so that their meaning could be analysed in a coherent way (Kvale, 1996).

Procedure and Instruments

The methods for collecting data in the case studies included interviewing of both staff and students, observation, and examination of documentary evidence, but the emphasis was on semi-structured interviews with students. Students were all told that participation was confidential, and that their views would be reported thematically and anonymously to the staff concerned. They were told in advance that the interviews concerned assessment and that participation was voluntary.

A staged approach to interviewing was used, so that we explored respondents' perceptions and approaches over the period of the assessment, from the initial assessment briefings at the beginning of a unit of learning to post-assessment feedback sessions. Key group interviews with students were initially conducted in the early stages, followed by progressively focussed interviews which were versions of a generic cross-case semi-structured interview schedule, adapted appropriately to each case. The staged interview process allowed for respondent validation to be implemented, in which initial analysis, taking the form of a process of data reduction, could be fed back to respondents as a check of their recollections of earlier interviews. This also served as a method of deepening the perceptions of the interviewers and the interviewees themselves.

The semi-structured interview schedules were designed after the pilot work. They were constructed to cover certain themes, but to allow students to define and use their own terminology and their own definitions of relevant issues. The aim was to encourage informants to talk freely and openly about their experiences, with interviewers providing initial stimuli (Powney & Watts, 1987). Open-ended questions were devised which were flexible and allowed for probing into the issues raised and the meanings respondents held in association with aspects of the assessment process. The interviews focussed on the particular assessment being studied, probing what students understood to be required, how they were going about the tasks, and what kinds of learning they believed were taking place. The style and nature of the interviews match closely with the kind of qualitative research interview described by Kvale (1996), with an emphasis on illuminating the meanings which interviewees ascribed to the situation, a focus on the specific and concrete rather than general opinions, and a "conversational" style. Entwistle and Entwistle (1992) describe a similar conversational approach to interviewing and found that this "interactive form of interviewing seemed to be essential to enable students to give full expression to experiences which they seemed previously not to have considered in any systematic way" (p. 5). The group interviews, which involved more people in the conversation, provided additional insights as students engaged in debate within the group. Interviews were tape recorded, fully transcribed, indexed and coded, using qualitative data analysis software.

The student interviews were complemented by other sources of evidence. Observations of key aspects of an assessment process were made (such as briefing sessions, feedback sessions, formative assessment exercises and, if feasible, the formal assessment itself). Documentary evidence was also collected (such as assessment criteria, information pertaining to assessment, student handbooks and so on). Sometimes the work students produced for assessment, and the marks and feedback received, were examined. In addition to these between-method forms of triangulation, we employed between-respondent triangulation. In each case the staff involved were also interviewed, using semi-structured interview schedules, at various appropriate stages. The act of soliciting the varying perspectives of the range of people involved in the assessment process was crucial in building up a rich, fully contextualised picture of the phenomenon - alternative assessment - under investigation.

Lecturers' agreement to the case study taking place was secured and subsequently the lecturers directly involved were interviewed. Student interviewees were recruited and a first interview was conducted before, or just after, they had begun to work on the assessment task. In most cases we selected students through an open volunteering process; in a few cases we were able to select an initial random sample from the class list, although not all students so selected were willing to participate. Some cases (6, 12) allowed initial interviews with the whole student population. In others (7, 8) snowball sampling was used to follow up the contacts of initial student volunteers. Within the practical limitations of conducting the study, we attempted to identify a mix of students according to gender and age. In each case between five and twelve students were interviewed in depth on a staged basis, and these were drawn from populations of between 14 and 63 students experiencing that particular assessment.

Students' motivations for participation varied considerably. We recognise that we probably had an over-representation of students who were enthusiastic and interested in their courses, and of those who had strongly developed views about assessment, but this was not the case for all interviewees. Since we intended to explore the variations in students' perceptions of assessment methods and what kinds of things happened when students were assessed, rather than determine what proportion of students behaved in particular ways, our sampling approach was appropriate.

Data Analysis

Initial analysis was conducted at the level of the case. As outlined earlier, a process of data reduction was undertaken in order to feed back the main points derived from the interview to respondents. When the data collection had finished, summary case reports were produced (McDowell, 1996). These were circulated to the staff concerned, giving them the opportunity to inform the research process if they felt that some evidence had been misinterpreted or the research team had missed important contextual detail. Although the insights of staff, fully immersed in the whole process in a way in which the researchers were unable to be, could be extremely useful, we nevertheless reserved the right to make our own judgments and interpretations.

Individual case analysis was followed by cross-case analysis. Multiple-interpreter control of the analysis was used, so that when different meanings were found by analysts they could be worked together into a dialogue leading to an intersubjective agreement (Kvale, 1996). The use of multiple analysts also led to an enrichment of the analysis by including multiple perspectives, with ensuing discussions about interpretations leading to a conceptual clarification and refinement of the issues in question. Interview data were coded into categories of meaning which emerged from the data (Miles & Huberman, 1994), allowing us to explore the relationships between cases and to increase the generalisability of the context-bound and specific findings of the single case study.

It is important to emphasise that we were not setting out to compare students' views of traditional and alternative assessments, or asking them which was "better". Interviews concentrated upon students' perceptions of the particular example of assessment we were there to study. In the early stages of pilot work, however, it soon became apparent that the students themselves naturally made comparisons with their perceptions of "normal" or traditional assessment to illustrate the points they wished to make about the example we were there to study. In this way, the resulting data extensively revealed students' views of traditional assessment. The comparisons that students regularly drew displayed stark contrasts in their perceptions of traditional and alternative assessment. Furthermore, analysis revealed that a substantial number of the points that students repeatedly made about the perceived differences between traditional and alternative assessment related to various dimensions of validity.

The emergent issues and themes are presented below to highlight the contrast in students' perceptions when considering various forms of assessment. It is necessary, however, to acknowledge that students' opinions of traditional assessment mechanisms were to a large extent general and impressionistic and tended to express stereotypical ideas and assumptions, as opposed to the sharp focus provided by the concrete examples of alternative assessment being specifically studied in each case. We regard these assumptions as important, however, because of the extent to which these stereotypes demonstrably pervade the thinking of the majority of students within our sample when they begin to discuss their experience of assessment, and because of the unanimity of opinion they represent.

Results

The Effects of Student Perceptions of Assessment on the Process of Learning

Student Perceptions of the Contamination of Learning by Traditional Assessment

Broadly speaking, we discovered that students often reacted very negatively when they discussed what they regarded as "normal" or traditional assessment. One of the most commonly voiced complaints focused upon the perceived impact of traditional assessment on the quality of learning achieved. Many students expressed the opinion that, from their viewpoint, normal assessment methods had a severely detrimental effect on the learning process. They frequently believed that the quality of their learning was actually polluted or contaminated, because they set out, quite consciously, to achieve second-rate or "poor" learning for the purposes of a particular assessment point. This view was most markedly voiced when students were describing their idea of the traditional UK unseen examination. The extent to which students from all disciplines, years of study and levels of commitment held these beliefs should not be underestimated. In a typical quotation, a Social Science student claimed:

You shallow learn for an exam, but you don't know the stuff. It's poor learning which you quickly forget.

Another student said that he:

... made a point of flushing out key concepts and ideas to make room for the ones I'll need in the next exam.

This was invariably seen as a regrettable or second-rate situation by the interviewees. In their view, exams had little to do with the more challenging task of trying to make sense of and understand their subject. Their comments were underpinned by a sense of either frustration or resignation, and many were openly scornful of an assessment system which appeared to them to be of such little benefit to their learning processes, and which seemed to them to emphasise and value "short-term learning that you won't remember after a couple of days."

Moreover, these views were by no means confined to students who felt that their work was predominantly assessed by examinations. When describing their interpretations of the kinds of learning prompted by the requirement to write a prescribed essay, for example, most students spoke of that learning as being a

...straightforward...type of learning...purely "This is the question and this is the answer," which an essay tends to be.

Many felt that this kind of learning principally involved them in trying to "get your information purely from lifting it from books". The reductive nature of their perceptions (the regularity with which they typically resorted to the words "just" or "purely") epitomises their sense of dissatisfaction. Again, they saw "just finding a book" as a deeply unrewarding means of learning.

Integration of Learning and Assessment in Perceptions of Alternative Assessment

By contrast, when students considered new forms of assessment, their views of the educational worth of assessment changed, often quite dramatically. Alternative assessment was perceived to enable, rather than pollute, the quality of learning achieved. When describing alternative assessment, interviewees found it difficult or inappropriate to say where their learning stopped and the assessment began, so fully integrated were the two aspects in their minds. Many made the point that for alternative assessment they were channelling their efforts into trying to understand, rather than simply memorise or routinely document, the material being studied. They generally felt that this was a much more satisfying level of learning. The following student is describing the impact of an open book exam on his revision strategy and approaches to study throughout the preceding term:

I think the [open book exam] helps you learn better, personally, because you sit there and actually read the stuff, rather than just sit there and commit it to memory...I found myself questioning key concepts and ideas, rather than hard facts about things. I was trying to understand the subject rather than memorise things.

The belief that being involved in alternative assessment implied a high-quality level of learning was a factor which emerged across all of the cases studied. Students particularly prized the notion that, with alternative assessment, they could, as one put it, "achieve a level of knowledge you're more likely to hang onto long term." Describing the perceived advantages of working towards an orally assessed seminar presentation, for example, one student claimed:

I found [by the end of the academic year] that you always knew the subject you'd done your presentation on really well.

Often, then, a dramatic contrast formed in individuals' minds concerning the learning behaviours deemed appropriate for different forms of assessment. For instance, the following quote is from a student who is contrasting his learning behaviour when he knows that he is faced with a final unseen exam with his behaviour when tackling an assessed group design-and-build project:

With lectures you've got all the information, you're writing it down, and then you forget all about it [until it comes to revising for the exam.] With this you've got to put it into practice, because you're building something. So it stays in and helps you learn it. It's much better than actually sitting at the back of a lecture room, twiddling your thumbs.

The students we spoke to often described this unthinking, inattentive dependence on lectures during the term and the last-minute hurry to revise just before the exam, or the rushed preparation of an essay the night before it was due in. By contrast, they often associated the idea of alternative assessment with being "more difficult" and requiring consistent application and effort.

Yet although all the students interviewed felt that alternative assessment implied a high-quality level of learning, some recognised that there was, for them, a gap between their perception of the type of learning being demanded and their own action. Several claimed they simply did not have the time to invest in this level of learning, and some freely admitted they did not have the personal motivation, as seen in the following quotation:

At first I thought this [alternative assessment] was brilliant. But you need a lot of motivation...The thing is that half of us lot just want to get out...it's just a case of finishing the course and passing.

It remains the case, however, that most students felt that alternative assessment related to a more exacting level of learning and preparation than that implied by traditional methods. For many, this was not only extremely rewarding in personal terms, due to the perceived nature of the intellectual challenge being posed, but had perceived long-term benefits, in that they could see the point of this level of learning because it developed knowledge and skills which would be of value in later life, both inside and outside academia.

Perceptions of Authenticity in Assessment

The Artificial Nature of Traditional Assessment

Many students perceived traditional assessment tasks as arbitrary and irrelevant. This did not, in their view, make for effective learning, because they only aimed to learn for the purposes of the particular assessment point, with no intention of maintaining the knowledge acquired in any long-term way: "You think: 'Just let me remember this for the next hour and a half.' Then you don't care."

The following illustration is drawn from a student who was contrasting his view of the "realistic" nature of assessment by oral presentation of a research paper, which he felt was meaningful and relevant, with his impression of "normal" assessment and the type of learning typically required:

Everything else we do here, like essays and so on, are just not relevant to real life. Exams here are so pointless, the questions are so precise. You're never going to need to know that kind of useless information.

Many other students similarly saw normal assessment as something they did because they had to, not because it was interesting or meaningful in any other sense than that it allowed them to accrue marks. To many it was an encumbrance, a necessary evil, and an unfair means of assessment which had everything to do with certification and which was, in their minds, divorced from the learning they felt they had achieved whilst studying the subject being tested.


When students described normal assessment activities, it was in terms of routine, dull, artificial behaviour. This view is encapsulated by an Arts undergraduate explaining the way in which she typically conducts research in order to write an assessed essay:

Often with essays all you have to do is go to the library, look up the relevant books, and just copy down the relevant chapters in a different language.

Many expressed the view that exams were especially unfair, because what was being measured was so alien to any "normal" ways in which they might be expected to work in future:

I don't think an exam can ever be relevant to a work situation. I mean, you never will be asked to sit down and write everything you know about a topic in 45 minutes, will you? ...In the workplace you say "I'm sorry, I haven't a clue, but I know where to find out."

In this light, traditional assessment tasks were perceived to pose unrealistic and pointless demands. This led many to believe that traditional assessment was inappropriate as a measure, because it appeared, from the students' point of view, simply to measure your memory or, in the case of traditional essay-writing tasks, your ability to marshal lists of facts and details. Many claimed that exam success depended upon whether you happened to have a good memory and could remember facts to "regurgitate". This view was the most widespread across the student population interviewed, and most strongly associated in students' minds with the notion of the unseen examination.

It is noteworthy that students across the sample felt that this was unsatisfactory, unjust, or unfair. A comment epitomising this position is:

... exams don't say whether you're a good student or a bad student, they just say whether you're having a hard time remembering the material or an easy time remembering.

To most this seemed unfair, not only because it failed to represent an appropriate measure of learning, but because it was perceived to advantage what they thought were relatively unimportant qualities:

In exams if you have a decent memory you have a 100% advantage over the guy who doesn't have such a good memory.

The Authentic Nature of Alternative Assessment

Students repeatedly voiced the belief that the example of alternative assessment under scrutiny was fairer than traditional assessment because, by contrast, it appeared to measure qualities, skills and competences which would be valuable in contexts other than the immediate context of assessment. Students were able to see the point in "real" terms, which prompted them to see this form of assessment as being meaningful and worthwhile.


In some of the cases we studied, the assessment method's novelty lay in the lecturer's attempt to produce an activity which would simulate a real-life (often vocational) context, so that the students would clearly perceive the relevance of their academic work to broader situations outside academia. This strategy was effective, and the students involved highly valued these more authentic ways of working. In this sense, alternative assessment was often conceived by students as being fair, because the assessment tasks seemed pertinent to the everyday world (of work, for example). Here, for instance, is an engineering student describing his perception of the value of alternative assessment:

This is more like an actual work situation. You're given a task to do and it's up to you to just get on and do it...I think you need to work in a group...when you're out in the real world you work in teams, you all put your brains together to get the thing done.

Comments of this nature were not confined to highly vocational courses. Many students perceived a novel assessment task as being relevant or authentic because of the emphasis it placed upon what they regarded as valuable transferable skills. For example, it was common for students involved in self and peer assessment to claim that they were developing the valuable skill of being able to reflect upon, and make reasoned judgments about, their own work and that of others. Students also claimed that they were clearly able to see the point of the new assessment method in terms of enhancing a form of long-standing or enduring learning. Many perceived alternative assessment's "relevance" to lie in the broader approach to study, which foregrounded the importance of the process of critical inquiry and analysis:

It's something you can apply to anything. I mean, we're looking at it from [this specific topic's] point of view. But it's something you can apply to anything you were researching at all. [What we've studied] is just a theme, really, isn't it?

These were often regarded as high-order skills which, from the student viewpoint, traditional assessment often failed to reward. This belief prompted such comments as: "I prefer this, this is more a true test of understanding." Many students were similarly disposed to place a premium upon the marks awarded for alternative assessment, because they felt that it enabled them to show the extent of their learning, and allowed them to articulate more effectively precisely what they had digested throughout the learning programme. Here again is the student from whom we heard earlier, describing his efforts consciously to achieve "poor learning" when faced with exams. He is viewing assessment positively now, seeing it as a "chance" and a positive opportunity:

I think [alternative assessment] gives a much better indication of what you know. It gives you much more chance to express your ideas. You can actually put more into your work and promote yourself into your work. You can certainly demonstrate how much more you know, or how you can interpret things.


These perceptions were not viewed in a positive light by all students, however. Some felt uncomfortable with the exacting demands they felt they were facing, as illustrated by the ironic coda one learner added to her observations:

I think it tests you better, because it's not just testing your memory, it's testing your knowledge of the subject. It's all about...being able to interpret and put your own point of view. It's a bit unfortunate, really, isn't it?!

For many students, however, alternative assessment seemed to represent a much more equitable and just state of affairs.

Student Perceptions of the Fairness of Assessment

Throughout our interview data, one of the key ways in which students evaluate various assessment techniques is to ask whether they are "fair" or "unfair". The issue of fairness, from the student perspective, is a fundamental aspect of assessment, the crucial importance of which is often overlooked or oversimplified from the staff perspective. To students, the concept of fairness frequently embraces more than simply the possibility (or not) of cheating: it is an extremely complex and sophisticated concept which students use to articulate their perceptions of the worth of an assessment mechanism, and it relates closely to our notions of validity.

Perceptions of Traditional Assessment as an Inaccurate Measure of Learning

Students repeatedly expressed the view that traditional assessment is an inaccurate measure of learning. Many, for example, made the point that end-point summative assessments, particularly examinations which took place on only one day, actually came down considerably to luck (see also Tang, 1994) rather than accurately assessing present performance. Many students complained that success in exams depended on factors such as whether you were ill on the day, whether you had a tendency to panic, or whether you suffered extreme levels of stress. Students consistently made observations such as:

Exams test your ability to pass exams, rather than your knowledge of the subject.

The results you get are not just dependent on what you know, but on how good you are at doing exams.

Often students expressed concern that it was too easy to leave out large portions of the course material, when writing essays or taking exams, and still do well in terms of marks. What often mattered was whether you had fortuitously decided to revise the right topics (the ones which appeared on the exam paper). This, they felt, was unfair, because what was being measured was not an accurate reflection of their ability:

In normal exams you learn a whole essay and it doesn't come up, so it seems to the examiners that you know nothing.


Given the preponderance of students who expressed these views of traditional assessment, many clearly felt quite unable to exercise any degree of control within the context of the assessment of their own learning. This led them to the belief that assessment was something that was done to them, rather than something in which they could play an active role. In some cases this view was so extreme that they expressed the belief that what exams actually measured was the quality of their lecturer's notes and handouts, which, of course, students felt was extremely unfair.

Other reservations that students blanketed under the banner of "unfairness" included whether you happened to be "good at writing essays", or whether you were fortunate enough to have had a lot of practice in any particular assessment technique in comparison with your peers. For example, one mature student claimed that exams unfairly advantaged recent school-leavers because:

If you come from a working background, exams are not a lot of use to you...They suit people who come straight from school, who've been taught to do exams and nothing else. I'm out of that routine.

These types of comments show that a high proportion of students were concerned that traditional assessment mechanisms did not act as accurate indicators of, say, their conceptual grasp of a topic.

Perceptions of Alternative Assessment as an Accurate Measure of Learning

When discussing alternative assessment, many students believed that success more fairly depended on consistent application and hard work, not a last-minute burst of effort or sheer luck:

You can show you actually know the subject and you understand things. You usually get reward for the effort you've put in.

One of the key findings to emerge from our research is the extent to which students use the concept of fairness to talk about whether, from their viewpoint, the assessment method in question rewards (that is, looks like it is going to attach marks to) the time and effort they have invested in what they perceive to be meaningful learning. Here, for example, is a student outlining the advantages of what he termed "progressive assessment" (by which he meant continuously assessed coursework):

Progressive assessment is beneficial because it tests your abilities far more than your remembered knowledge, which, well, of course your abilities are actually far more important. Your abilities to research, analyse, dissect an argument. Your abilities to bring forward information out of a set of data: those kind of analytical abilities I think are important in day to day life, and therefore they are going to come across far more in progressive assessment than they ever are in exams, and I think that's good.


Furthermore, many felt that alternative assessment was fair because it was perceived as rewarding those who consistently make the effort to learn, rather than those who rely on "cramming" or a last-minute effort. The following student typically illustrates her perceptions of the value of alternative assessment by raising the negative example of traditional assessment:

I think there should be more [alternative assessment] because you're going to the classes, you're getting involved in them, taking things out of them. But then, at the end, say if you have a bad exam, you get no credit. It's getting much better now [with the introduction of novel assessment].

In addition, the students often claimed that alternative assessment represents a marked improvement: firstly, in terms of the quality of the feedback students expected to receive; and secondly, in terms of successfully communicating staff expectations. Many felt that openness and clarity were fundamental requirements of a fair and valid assessment system. For example, student involvement in self and/or peer assessment featured in many of the cases to some degree, and most students perceived the benefits of becoming more heavily involved in an attempt to understand assessment criteria and to relate them to their own work:

It was generally quite good to go away and think to yourself "Well, I should do this in my essay, I shouldn't do that."

From the student viewpoint this was not regarded as spoon-feeding, but purely a matter of being fair, by letting students know how they might best direct their efforts. On this level, feedback, as vital information, powerfully affected the way students approached future assessment points, as the following student illustrates. She is describing how her experience of peer assessment had affected the way she tackled her next assessed essay:

I thought about it a lot more. If I do an essay I normally...I just tend to do it. [Having the criteria] was good, because I looked at them a lot, thinking, "Now, have I answered all the questions?" [addressed all the criteria]

From the student viewpoint, the clarity and openness of such assessment was perceived as an issue of control, affording them a measure of independence by equipping them with sufficient information to be able to pass judgments on their own work and take steps to improve it, rather than relying exclusively on staff to perform this function on their behalf.

This is not to say, however, that alternative assessment was perceived as a universal panacea for the ills of traditional assessment mechanisms, and students did at times express serious reservations about its implementation. There were some concerns about the reliability of self and peer assessment, even though students valued the activity. In other cases, students were concerned about the extent to which oral assessment actually measured an individual's confidence, or advantaged those who were "naturally" articulate:


No matter how well you've researched, if you present it badly you've had it. And then even if you've done minimal research, but you present it really well, you can make it sound very good.

In this way, students occasionally raised qualms about the nature of what was being measured by alternative assessment. It is fair to say, however, that nearly all students valued the introduction of novel mechanisms, because they saw it as a broadening of the assessment process:

There's got to be a bit of everything to give everyone the chance to shine.

Discussion

The research has revealed that students experiencing a variety of types of alternative assessment did perceive many of the positive features considered by educationalists to be amongst the benefits of such alternative approaches, particularly in relation to the impacts of assessment on learning, or its consequential validity. Their perceptions of poor learning, lack of control, and arbitrary and irrelevant tasks in relation to traditional assessment contrasted sharply with perceptions of high-quality learning, active student participation, feedback opportunities and meaningful tasks in relation to alternative assessment.

Students also judged assessment on the basis of ideas which were similar to those of assessment specialists, although of course students used different terminology. Students did not employ the word "validity" when they talked about assessment; the word they were most likely to use was "fairness". However, their concept of fairness was clearly related to educational ideas about validity. Lecturers have a tendency to dismiss students' complaints about a lack of fairness, perhaps believing that they are asking for assessment systems which are in some sense easy or undemanding. In fact, the students we interviewed consistently used the concept of fairness to describe assessment systems which are, from their point of view, genuinely valid measurements of what they deem to be meaningful and worthwhile learning. This bears out the findings of other related research, such as that conducted by Nicholls and Smith (1996) and Kniveton (1996). Often students' experience of alternative assessment had led them to consider that the novel systems were, in principle if not always in practice, somehow fairer than those to which they had become accustomed.

So, although our investigation started by collecting data from the point of view of students' perceptions of alternative assessment, it in fact reveals a range of criteria which students apply, either positively or negatively, to any assessment they encounter. From the students' point of view, assessment has a positive effect on their learning and is fair when it:

• Relates to authentic tasks
• Represents reasonable demands
• Encourages students to apply knowledge to realistic contexts
• Emphasises the need to develop a range of skills
• Is perceived to have long-term benefits
• Rewards genuine effort, rather than measuring "luck"
• Rewards breadth and depth in learning
• Fosters student independence by making expectations and criteria clear
• Provides adequate feedback about students' progression
• Accurately measures complex skills and qualities, as opposed to relying on memory or the regurgitation of facts

The striking comparisons students drew between conventional and alternative assessment mechanisms suggest that an effective way to change student learning behaviour is to demonstrably alter the method of assessment. The "idea of the exam" exerts a powerful hold over students' minds when they come to consider assessment. Even if their stereotyped ideas about exams are inappropriate (and many lecturers would argue that students have very inaccurate perceptions of exams and what they measure), it is extremely difficult to dislodge these ideas. Students believe that traditional assessment contaminates their learning, and this has a dramatic potential impact on their learning behaviours. The "normal" assessment approach appears to them to legitimise poor learning. The strict separation, in the student's mind, of assessment and learning helps to fuel this belief, because assessment is seen predominantly as a summative tool, and measurement is something which happens after learning, predominantly, if not exclusively, for the purposes of certification.

By contrast, students were very positive about the effects of alternative assessment on their learning. We recognise that to some extent students were probably responding to the novelty of the situation in which they found themselves, and were aware that they were being treated in a special way (both by lecturers introducing a new form of assessment, and by the research team, whose declared interest in students' views may also have resulted in a kind of "Hawthorne effect"). Nevertheless, interviews consistently revealed that the idea of novel assessments, like the "idea of the exam", exerted a powerful effect on students' views of what is required when it comes to assessment, and hence on the kinds of learning behaviour deemed appropriate. The positive nature of these views, especially the high levels of consequential validity believed to inhere in alternative assessment, suggests that alternative assessment may be a powerful tool in helping to shape or change students' approaches to learning.

However, there is a difference between ideas and action. Students generally believed that their learning had been enhanced under conditions of alternative assessment, but further analysis of our data, and further research in other contexts, would be required to support a claim that either the processes or the outcomes of their learning had genuinely improved. For example, in one of the cases studied, all the interviewees realised that understanding was required in order to do well in terms of the new assessment regime. Many, however, did not know how to achieve this level of understanding, or tried to achieve it and did not succeed. In this and in other cases, students experienced time pressures, conflicts of interest from other pieces of assessed work or from domestic pressures, and their good intentions were not carried out in practice. Others lacked the motivation to engage in real learning. Sometimes problems within the assessment task itself, such as difficulties in group work, diminished their performance and eroded their motivation (McDowell, 1995).


In other cases, practical or organisational issues caused problems. There were worries about the possibility of other students cheating or plagiarising. Peer and self-assessment required particularly careful preparation and support for students, who were often initially worried about passing judgments on their friends. Some felt threatened or unnerved by their insights into the apparent subjectivity of assessment, or failed to develop confidence in their ability to act fairly as an assessor themselves. The procedures for alternative assessment need to be rigorous in order to allay such student fears, and to ensure that as close a match as possible is achieved between their views of the educational worth of the assessment mechanism and the reality of the situation they experience.

However, as we have already suggested, the assumption that traditional forms of assessment represent unequivocally valid assessment mechanisms should be carefully scrutinised. In addition, it should be recognised that the introduction of alternative assessment actually provokes students to consider and interrogate the nature and purpose of all forms of assessment. In this way it may well serve to raise student expectations, some of which cannot be realistically fulfilled. We must acknowledge that, whilst alternative assessment does not hold all the answers, it does have the potential to encourage and reward genuine learning achievements.

References

Allison, J., & Benson, F.A. (1983). Undergraduate projects and their assessment. Institute of Electrical Engineers Proceedings, 130 (8), 402-419.

Arnold, P., O'Conell, C., & Meudell, P. (1994). A practical experiment. The New Academic, 3 (2), 4-5.

Assiter, A., Fenwick, A., & Nixon, N. (1992). Profiling in higher education: Guidelines for the development and use of profiling schemes. London: HMSO/CNAA.

Assiter, A., & Shaw, E. (Eds.) (1993). Using records of achievement in higher education. London: Kogan Page.

Banta, T.W., Lund, J.P., Black, K.E., & Oblander, F.W. (1996). Assessment in practice: Putting principles to work on college campuses. San Francisco: Jossey-Bass.

Biggs, J. (1996). Assessing learning quality: Reconciling institutional, staff and educational demands. Assessment & Evaluation in Higher Education, 21 (1), 5-15.

Birenbaum, M. (1996). Assessment 2000: Towards a pluralistic approach to assessment. In M. Birenbaum & F.J.R.C. Dochy (Eds.), Alternatives in assessment of achievements, learning processes and prior knowledge (pp. 3-29). Dordrecht: Kluwer.

Birenbaum, M., & Dochy, F.J.R.C. (Eds.) (1996). Alternatives in assessment of achievements, learning processes and prior knowledge. Dordrecht: Kluwer.

Boud, D. (1990). Assessment and the promotion of academic values. Studies in Higher Education, 15 (1), 101-111.


Boud, D. (1995). Assessment and learning: Contradictory or complementary? In P. Knight (Ed.), Assessment for learning in higher education (pp. 35-48). London: Kogan Page.

Broadfoot, P. (1990). Personal development through profiling: A critique. Western European Education, 22 (1), 48-66.

Brown, G., Bull, J., & Pendlebury, M. (1997). Assessing student learning in higher education. London: Routledge.

Brown, S., & Knight, P. (1994). Assessing learners in higher education. London: Kogan Page.

Cross, P.K., & Angelo, T.A. (1988). Classroom assessment techniques: A handbook for faculty. Ann Arbor, MI: National Center for Research to Improve Postsecondary Teaching and Learning.

Dochy, F.J.R.C., Moerkerke, G., & Martens, R. (1996). Integrating assessment, learning and instruction: Assessment of domain-specific and domain-transcending prior knowledge and progress. Studies in Educational Evaluation, 22 (4), 309-339.

Entwistle, N.J., & Entwistle, A. (1991). Contrasting forms of understanding for degree examinations: The student experience and its implications. Higher Education, 22, 205-227.

Entwistle, A., & Entwistle, N. (1992). Experiences of understanding in revising for degree examinations. Learning and Instruction, 2, 1-22.

Erwin, T.D. (1991). Assessing student learning and development. San Francisco: Jossey-Bass.

Falchikov, N., & Boud, D. (1989). Student self assessment in higher education: A meta-analysis. Review of Educational Research, 59 (4), 395-430.

Feller, M. (1994). Open-book testing and education for the future. Studies in Educational Evaluation, 20, 235-238.

Frederiksen, J.R., & Collins, A. (1989). A systems approach to educational testing. Educational Researcher, 18 (9), 27-32.

Frederiksen, N. (1984). The real test bias: Influences of testing on teaching and learning. American Psychologist, 39 (3), 193-202.

Gibbs, G. (1992). Improving the quality of student learning through course design. In R. Barnett (Ed.), Learning to effect (pp. 149-165). Milton Keynes, UK: SRHE & Open University Press.

Hammar, M.L., Forsberg, P.M.W., & Loftas, P.I. (1995). An innovative examination ending the medical curriculum. Medical Education, 29, 452-457.

Hammersley, M. (1987). Some notes on the terms 'validity' and 'reliability'. British Educational Research Journal, 13 (1), 73-81.

Heywood, J. (1989). Assessment in higher education (2nd ed.). New York: Wiley.


Higher Education Quality Council (1995). The graduate standards programme: Interim report. London: HEQC.

Hirst, K., & Shiu, C. (1995). Investigations in pure mathematics: A constructivist perspective. Hiroshima Journal of Mathematics Education, 3, 1-14.

Hounsell, D., McCulloch, M., & Scott, M. (Eds.) (1996). The ASSHE inventory: Changing assessment practices in Scottish higher education. Edinburgh: The University of Edinburgh, Napier University & the Universities and Colleges Staff Development Agency.

Hughes, I., & Large, B. (1993). Assessment of students' oral communication skills by staff and peer groups. The New Academic, 2 (3), 10-12.

Kane, M.T. (1992). An argument-based approach to validity. Psychological Bulletin, 112 (3), 527-535.

Kenny, W.R., & Grotelueschen, A.D. (1984). Making the case for case study. Journal of Curriculum Studies, 16 (1), 37-51.

Kniveton, B. (1996). Student perceptions of assessment methods. Assessment & Evaluation in Higher Education, 21 (3), 229-237.

Krarup, N., Naeraa, N., & Olsen, C. (1974). Open-book tests in a university course. Higher Education, 3, 157-164.

Kvale, S. (1996). InterViews: An introduction to qualitative research interviewing. Thousand Oaks, CA: Sage.

Larsen, R.L. (1991). Using portfolios in the assessment of writing in the academic disciplines. In P. Belanoff & M. Dickson (Eds.), Portfolios: Process and product. Portsmouth, NH: Boynton/Cook.

Linn, R.L. (1989). Educational measurement (3rd ed.). New York: Macmillan.

Marton, F. (1981). Phenomenography - describing conceptions of the world around us. Instructional Science, 10, 177-200.

Marton, F., Hounsell, D., & Entwistle, N. (Eds.) (1984). The experience of learning. Edinburgh: Scottish Academic Press.

McDowell, L. (1995). The impact of innovative assessment on student learning. Innovations in Education and Training International, 32 (4), 302-313.

McDowell, L. (1996). A different kind of R&D? Combining educational research and educational development. In G. Gibbs (Ed.), Improving student learning: Using research to improve student learning (pp. 137-145). Oxford: Oxford Centre for Staff Development.

Messick, S. (1989). Meaning and values in test validation: The science and ethics of assessment. Educational Researcher, 18, 5-11.

Messick, S. (1995). Validity of psychological assessment. American Psychologist, 50 (9), 741-749.


Miles, M., & Huberman, M. (1984). Qualitative data analysis: A sourcebook of methods. Beverly Hills, CA: Sage.

Moerkerke, G. (1996). Assessment for flexible learning: Performance assessment, prior knowledge state assessment and progress assessment as new tools. Unpublished Ph.D. thesis. Heerlen: Open University of the Netherlands.

Moss, P.A. (1992). Shifting concepts of validity in educational measurement: Implications for performance assessment. Review of Educational Research, 62 (3), 229-258.

Nicholls, T., & Smith, P. (1996). Assessment as communication in the learning of students and tutors. UCoSDA Briefing Paper Twenty-Nine. Sheffield, UK: UCoSDA.

Nitko, A.J. (1989). Designing tests that are integrated with instruction. In R.L. Linn (Ed.), Educational measurement (3rd ed., pp. 447-474). New York: Macmillan.

Oscarson, M. (1989). Self assessment of language proficiency: Rationale and applications. Language Testing, 6, 1-13.

Perkins, D.N., & Blythe, T. (1994). Putting understanding up front. Educational Leadership, 52 (5), 4-7.

Powney, J., & Watts, J. (1987). Interviewing in educational research. London: Routledge & Kegan Paul.

Ramsden, P. (1992). Learning to teach in higher education. London: Routledge.

Rippere, V.L. (1974). On the 'validity' of university examinations: Some comments on the language of the debate. Universities Quarterly, Spring 1974, 209-218.

Rowntree, D. (1987). Assessing students: How shall we know them? (2nd ed.). London: Kogan Page.

Sarig, G. (1996). Assessment of academic literacy. In M. Birenbaum & F.J.R.C. Dochy (Eds.), Alternatives in assessment of achievements, learning processes and prior knowledge (pp. 161-199). Dordrecht: Kluwer.

Segers, M.S.R. (1996). Assessment in a problem-based economics curriculum. In M. Birenbaum & F.J.R.C. Dochy (Eds.), Alternatives in assessment of achievements, learning processes and prior knowledge (pp. 201-224). Dordrecht: Kluwer.

Shechtman, Z., & Godfried, L. (1993). Assessing the performance and personal traits of teacher education students by a group assessment procedure: A study of concurrent and construct validity. Journal of Teacher Education, 44 (2), 130-138.

Simons, H. (1996). The paradox of case study. Cambridge Journal of Education, 26 (2), 229-335.

Smit, G.N., & Van der Molen, H.T. (1996). Simulations for the assessment of counselling skills. Assessment & Evaluation in Higher Education, 21 (4), 335-345.


Snow, R.E. (1993). Construct validity and constructed-response tests. In R.E. Bennett & W.C. Ward (Eds.), Construction versus choice in cognitive measurement (pp. 45-60). Hillsdale, NJ: Erlbaum.

Stefani, L. (1994). Peer, self and tutor assessment: Relative reliabilities. Studies in Higher Education, 19 (1), 69-75.

Tang, C. (1994). Effects of modes of assessment on students' preparation strategies. In G. Gibbs (Ed.), Improving student learning: Theory and practice (pp. 151-170). Oxford: OCSD.

Theophilides, C., & Dionysiou, O. (1996). The major functions of the open-book examination at the university level: A factor analytic study. Studies in Educational Evaluation, 22 (2), 157-170.

Thorley, L., & Gregory, R. (1994). Using group-based learning in higher education. London: Kogan Page.

Valeri-Gold, M., Olson, J.R., & Deming, M.P. (1991-2). Portfolios: Collaborative authentic assessment opportunities for college developmental learners. Journal of Reading, 35 (4), 298-305.

Weber, L.J., McBee, K., & Krebs, J.E. (1983). Take home tests: An experimental study. Research in Higher Education, 18 (2), 473-483.

Wiggins, G. (1989). Teaching to the (authentic) test. Educational Leadership, 46, 41-47.

Winn, S. (1995). Learning by doing: Teaching research methods through student participation in a commissioned research project. Studies in Higher Education, 20 (2), 203-214.

Winstanley, M. (1992). Group work in the humanities: History in the community, a case study. Studies in Higher Education, 17 (1), 55-65.

Young, A., & Fulwiler, T. (Eds.) (1986). Writing across the disciplines: Research and practice. Upper Montclair, NJ: Boynton/Cook.

The Authors

KAY SAMBELL is a Research Associate in the Centre for Advances in Higher Education, University of Northumbria, Newcastle. She has conducted qualitative research in a range of contexts including higher education and secondary education. She also lectures at the University of Northumbria and the University of York, where she gained her Ph.D.

LIZ McDOWELL is Senior Lecturer in Educational Development in the Centre for Advances in Higher Education, where she specialises in assessment and evaluation, both as a researcher and as a consultant to a range of academic departments and universities.

SALLY BROWN has authored and edited a number of books on teaching, learning and assessment for higher education practitioners. She is particularly interested in alternative assessment. She is currently Head of the Quality Enhancement Unit at the University of Northumbria.