Studies in Educational Evaluation 28 (2002) 189-198
BOOK REVIEW

EVALUATING SERENDIPITY

Dan Gibton
School of Education, Tel Aviv University, Israel
A review of: Evaluation Models: Viewpoints on Educational and Human Services Evaluation (2nd Edition). Daniel L. Stufflebeam, George F. Madaus, Thomas Kellaghan (Editors). Norwell, Massachusetts: Kluwer Academic Publishers, 2000, 509 pp. ISBN 0-7923-7884-9 (hardcover).
This year I was appointed to sit on a government committee that is overseeing the evaluation of an experimental whole-school change in Israel's high schools. Generally, it includes extending teachers' workday to a weekly total of about 40 hours (instead of about 24 now), thus allowing them to engage in program development, pastoral care of individual students, extracurricular activities and so on. The program also includes modifying some parts of school buildings so as to allow suitable workspace for teachers (as opposed to the large and often crowded teachers' common rooms typical of Israeli high schools). The experiment is currently taking place in a few large high schools. The original idea came from the teachers' union but can probably be traced to wider professional and sociopolitical trends in Israel's education system. These include growing concerns regarding teachers' status, on the one hand, and the system's unsatisfactory outputs, on the other, coupled with increased decentralization efforts (Ayalon, 1997; Gibton, Sabar, & Goldring, 2000; Volansky & Bar-Eli, 1995; Yogev, 1997), as is happening in other countries as well (Cizek & Ramaswamy, 1999; Whitty, Power, & Halpin, 1998). Governments turning into regulators, and enhanced drives to administer accountability mechanisms among the various tiers of the system (Adams & Kirst, 1998; Ouston, Fidler, & Earley, 1998; Radnor, Ball, & Vincent, 1998), were among the catalysts that may have contributed to this idea too. The committee includes researchers on educational policy and
administration, officials from the Ministry of Education, and representatives from the Ministry of Finance. The project's goals are, as often happens with expensive and complicated whole-school reforms, to improve academic achievements (especially on national matriculation tests); to narrow SES-related gaps in academic performance; to reduce violence among students; to reduce student dropout rates; and to assist in reducing social exclusion and marginalization within the schools.

Here are some of the questions and thoughts that have so far emerged from the committee's meetings: Can we really correlate between the proposed change and the dependent variables in these five schools? If indeed some significant change is found, what can we learn about this reform, if and when it is implemented on a system-wide scale? Perhaps the results occurred because this is a new project and the staffs in these schools feel chosen and special? Suppose a teacher is recruited in 10 or 15 years' time, when all schools are run according to the project's guidelines: will s/he still work harder and feel well paid, or, because this will now be the standard, will teachers feel underpaid and burnt-out exactly as teachers feel now? If in several years' time many schools are working according to this new project, and if there are some changes in the dependent variables, will we really be able to know whether these changes are the result of the reform the committee is supervising? What about the numerous other changes that will have occurred meanwhile in Israeli society, economy and politics, and of course other changes and reforms in the school system as well? Finally, is this change about school culture and teacher professionalism (as most of the academics and educational researchers on the committee point out) or about school effectiveness (as the Ministry of Finance people push forward)? After all, if this is about effectiveness, all the project money could simply be used to provide the students with vouchers for private tutorials, or prizes for not dropping out or for behaving themselves…

The questions presented above are not just methodological and theoretical. They are questions regarding an evaluation plan, an evaluation model, and problems arising from differences in the definitions of evaluation held by the various members of the committee. They are political issues as well. This new edition of Evaluation Models can assist in addressing these questions. It is a useful companion for researchers in education, and especially in educational policy and administration. This review explains why.

Evaluation Models is built around five main themes: an introduction to program evaluation; questions/methods-oriented evaluation models; improvement/accountability-oriented evaluation models; social agenda-directed (advocacy) models; and, finally, overarching matters. Chapters 2 and 3 offer the reader a comprehensive introduction to models, definitions, metaphors of, and approaches to evaluation. I strongly advise reading these two chapters as one. Chapter 2, by Madaus and Kellaghan, defines models and deals with 20 definitions of evaluation, which are themselves evaluated according to a whole set of viewpoints such as "rationalistic to naturalistic", "modernity to postmodernity",
"elementistic/reductionist to holistic" and so on (p. 26). Stufflebeam's Chapter 3 presents and analyzes 22 approaches to program evaluation. The approaches are divided into four groups: "Pseudoevaluations", "Questions/methods-oriented evaluation (quasi-evaluation)", "Improvement/accountability-oriented", and "Social agenda directed/Advocacy". A very good table (p. 82) presents an analysis and evaluation of these approaches according to the Joint Committee Standards, including the classics: utility, feasibility, accuracy and propriety. However, a very good table that appeared in the first edition (Stufflebeam & Webster, 1983, pp. 37-40) is missing here. These two consecutive chapters offer not only a valuable and extremely diverse overview of the field but also constitute a great practical checklist for any educational researcher. I spent a long time playing "find my approach" - a useful process of pinpointing and mapping my own work among the various approaches. I think this is compulsory reading before embarking upon any research for clients or agencies.

An important part of the book is dedicated to issues related to outcome evaluation, educational standards, accountability and quality education. These appear not only in the third section, which is more specifically dedicated to them, but also in various other chapters, including those mentioned before about models and approaches, and in whole chapters such as Kellaghan and Madaus's "Outcome Evaluation" or Tsang's on cost analysis. This addition to the first edition highlights some important changes in educational research, evaluation and policy in the twenty years that have elapsed since the previous edition: from a time when governments provided education, to a time when governments have become regulators and suppliers of information on education to the general public, while responsibility is shifted from the center to the periphery of the system - to municipal authorities, and of course to school principals (Adams & Kirst, 1998; Radnor et al., 1998; Gibton, 2001; Whitty et al., 1998).

Naturally, reviewing this book includes comparing the second edition with the first one (Madaus, Scriven, & Stufflebeam, 1983). The first chapter of the first edition, entitled "Program Evaluation: A Historical Overview", is nearly identical to the chapter of the same title in the second edition. However, it has been updated, not only by means of the new section - "The age of expansion and integration 1983-2001" - but also in the fine details. A comparison between these two chapters can tell a lot about what has changed in educational evaluation in these two decades. In the first edition this chapter ends with a caution regarding the paradigmatic polarization in evaluation models, between positivistic and naturalistic paradigms and methodologies. The new chapter has a softer perspective, obviously evolving from the recent abating of the positivistic-naturalistic debate and the legitimization and mainstreaming of naturalistic inquiry. However, this new section should have gone further. It has a lot to say about funding and accountability but very little on evolving paradigms. Is this all that has happened between 1983 and 2001? Chapter 3, "Foundational Models for 21st Century Program Evaluation", capitalizes a bit on Stufflebeam and Webster's chapter in the first edition, entitled "An Analysis of Alternative Approaches to Evaluation", but includes significant updates and extensions.
Some of the articles from the first edition have been left as is, and have survived the two decades between these two editions quite well. Such are Tyler's chapter on "A Rationale for Program Evaluation" and Cronbach's on "Course Improvement". Other chapters have been somewhat modified and updated, such as Stufflebeam's on "The CIPP Model". Unfortunately, some chapters should have been modified further. For instance, Scriven's
chapter on "Evaluation Ideologies", while well written, deals with an issue that has seen many changes in these twenty years. Writings on the politics of social and educational research are now common and provide important insights (Denzin, 2000; Hammersley, 1995), especially for such a decision-and-policy-oriented field as educational evaluation. Some of these issues, however, are dealt with in Flinders and Eisner's Chapter 12 on "Educational Criticism".

Here are some things I looked for in the book, but failed to find: a comprehensive chapter on ethics in evaluation; a chapter on the how and why of writing evaluation reports; a chapter on school-based evaluation; and perhaps some more on the evaluator's role in different settings - though this area is partly covered in Patton's Chapter 23 on utilization.

A final comment on the volume's organization is that although the title refers to "…Human Services Evaluation", examples from professions outside education are scarce (some of these can be found in Rogers and Stufflebeam's chapter). Readers can of course draw some inferences by themselves, but still, an educator will feel much more comfortable with this book than those who work in other human services professions, such as social work or labor studies. In my experience it is the combination of good theory and of relevant examples that makes a book user-friendly and gives theory its power. Only if no specific books on evaluation were available for other professions would I recommend this book to their members, as a default. The book may be helpful to those who are already into evaluation in these fields. It is less useful for beginners, who need more scaffolding in the form of specific and detailed examples from their field. I think that we, as educational researchers, should be extremely careful not to do to others what has been done to us for a long time, i.e., assuming that models and theory are universal. As Bottery notes: "When an area is weak in its own theory, it is prey to invasion from the theories of other, more thoroughly worked, areas" (1992, p. 21). We seem to have enough evidence to suggest that many of the virtues, as well as of the problems, of education and educational research are unique and specific (Ball, 1987), so our research methods and literature are probably at their best, and at their highest level of usefulness and validity, when applied in specifically educational settings.

Using Literature on Evaluation and Evaluation Theory in Educational Research
So now back to the program our committee is evaluating. Table 1, presented below, allows a glimpse of how three chapters of Evaluation Models can be used in the actual evaluation of the project. Each of these three chapters offers different insights relevant to the project we are supervising. "The CIPP Model" presents a specific method that can be used to evaluate the project. The chapter on "Outcome Evaluation" is helpful because this type of change is, at least in the committee's preliminary opinion, an outcome issue. Somewhat contrary to this, the chapter on "Democratic Evaluation" presents a client-based, rather than expert-based, empowerment approach. Of course many other chapters in the book carry important insights for our project. Steinmetz's Chapter 7 on the discrepancy model is important because, at the end of the day, no matter how much data are gathered, the evaluators may not be capable of delivering a straightforward answer to the question whether this humongous and expensive change will indeed have a significant impact on the dependent variables.
Table 1: Using Three Chapters from the Book to Answer Some Questions on the Project

Question 1: Can we really correlate between the proposed change and the dependent variables in these five schools?

Stufflebeam's chapter on the CIPP Model for Evaluation: A redefinition of needs is required so as to have a broader spectrum of possible outcomes. A deep analysis of characteristics of change within the schools may help in isolating variables that influence results.

Kellaghan and Madaus's chapter on Outcome Evaluation: Perhaps, if a suitable set of performance indicators (p. 102) is developed for this project. These do not necessarily have to be the actual dependent variables that the project wishes to influence, but rather they should be able to point at a possibility of change among and within these variables.

House and Howe's chapter on Deliberative Democratic Evaluation: There should be a more detailed mapping of the various stakeholders (p. 414), in order to ensure that power among them is really balanced. This should be done so that the correlation is really the issue, and that there isn't a hidden agenda among the various groups taking part in the evaluation effort. For instance, the schools themselves are not represented, and neither are groups of citizens.

Question 2: If indeed some significant change is found, what can we learn about this reform, if it will be implemented on a system-wide scale?

Stufflebeam's chapter: Context evaluation techniques can assist in answering this question. With these we can uncover what parts or how much of the change can be attributed to the fundamental aspects of the project, and how much is due to the uniqueness and limited scope of the project. Input evaluation can be used to improve the project as it is implemented, as a safeguard that may minimize the diminishing effect, if and when the project is implemented on a larger scale (p. 291).

Kellaghan and Madaus's chapter: The term "complex of programs" (p. 98) is useful to describe this type of change. Longitudinal study of individual students may help on this point.

House and Howe's chapter: A good way to tackle this issue would be to involve acting school principals in the evaluation program, either as assistant-researchers or as focus groups. At any point, practitioners can be involved in the evaluation process (p. 416), which so far is a classic top-down approach.

Question 3: Perhaps the results occurred because this is a new project and the staffs in these schools feel chosen and special?

Stufflebeam's chapter: Historical context evaluation in other projects is useful too. Perhaps we can learn from other projects (after mapping their characteristics) about what may or might happen in this one.

Kellaghan and Madaus's chapter: Information on performance management (p. 103) can help.

House and Howe's chapter: Deliberation through reflective practice (p. 416) can help in shedding some light on this issue. Asking members of staff to write diaries or shorter forms of narratives, with complete anonymity secured, may disclose inner feelings, thoughts, and fears regarding the project (p. 417).

Question 4: If indeed in several years there will be some changes in the dependent variables, will we really be able to know whether these changes are the result of the reform the committee is supervising? How about other numerous changes that will occur in Israeli society, economy and politics, and of course other changes and reforms in the school system as well?

Stufflebeam's chapter: Context evaluation is important to see how the project affects the environment. In this case the target will be to inquire how the change in school culture and climate affects the lives of students, staff, the school community, etc. (p. 287).

Kellaghan and Madaus's chapter: Longitudinal study of individual students may help on this point too. The concept of "net" results versus "gross" impacts (p. 108) may assist in differentiating between the effects of the project we are studying and other impacts that evolve from other educational reforms and sociopolitical trends in Israeli society.

Question 5: Finally, is this change about school culture and teacher professionalism or about school effectiveness?

Stufflebeam's chapter: The Advocacy Teams Technique is "especially applicable in situations where institutions lack effective means to meet specified needs and where stakeholders hold opposing views on what strategy the institution should adopt" (p. 293).

Kellaghan and Madaus's chapter: Definitely an outcome issue. Trends in educational policy and policy analysis increase the danger that results and goals will be displaced (p. 110), especially in high-stake situations such as this project, which is system-wide and extremely costly. "When meeting standards becomes the basis for budgetary decisions, there is the further consequence that programs that meet standards, rather than program goals, may be continued, while programs that meet goals, but not standards, may be…"

House and Howe's chapter: One should completely reframe the question and inquire how this change will benefit low-income families and ethnic groups. How can we promise that the social goods are distributed fairly?
These were, as mentioned before: achievement (especially among low-income students), attendance, dropout rates, access to academic studies, violence, etc. In other words, the results of the evaluation will not solve the problem of whether this is "worth the expense". To this, the head of the committee, Prof. Dan Inbar from the Hebrew University in Jerusalem, a wise researcher on educational policy, said to the finance people: "Sorry, but I cannot answer this question, which is basically moral or ethical. How can one determine whether getting a low-SES kid into university is more important to society than buying a new dialysis machine for patients with kidney failure?"

The chapters on educational criticism (by Flinders and Eisner) and on evaluation ideologies (by Scriven) are helpful too. They can shed light on the politics of our project, the "who gains - who loses" issues, and allow us, as supervisors, to pose a mirror vis-a-vis our own work, telling us how we perceive our own approach to evaluation. The chapter on field trials (by Nave, Miech, and Mosteller) is relevant, as is evident from its title. The chapter on cost analysis for improved educational policy making and evaluation (by Tsang) may be especially helpful and provide useful insights on this issue as well. This last chapter is important to educational researchers involved in arguments with economists. It provides definitions of, and distinctions between, such terms as "resource utilization", "cost functions", "cost output", "cost benefit" and "cost effectiveness", and their relation to educational evaluation. The chapters on utilization (by Patton) and on empowerment (by Fetterman) tell how such evaluation can be less top-down or professional-layperson oriented, and may involve wider audiences and groups of citizens, both in planning the evaluation process and in implementing it.

Are these tools used by researchers-turned-evaluators? Going through a sample of 1999-2000 issues of the journal Educational Evaluation and Policy Analysis (EEPA) produced several interesting findings regarding educational evaluation. First, although methodology sections in the journal are quite elaborate and detailed, hardly any articles in these two years addressed issues of evaluation in their methodology section. This is especially striking considering that writers go to considerable lengths in presenting their methodology, sampling, tools, etc., also when their methodology is qualitative (I'm emphasizing this because qualitative research tends, unfortunately, to be more vague about methodology and study layouts). Second, cited references to educational evaluation are scarce too. Even when quoting scholars who are authorities on educational evaluation, the references quoted are those reporting specific empirical studies and not those dealing with the theory and practice of evaluation. The sample includes an article I published with some colleagues of mine… This is by no means meant to criticize EEPA or its contributors. It is merely an observation on the role of educational evaluation, as a field of study, and of educational evaluation theory, in educational research, which is many times equivalent to educational evaluation.

Shouldn't educational researchers present, in their articles, issues regarding educational evaluation as presented in this new handbook? I mean issues such as: What model or what approach of evaluation was utilized in this study? What other approaches or models were considered but rejected, and why?
Was a specific theory such as the CIPP model (by Stufflebeam, second edition) or Outcome Evaluation (by Kellaghan and Madaus) or Program theory (by Rogers), etc., used, and how? How did clients respond to evaluation? How were they involved? And this is what happens in a very good journal that has the word “evaluation” in its title! So perhaps evaluation is a bit like ethics: Very important, but seldom reported. This doesn’t mean that researchers do not take it into
account, but it's not a linchpin of the study, and certainly not at the heart of reporting about the study as part of getting accreditation from peers, in the way that research tools, sampling and data analysis are. But evaluation is, after all, in a hidden way, all about reliability and validity. Reading the articles in this handbook puts you in mind of how good research is supposed to be done, because choosing the appropriate evaluation model, and considering aspects of the evaluator's role, political aspects of evaluation, etc., has everything to do with the rigor, validity and reliability of educational research.

Why are evaluation models not used more often? One possible explanation could be that "academic" and "practical" or "client-based" types of research tend to get mixed up. When researchers are asked to evaluate something they are focused on methodological issues. So when academics do "evaluation" they think they are doing "research". Perhaps they see "research" as their primary objective and the thing they do best. Clients also pick evaluators according to their reputation as researchers, and rightly so. Therefore many researchers see themselves as good evaluators as well. People who are actively involved in client-based research-turned-evaluation probably do accumulate intuitive experience and knowledge of evaluation. But this book provides systematic knowledge, which is a different ball game. Evaluation is not just research. It is the organizing and management of research. It is the politics of research; it is the human and other resource-management of research; and it is the meta-cognition of how research methodology translates into the actual daily practice of an administrator.

Another explanation is that evaluation theories have not succeeded in acquiring a primary position at the heart of educational research. They have been sidetracked into a kind of "sub-expertise" within educational research or within the areas of policy analysis or of research methods. This is notwithstanding the fact that nearly every empirical educational researcher should show reasonable capabilities and knowledge in evaluation (and I don't mean the introductory undergraduate course that we all took years ago…).

I think evaluation has a primary role in bringing together policy makers and school-level operatives: mainly school principals and school staffs, but also boards of governors in schools and counties. Evaluation theory can and should play a major role in defining accountability and in the use of educational research by professionals in the field. Therefore, evaluation theory should offer further tools for policy analysis, especially the analysis of standards movements and decentralization processes (Lewis, 1999; Nevo, 1995). Evaluation theory should adapt itself to changing school systems and focus on issues of values and politics. Understanding the intricacies of decentralized systems poses new challenges to evaluation theory. Some of the chapters in the new book, such as Scriven's, Kirst's and Patton's, deal with these issues, but I think that what is required now are new models that can assist evaluators and others to actually evaluate these new trends in educational policy and the structure of educational systems. Last but not least, evaluation theory must adopt multi-model or eclectic approaches. Researchers should be offered not only well-phrased distinct models and approaches, but should also be encouraged to build and use multi-faceted and combined evaluation processes that draw upon several models, approaches and persuasions.

In a recent departmental meeting, I found myself looking around the room and thinking we had a typical range of staff: some educational sociologists, some people who
are into educational administration and leadership, some doing educational economics and law, some history and philosophy of education (often referred to as "foundations of education"), and, of course, some working on educational evaluation. But then something quite obvious occurred to me: we're all doing evaluation! Whether in "pure" research, and of course especially when requested by authorities, municipalities, schools, or research foundations and funds, to study an issue, follow a new program, reform or change, or assist a client in reaching a decision. So what is the role of this field of study known as "evaluation"? Is it a serendipitous area we may or may not visit, such as when writing a review of a new book? Is it another name for educational research methods? The new edition of Evaluation Models supplies some good updated answers, and poses some new questions and dilemmas as well.

References

Adams, J.E., & Kirst, M.W. (1998). New demands for educational accountability: Striving for results in an era of excellence. Paper presented at the annual meeting of the American Educational Research Association, April 13-17, San Diego, California.

Ayalon, H. (1997). School autonomy in a centralized system: The case of Israeli secondary education. In R. Shapira & P.W. Cookson (Eds.), Autonomy and choice in context: An international perspective (pp. 177-201). Oxford: Pergamon.

Ball, S.J. (1987). The micro-politics of the school. London: Routledge.

Bottery, M. (1992). The ethics of educational management: Personal, social and political perspectives on school organization. London: Cassell.

Cizek, G.J., & Ramaswamy, V. (1999). American educational policy: Constructing crises and crafting solutions. In G.J. Cizek (Ed.), Handbook of educational policy (pp. 498-522). San Diego, CA: Academic Press.

Denzin, N.K. (2000). The practices and politics of interpretation. In N.K. Denzin & Y.S. Lincoln (Eds.), Handbook of qualitative research (2nd ed.) (pp. 897-922). Thousand Oaks, CA: Sage.

Gibton, D. (2001). "Once the government provided education. Now it provides information on education": Insights from research on what UK headteachers think of educational law regarding decentralization and autonomy. Paper presented at the annual conference of the British Educational Leadership, Management and Administration Society (BELMAS), Newport-Pagnell, UK, 5-7 October 2001.

Gibton, D., Sabar, N., & Goldring, E.B. (2000). How principals of autonomous schools in Israel view implementation of decentralization and restructuring policy: Risks, rights and wrongs. Educational Evaluation and Policy Analysis, 22 (2), 193-210.

Hammersley, M. (1995). The politics of social research. London: Sage.

Madaus, G.F., Scriven, M., & Stufflebeam, D.L. (Eds.) (1983). Evaluation models: Viewpoints on educational and human services evaluation. Hingham, MA: Kluwer-Nijhoff.

Nevo, D. (1995). School-based evaluation: A dialogue for school improvement. Oxford, UK: Pergamon.

Ouston, J., Fidler, B., & Earley, P. (1998). The educational accountability of schools in England and Wales. In R.J.S. Macpherson (Ed.), The politics of accountability: Educative and international perspectives (pp. 107-119). Thousand Oaks, CA: Corwin.

Radnor, H.A., Ball, S.J., & Vincent, C. (1998). Local educational governance: Accountability and democracy in the United Kingdom. In R.J.S. Macpherson (Ed.), The politics of accountability: Educative and international perspectives (pp. 120-133). Thousand Oaks, CA: Corwin.

Volansky, A., & Bar-Eli, D. (1995). Towards school-based management in Israel. Educational Leadership, 53 (4).

Whitty, G., Power, S., & Halpin, D. (1998). Devolution and choice in education: The school, the state and the market. Buckingham, UK: Open University Press.

Yogev, A. (1997). Autonomy and choice as school strategies for peripheral communities in Israel. In R. Shapira & P.W. Cookson (Eds.), Autonomy and choice in context: An international perspective (pp. 177-201). Oxford: Pergamon.
The Author
DAN GIBTON teaches educational administration, law and policy at the Department of Educational Policy and Organization, School of Education, Tel Aviv University. His areas of interest include educational leadership, law-based reform and the use of educational law in policy implementation, and qualitative research on school principals. Recent work includes ethnography in schools, attempting to uncover how educational leadership is influenced by educational policy and law-based reform. Correspondence: