Studies in Educational Evaluation, Vol. 21, pp. 105-110, 1995

INTRODUCTION: COMMON GROUND FOR A UNIFIED APPROACH

Andrew McConney

The Evaluation Center, Western Michigan University, Kalamazoo, Michigan, U.S.A.
This issue comprises four papers on educational personnel and school evaluation. The papers were developed as part of the work of CREATE's Cross-Cutting Theory Project over a series of meetings in the fall and winter of 1993-94. Earlier versions were first presented at a special invitational symposium in New Orleans on April 5, 1994, during the annual meeting of the American Educational Research Association; the symposium was organized to gain feedback from leading evaluation specialists on our progress toward developing a unified model of educational personnel and school evaluation.

In this issue, each author approaches the development of a unified model of educational personnel evaluation from a different perspective in line with the research (or practice) in which each is currently engaged (Scriven: teachers; Stronge: support personnel; Stufflebeam: superintendents; Webster: schools and teachers). Each gives emphasis to different aspects of evaluation systems (Scriven: merit versus worth, professional duties as generic standards; Stronge: balancing individual and institutional demands; Stufflebeam: applying system standards; Webster: using objective student data to connect school and personnel evaluation). However, as I have found, and think that you will find, there is much common ground among the four. It is this common ground that we feel forms (at least a significant part of) the essential foundations of educational personnel evaluation.

My responsibility is to briefly outline what I see as the common ground, the foundations on which we can build a unified model, among the four papers. It is equally my job to point out the differences, if any, among the various positions expressed in these papers. I hope that you find this analysis and the papers that follow helpful and thought-provoking.

As James Stronge did in his contribution to this special issue, and to help us better understand our purpose here, a useful question to ask may be: What do we mean by a "model"? The American Heritage Dictionary (third edition) defines a model as "a schematic description of a system, theory, or phenomenon that accounts for its known or inferred properties and may be used for further study of its characteristics." This description is, I think, what we're after: a picture of a concept or set of concepts that has both explanatory and predictive power while maintaining parsimony, the latter including both simplicity and elegance. In our search for a unified model of educational personnel
and school evaluation, the most important characteristic seems to be parsimony: a picture simple enough to be easily understood, yet elegantly flexible enough to be useful across a considerable range of scenarios. To provide evaluation algorithms at the level of what Michael Scriven calls the "coal-face" is not the purpose of this development effort. After all, our intention in using the phrase "unified model" is that such a model may be applied equally well to design evaluation systems for teachers, school and district administrators, a variety of professional support personnel (and indeed schools). Thus, the common ground that I seek in looking across the four papers is at the model level of an evaluation system, in effect the "big picture" of the fundamental characteristics a unified model of educational personnel evaluation might have.

It is clearly evident in looking across the four papers that the authors all suggest as a first step in any evaluation system an assessment of institutional needs and resources in the context of the institution's mission (a school's needs ideally being a reflection of the needs of society and its clients: students and parents). As Stronge points out, "determining the needs of the organization is a prerequisite for all remaining steps if the evaluation process is to be relevant to the organization's mission and, ultimately, responsive to public demands for accountability."

From the assessment of needs, resources, and environmental (work) context it follows that evaluation systems must assess worth, in addition to merit (for a treatment of the difference between merit and worth, see Scriven's paper). As Scriven points out, jobs are created through the assessment of institutional needs and resources, and personnel evaluation should therefore also include the worth yardstick, so that schools can more effectively meet the needs of their clients. Stronge also expresses this position in his argument for aligning individual and institutional goals, as does Stufflebeam in his caution to those responsible for hiring and evaluating superintendents. If evaluation is to serve both institutional and individual goals, then educational evaluation systems must include evaluations of both extrinsic (worth) and intrinsic (merit) value.

Beyond the assessment of institutional needs, resources, and work context, a second commonality that cuts across the four papers is the necessary step of delineating professional duties and responsibilities as the generic standards that define each educational profession. It is these duties and responsibilities that will form the basis for determining the criteria (behaviors) by which the professional's performance will ultimately be judged. Thus it is important that the duties list be carefully and collaboratively decided. Stufflebeam has here provided such a list for superintendents (and elsewhere for school principals), while Stronge has done so for a variety of professional support personnel, and Scriven for teachers. It does not seem to me critically important where the duties list for each profession originates, as positions may differ quite significantly depending on the specifics of the setting (as Scriven points out, for instance, job-specific and site-specific duties must be added on to generic professional duties to move the evaluation system to the coal-face). What is important is that the list be comprehensive, clear, and arrived at collaboratively with input from all stakeholder groups; this will foster ownership of the evaluation system.
Once the generic, job-specific, and site-specific duties have been collaboratively agreed on, there is broad consensus among this special issue’s authors that an essential subsequent step must be the determination of performance criteria (measurable behaviors
representative of the job), criteria weighting (the relative importance of each criterion to the aggregate evaluation), and criteria standards ("cut-scores" that delineate exemplary, satisfactory, or unsatisfactory performance for each performance criterion). Again, the most crucial issue is not which performance criteria are chosen, or what weight and cut-score each is assigned, as all of these will vary depending on site context. What is crucial is that performance criteria, weights, and cut-scores be determined a priori, that they be appropriately representative of the job and work environment, and that evaluatees be fully aware of them through effective communication by the evaluators.

A fourth aspect of evaluation systems suggested and supported by each of the authors contributing to this volume is the necessity of using multiple sources of data in the evaluation of school professionals. This is a basic and central principle of educational measurement in that any one data source or instance of measurement is simply one sample of behavior, and the greater the variety and number of samples taken, the better (more reliable) the representation of performance over time. Of course, care needs to be taken here, as the addition of several poor measures may negate a good measure. Thus, each sample of performance must be collected using valid measures. However basic to educational measurement, this is not a trivial issue for systems of educational personnel evaluation. As Scriven points out, and as has been widely reported (e.g., in the National Center for Educational Statistics Fast Response Survey System report, Public Elementary Teachers' Views on Teacher Performance Evaluations; NCES, 1994), by far the predominant model for evaluating school professionals' performance is the "inspection model", a system relying exclusively on a tiny number of work observations, many of which are preannounced. There is consensus among the authors that such a system is wholly inadequate. As Stufflebeam, Webster, and Stronge point out (and Scriven implies), a variety of data sources must be employed for the reliable and valid representation of school professionals' performance, including: objective data (e.g., student achievement scores, student ratings), interviews with evaluatees and clients of evaluatees, observation (judgment), and performance artifacts (work portfolios containing, for instance, lesson plans, sample assessments, records of professional development activities, performance logs, service records, etc.).

Fifth, it is evident that these researchers find common ground in asserting that professional performance evaluation must serve both formative and summative purposes. The formative purpose is essential to facilitate opportunities for the evaluatee to improve her/his practice in line with the mission of the school. The summative purpose, on the other hand, is essential in assuring accountability to the school's clients, and in allowing the school to effectively meet its goals as defined by its constituent stakeholders.

Last, and perhaps most importantly, the four papers share the critical requirement that every evaluation system must be in accord with an accepted set of evaluation system standards. The authors agree that the appropriate system-level standards by which all evaluation systems should be assessed are The Personnel Evaluation Standards of the Joint Committee on Standards for Educational Evaluation (1988).
The Standards, approved by the American National Standards Institute, are widely recognized as the authoritative work on guidelines for assessing systems of personnel evaluation, and include the four basic sets of standards: propriety, utility, feasibility, and accuracy. These are dealt
with comprehensively by Stufflebeam, Stronge, and Webster and do not require elaboration here.

In sum then, to this reader the four papers that make up this special issue hold the following principles of educational personnel evaluation in common:

1. An assessment of institutional needs, resources, and work environment must be an early step in any evaluation system.

2. A clearly delineated and comprehensive set of professional duties and responsibilities (the characteristics that make any one profession distinct from any other) must be the operational basis for defining performance evaluation criteria.

3. Valid systems must include exactness in specifying acceptable performance at the ground level. Together, needs/resources assessments and professional duties lists will allow the clear definition of performance criteria and facilitate the determination of criteria weights and acceptable levels of performance (see the sketch following this list).

4. As in the principles of educational and psychological measurement applied to the primary clients of our schools, the measurement of performance for school professionals must utilize multiple data sources and multiple instances of data collection.

5. Evaluation systems must serve both formative (professional development) and summative (accountability) purposes. While providing the professional the opportunity to improve so that he/she is in a better position to assist in achieving the school's goals, this ensures that students' and parents' rights to appropriate educational services are protected.

6. The Personnel Evaluation Standards of the Joint Committee on Standards for Educational Evaluation are the appropriate and authoritative guide for assessing evaluation systems for school professionals.
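As a concrete illustration of principles 3 and 4, the following minimal sketch (mine, not drawn from any of the four papers) shows one way a priori criteria weights and cut-scores might combine multiple samples of performance data into a summative rating; all criterion names, weights, and scores are hypothetical.

    # A minimal sketch, assuming invented criteria, weights, and cut-scores.
    from statistics import mean

    # A priori criterion weights (relative importance; here they sum to 1.0).
    WEIGHTS = {
        "instructional_duties": 0.45,
        "management_duties": 0.30,
        "professional_development": 0.25,
    }

    # A priori cut-scores on the 0-100 weighted aggregate, highest first.
    CUT_SCORES = [(85.0, "exemplary"), (70.0, "satisfactory"), (0.0, "unsatisfactory")]

    def evaluate(samples):
        """Average the multiple samples gathered for each criterion (multiple
        data sources and instances of measurement), weight and aggregate them,
        then map the aggregate to a performance standard."""
        aggregate = sum(WEIGHTS[c] * mean(scores) for c, scores in samples.items())
        rating = next(label for cut, label in CUT_SCORES if aggregate >= cut)
        return aggregate, rating

    # Hypothetical scores from, e.g., observations, student ratings, a portfolio.
    samples = {
        "instructional_duties": [88.0, 92.0, 85.0],
        "management_duties": [75.0, 80.0],
        "professional_development": [90.0],
    }
    print(evaluate(samples))  # roughly (85.5, 'exemplary') with these invented numbers

The point of the sketch is only the structure on which the authors agree: weights and cut-scores are fixed and communicated before any data are collected, and no single observation determines the rating.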
In their papers, Stufflebeam and Webster raise an issue deserving further consideration as a source of possible departure among the four researchers. This is the issue of linking (or reciprocity) in the performance evaluation of school professionals. For Webster, the evaluation of the principal should be coupled to the evaluation of ... teachers, just as the evaluation of the superintendent should be coupled to the evaluation of principals. To the extent possible, many of the same outcomes should be used for both sets of evaluations. Stufflebeam is not as explicit in calling for the evaluation of school professionals to be so tightly linked. He does state, however, that superintendents should be evaluated based on the outcomes of their administrations (context and product evaluation), for instance on the achievement test scores of minority students. This suggests some level of agreement with the concept that student outcomes, such as objective achievement data, will
provide a ripple effect influencing the performance evaluations of not only teachers at the ground level, but also support personnel and school and district administrators. In their papers Scriven and Stronge do not speak to this issue, although Scriven, as an early proponent of the idea, has spoken in strong support of the concept.

Beyond this departure, the differences evident among the four authors are to my mind trivial. They appear as implementation details: the number of steps required to get the evaluation job done, or the different emphases given to the sequencing of the necessary steps. As I indicated at the outset, these are algorithmic details, important and necessary in the design and implementation of any evaluation system, but not critically important to our purpose here. The consensus of evaluation principles that I see in these four papers, therefore, makes me highly optimistic that we have to a significant degree identified the necessary foundations for construction of a broadly applicable unified model of educational personnel evaluation.

Notes

I am deeply indebted, and here express my gratitude, to Michael Scriven, James Stronge, Daniel Stufflebeam and William Webster for their hard work in producing the papers in this volume. Their efforts have been part of the work of the Cross-Cutting Theory Project of the National Center for Research on Educational Accountability and Teacher Evaluation (CREATE) to develop a theoretically sound and practically useful unified model of educational evaluation. I have been highly privileged to work with these authors, who together comprise a brain trust in educational evaluation that one would be hard pressed to match in this country, or indeed internationally.

I also thank Ms. Rebecca Thomas, research staff for this project, who provided keen insight into the practices of teacher evaluation, and who doggedly reminded me of the importance of ensuring that any attempt at a unified model be singularly feasible in its implementation. Finally, I must acknowledge and thank Arlen Gullickson, Associate Director of CREATE and Evaluation Center Chief of Staff, for his useful critical reviews of the four papers contained herein, and for his always helpful ideas on improving the work of this project.

We request your input on the special issue, its coverage of the relevant issues, and its usefulness to you as a reference work, or in causing you to give further thought to educational personnel and school evaluation. Once you have used this volume, please drop us a note or call us. We want to keep improving the ways that we share our work with the constituencies we serve, and part of this requires your help. Please send your comments, questions or suggestions to: Dr. Andrew McConney, Project Director, CREATE's Cross-Cutting Theory Project, The Evaluation Center, Western Michigan University, Kalamazoo, MI 49008-5178, Tel: (616) 387-5895 / Fax: (616) 387-6666, E-mail:
[email protected]
References

Joint Committee on Standards for Educational Evaluation (1988). The personnel evaluation standards. Newbury Park, CA: Sage.

National Center for Educational Statistics (1994). Public elementary teachers' views on teacher performance evaluations (NCES 94-097). Washington, DC: U.S. Government Printing Office.
The Author

ANDREW McCONNEY is Project Director of The Cross-Cutting Theory Project of the Center for Research on Educational Accountability and Teacher Evaluation (CREATE) at Western Michigan University's Evaluation Center. Dr. McConney gained his doctorate in science education at Florida Institute of Technology in 1992, and his research interests include the evaluation of educational research using secondary data analysis, and the performance assessment of students and teachers.