
Case Studies of Evaluation Utilization in Gifted Education

CAROL TOMLINSON, LORI BLAND, TONYA MOON, and CAROLYN CALLAHAN

Curry School of Education, Department of Educational Studies, University of Virginia, 405 Emmet St., Charlottesville, VA 22903-2495

The findings do not reflect the positions or policies of the Office of Educational Research and Improvement or the United States Department of Education.

INTRODUCTION

Numerous reasons exist for evaluations, among them: improving the effectiveness of programs and program personnel, reducing uncertainties, assisting with decision-making and goal-setting, seeking justification for decisions, meeting legal requirements, fostering public relations, enhancing the professional stature of the evaluator or program administrator, boosting staff morale, mustering program support, and changing policy, law, or procedure (Alkin, 1980; Bissell, 1979; Mathis, 1980; Ostrander, Goldstein, & Hull, 1978; Raizen & Rossi, 1981). Nonetheless, the literature of education is replete with examples of evaluation findings which never resulted in program enhancement, improvement, or development. Disregard for the findings of educational evaluation is costly in effort, in monies, and in human terms when potential program improvements are stillborn (Datta, 1979; King, Thompson, & Pechman, 1981). Because of a general lack of public understanding of and support for programs for the gifted, and keen competition for scarce resources, the survival of programs for gifted learners may depend on carefully planned evaluations which yield useful information that educational decision makers can translate into documentation of effectiveness and into action to improve programs (Dettmer, 1985; Renzulli, 1984). Smith (1981) calls for developing more information about evaluation practice through use of both conceptual reviews and empirical study. A review of literature (Tomlinson, Bland, & Moon, 1993), including general evaluation utilization literature and literature relating to evaluation utilization in the field of gifted education, provided a conceptual base for the study reported in this article.


The review delineates factors affecting evaluation utilization, including internal and external factors, message source, message content, and message receiver. These factors served as an organizer for the investigation reported here, which uses empirical methods to compare evaluation designs and practices used in programs for the gifted in a variety of school districts. The result is a set of profiles of evaluation utilization of the kind that has been called for (Smith, 1981), but which has been scant in the literature of evaluation in general (examples of such studies are Dawson & D'Amico, 1985, and Mutschler, 1984) and absent in the literature of evaluation of programs for the gifted in particular. The research reported here is unique because of its use of multiple cases in an empirical study of evaluation utilization.

A study by Hunsaker and Callahan (1993) provided a conceptual framework for the study reported here. They described the current state of practice in evaluating programs for the gifted, and factors associated with "strong" and "weak" evaluation designs and practices in gifted programs. Hunsaker and Callahan described as weaker those reports which: 1) were disseminated solely to district administrators as opposed to broader stakeholder audiences; 2) failed to include recommendations for action; and 3) lacked apparent mechanisms for translating findings into action. Reports categorized as stronger were disseminated more broadly, included recommendations for action, and outlined mechanisms for translating findings into positive program change. Based on the Hunsaker and Callahan findings and categories, the current study sought to determine the degree to which representative evaluation reports from districts using stronger and weaker practices adhered to the utility standards outlined by the Joint Committee on Standards for Educational Evaluation (1981), and the degree to which reports from stronger and weaker districts were utilized for positive program change.

METHODS

Background and Selection of Sites for Study

Hunsaker and Callahan (1993) collected several hundred evaluation reports on programs for the gifted from educational databases, an appeal through professional journals, and direct mail requests to state-level gifted coordinators and over 5,000 school districts. While many of the reports received consisted only of program descriptions and/or evaluation instruments, 70 reports also contained evaluation plans and evaluation results. These 70 reports served as an initial pool of cases from which researchers in the current study selected sites for investigation of evaluation utilization.

In a first sort of reports from the initial pool of 70, Hunsaker and Callahan divided reports according to those giving no recommendations for program change, those giving recommendations, and those going beyond recommendations toward implementation by forming implementation committees, developing policies to support implementation, and implementing suggested changes. Those giving no recommendations were considered examples of "weak" practice, while those going beyond recommendations toward implementation were considered examples of "strong" practice. Within the two categories of "weak" and "strong" practice, Hunsaker and Callahan conducted a second sort according to the range of evaluation audiences of reports, applying the researchers' belief that dissemination to a broader range of audiences is more useful than dissemination to a narrower set of stakeholders. Reports highlighted by this sort were then arranged in chronological order, with the six most recently conducted evaluations in each of the "strong" and "weak" categories given preference for study, based on the pragmatic conclusion that the more recent the evaluation, the more valuable it would be for a case study: key personnel involved in the evaluation process would be more likely still to be available, and their recollection of events would be more complete.

Researchers used the 12 exemplar districts as sites for the current study. Six cases served as examples of "strong" practice, and six as examples of "weak" practice. The 12 represented great diversity in geography (mid-Atlantic, northeast, midwest, west coast), size (from a district with only three schools to a district with 179 schools), and program design (including differentiation in the regular classroom, pullout programs, schools within schools, separate classes, schoolwide enrichment models, or combinations of delivery systems). Three university researchers each interviewed persons from four school districts. One researcher worked with the four "strongest" districts and one with persons from the four "weakest" districts. A third researcher was blind to the strong/weak labeling throughout the interview process, in order to serve as a check on the method used to rate districts. This researcher interviewed two districts from the "strong" category and two from the "weak" category. Two of her districts were "weakest of the strong" and two were "strongest of the weak," creating in essence a "middle" category.

Definition

For purposes of this study, evaluation utility was defined as use of formative and/or summative evaluation information to affect a program for gifted learners in at least one of three ways: altering the ways in which program participants, evaluation audiences, and/or decision makers thought about the program; changing the decision-making process and/or decisions made by stakeholders in the program; or invoking some action regarding implementation of the program.

Data Collection and Analysis

Initial contact for this study was made by sending letters to school superintendents and contact persons in the 12 selected school districts, asking for cooperation in the study. Phone calls were then made to district contact persons to determine key participants in the evaluation process (e.g., program evaluators, coordinators of gifted programs, teachers in the gifted program, general classroom teachers) and to arrange for initial interviews. Additional respondents were also identified from evaluation reports or by initial interviewees as the study progressed. In two school districts, only one participant was available. In each of the others, between two and seven interviewees participated. Telephone interviews were conducted in two phases by three university researchers with training and experience in qualitative research and program evaluation. Initially, interviewers used a four-question interview protocol:

EVALUATION I’RACTKE, 15(2), 1994

1) Tell me about the process your district used to evaluate the gifted program.
2) What were the outcomes of the evaluation?
3) How did the evaluation process affect the thinking of district personnel about, or their planning for, the programs for the gifted?
4) How was information gathered through the evaluation process used?

Researchers asked followup questions to extend and clarify responses to the initial questions. A second round of interviews followed, with questions derived from the utility standards in the Standards for Evaluations of Educational Programs, Projects, and Materials (Joint Committee on Standards for Educational Evaluation, 1981) and with followup questions used to clarify informants' answers (see Figure 1). As interviews were conducted, summaries were sent to informants for verification or modification as necessary. Following all interviews and member checks, content analysis of interviews was conducted, with an informant's complete interview serving as a coding unit, and using pre-ordinate and emergent categories.

Figure 1. Round Two Interview Protocol

Audience Identification: When you planned your gifted program evaluation, did you involve particular groups in the planning process so they would be more aware of the program and its evaluation? If so, who, and how? When you planned your evaluation, did you talk about who might need or use the results? Can you give me some examples of such groups and how you planned the evaluations to ensure that findings would be useful to these groups?

Evaluator Credibility: What thoughts did you have about the qualifications or requirements of people who might plan or conduct or report findings of your gifted program evaluations?

Information Scope and Selection: What plans did you make for determining the questions you asked, whom you asked, and how much data you collected in evaluation of your program for the gifted?

Valuational Interpretation: How did you decide the ways in which you interpreted information collected? How did you share these methods of interpretation with others in your division or community?

Report Clarity: How did you report out your findings? (If there is a formal written or oral report:) What did you include in the report?

Report Dissemination: How did you decide who should be told about findings of the gifted program evaluation? Who was told?

Report Timeliness: What was the turnaround time between conducting the gifted program evaluation and sharing findings with people who received them?

Evaluation Impact: Can you describe ways in which you interacted with different groups in your district or community to encourage that action be taken as a result of your findings?

Other: What else do you feel we should know about processes, procedures, or issues which arise in your district as you evaluate programs for the gifted and use findings from these evaluations?

Pre-ordinate categories included factors suggested by the literature as impacting use of evaluation findings and factors suggested to be important in the related study referenced earlier (Hunsaker & Callahan, 1993). Emergent categories were those which were repeated within and among the interviews (e.g., informal evaluation, committee involvement, changes recommended, changes made). Information was aggregated first for the three interview categories separately (strong, blind, and weak). This resulted in separate profiles of, and factors distinguishing, "strong" and "weak" districts related to their use of evaluation findings. A district's evaluation documents were reviewed prior to interviews in order to develop a basis for followup questions, and following interviews for triangulation of information with the interviews. Additional triangulation was obtained by interviewing several people in most school districts and by interviewing several districts in both the "strong" and "weak" categories.

RESULTS

Commonalities among the Groups

There are three points which should be made regarding commonalities among the three groups.

1. It is important to note that the "blind" group did, indeed, serve as a check and verification that the sorting process described earlier accurately delineated districts with weaker evaluation reports which differed in marked ways from districts with stronger evaluation reports. That is, the case which was "strongest of the weak" produced a profile much more like that of the weaker group than of the stronger, while the case which was "weakest of the strong" appeared more like the stronger group than the weaker. (Typical profiles will be presented later.) Perhaps coincidentally and perhaps not, the two districts nearest the middle of the 12 "exchanged positions" during the course of the study. This phenomenon will be discussed later.

2. It is also important to note areas of kinship shared by all 12 districts studied. All 12 showed an interest in evaluation of gifted programs, as indicated by their submission of evaluation reports and by their willingness to participate in the interview process. This conclusion is substantiated by another commonality: all 12 districts did have some sort of plan to evaluate programs for the gifted. Thus, while the reports and procedures are discussed in terms of "weak" and "strong," it is likely that even the "weak" districts are ahead of the game in the evaluation of gifted programs when compared with the many districts which have no systematic intent or plan to evaluate.

3. A third commonality was unexpected. Our assumption was that districts using weaker evaluation practices would exhibit little, if any, use of evaluation information. In fact, however, all 12 districts used the information gathered through evaluation to bring about some level of change in programming. It cannot, therefore, be concluded that evaluation utility was absent in the weaker districts and present in the stronger ones.


What the study revealed was a continuum of evaluation processes and procedures, yielding a continuum of utility results (Tomlinson, Bland, & Callahan, 1992), and distinct profiles of stronger and weaker districts related to the Joint Committee utility standards.

Profiles of Weaker and Stronger Districts

Factors along which continua of evaluation practice developed among the districts studied were:

1) purposes for evaluation (with stronger districts exhibiting more policy-driven evaluations);
2) methods of evaluation and data analysis (with stronger districts emphasizing both process and outcome data, a broader range of outcome documentation, both qualitative and quantitative data analysis, and more sophisticated data analysis);
3) implementation plans (with stronger districts using more specific, multifaceted, and institutionalized implementation processes and procedures to ensure use of results);
4) evaluation reports (with stronger districts using a more formal reporting format, written in varied forms based upon the needs of multiple audiences);
5) participants in the evaluation process (with stronger districts using a greater number of human data sources, a greater variety of representatives from the data sources, and a greater variety of roles for participants in the evaluation process);
6) qualifications of program personnel (with stronger districts more likely to involve a staff member or volunteer trained or experienced in gifted education or evaluation, and more likely to have cooperative working relationships between experts in the two fields); and
7) nature of the change resulting from the evaluation process (with stronger districts tending to have results more focused on specific program elements rather than more general or global in nature) (see Figure 2).

Profiles of weaker and stronger districts presented here were amalgamated by abstracting characteristics from the continua in order to construct profiles of typical districts. Doing so enables comparison of the impact of the evaluation process in weaker and stronger settings. Quotations used in the amalgamated profiles are taken directly from interviews with informants in the weaker and stronger districts.

Profile of a "Weaker" Evaluation Process

The coordinator of programs for the gifted in the school district may be new in her job, and the current program for gifted students may be new as well. She wants to know "whether the program works," and in addition, she has a sense that she is accountable for what is happening in the program. This will require some sort of documentation, probably an evaluation. A procedure for evaluating will evolve, but not a strong policy to direct evaluation. "Lack of support and funding (for evaluation) are real problems."

Figure 2. Factors Delineating Profiles of Weaker and Stronger Evaluation Practices

Evaluation Purpose
- District A (weak evaluation practices): new gifted coordinator needs to know how the program is functioning
- District B (strong evaluation practices): district policy to evaluate all programs

Method of Evaluation
- District A: Likert-scale questionnaire directed at parents measuring satisfaction with the program
- District B: Likert-scale and open-ended questionnaire directed at parents, students, teachers, and administrators measuring satisfaction with the program; achievement test data from students; focus group interviews with parents, students, and teachers; analysis of program documents and curriculum

Data Analysis
- District A: tally of responses
- District B: descriptive statistics; content analysis; inferential statistics; document analysis

Implementation Plan
- District A: none exists
- District B: recommendations provided; goal setting based upon recommendations; board action; policy development; development of a plan; procedures and resources for implementation; timeline for tasks; directions for further study based upon recommendations

Evaluation Report
- District A: program description, questionnaire, and tally of responses
- District B: executive summary for teachers; formal report for administrators and school board; dissemination via newsletter to parents and other program stakeholders

Stakeholder Participation
- District A: data sources (see methods section); committee consists of gifted coordinator, teachers of the gifted, and building principals; roles: data source, survey design
- District B: greater number of human data sources (see method and analysis section); representatives from data sources participate as members of the evaluation committee: parents, program teachers, regular classroom teachers, parents of students not enrolled in the program, school board members, administrators, program coordinators; roles for participants: data source, evaluation committee team member (including evaluation design, data collection, and dissemination), implementation team member (including planning, implementation, evaluation)

Qualifications of Personnel
- District A: program coordinator has training or experience in gifted education
- District B: staff member trained and experienced in gifted education; staff member trained and experienced in evaluation; cooperative relationship between the two fields

Nature of the Change
- District A: evaluation goals not stated, therefore changes not tied to goals ("random"); nature of services was changed to better meet the needs of students; additional program resources secured; staff development provided to clarify misconceptions; schedules and other program elements were changed to assist general instruction in the school; information on the program provided to parents
- District B: goals are focused on specific program elements ("systematic"); results tied to evaluation goals; nature of services was changed to better meet the needs of identified students; additional staff provided to assist with meeting the needs of the students and the parameters of the new program initiatives; additional resources provided by the school board to enact those changes; staff development implemented to prepare for the change in the services provided

There seem to be two approaches to deciding what to do next: either "doing what they did last year," or "winging it." Feeling that it would be better for several individuals to be involved in the process, the coordinator "forms a committee." "Committee members include representatives of teachers of the gifted, coordinators, principals," and perhaps parents or school board members. After several meetings with committee members, questionnaires are developed "to address concerns." Most are Likert-type surveys "with a few open-ended questions." It is perceived to be advantageous if the form is short and the questions few. "Questionnaires are distributed to cooperating teachers, students, and parents." The coordinator herself distributes the surveys, collects them, and analyzes results by "tabulating frequencies and percentages, and noting every comment that was made."
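The tallying described here is straightforward to make concrete. As a rough Python sketch only (not drawn from the study; the survey item, scale labels, and responses below are hypothetical), frequencies and percentages for one Likert-type item might be tabulated like this:

    from collections import Counter

    # Hypothetical responses to one Likert-type item, e.g.,
    # "The gifted program meets my child's needs."
    # Scale: 1 = strongly disagree ... 5 = strongly agree
    responses = [5, 4, 4, 3, 5, 2, 4, 5, 3, 4, 1, 5, 4]

    counts = Counter(responses)
    total = len(responses)

    print("Rating  Frequency  Percent")
    for rating in range(1, 6):
        freq = counts.get(rating, 0)
        print(f"{rating:>6}  {freq:>9}  {freq / total:>6.1%}")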

Within a month or two of administering the survey, the coordinator shares "results with committee members for discussion about recommendations on program improvement or development." "The information is then shared with the superintendent who, in turn, informs the school board of the additional opportunities for students" which could come about as a result of the evaluation.

Evaluation findings in a weaker district tend to be less directed by evaluation goals than to stem from a more general evaluation focus. That is, the district generates a set of questions designed to "see what people are thinking," and often uses the same set of questions year after year. Nonetheless, findings result in program change. "We saw that regular classroom teachers were unclear about goals and contents of our special classes for gifted learners, so we provided additional information for teachers and we made sure students in our resource rooms knew how to tell their regular classroom teachers what kinds of learning took place in there." "Afterward, it appeared there was greater clarity among regular classroom teachers about the special classes." "We found out from our surveys that our resource room schedule resulted in identified students reentering their regular classrooms at a point in the regular classroom schedule that interrupted instruction for their classmates and teachers and generally made things awkward for everybody." This finding resulted in a shifting of the resource room schedule "so that it more closely matched the regular classroom schedule." "We could see that identified students continued receiving special services even in the absence of indicators that the students were achieving in the program." As a result, "students now have to show a pattern of achievement to stay within the program."

Profile of a "Stronger" Evaluation Process

In this district, the coordinator of gifted programs has filled her current position for some time. She is aware of the political mandate for evaluation which exists in her district for programs for the gifted, as it does "for all other programs with a curriculum." There is a policy which both requires and supports evaluation. She also understands the power of evaluation to improve the program and "to build awareness of and support for what we are doing." "We work hard to look at ourselves honestly," she says. "We realize when we need to change, and that is healthy." She is also aware of the role of the evaluation in the political process: "politically, evaluation findings allow support to be built for programs."

Here, evaluation is an ongoing and multifaceted process. "There is formative evaluation of everything specialists do in the classroom with general teachers." "The teachers tell us what is working and what we can modify. In the process, they also come to understand our goals better too." And there are feedback sheets on "how teachers feel about administration of the testing program we are in charge of" "to assist us with the management of testing." "We are very diligent in following through with findings." "There is at least one kind of survey every semester: periodic surveys of building principals, students, and teachers in that school." There are "standard, self-monitoring devices in place in schools" and staff there with enduring responsibility for interpreting findings to building personnel as they relate to that school.

There is a team of district professionals who can collaborate on evaluation procedures: at times members of the gifted education staff with strong credentials in evaluation as well, at times a partnership between a district evaluation department and members of the gifted education staff. In smaller districts, the expert in evaluation might be a teacher or a building-level administrator with an advanced degree which led to training and/or experience in evaluation. Thus, while one person assumes responsibility for the evaluation process as it relates to gifted education, it is a leadership responsibility, not a sole responsibility. There is a steering committee for gifted programs which plays a key role in evaluation, but there are other groups and committees which are engaged in the process as well. "We don't want to rely just on one source."

There is also a clear awareness of the varied stakeholders in the district. Stakeholders are a part of evaluation planning, execution, and followup. Stakeholder committees assist in determining specific program areas to be studied and propose questions whose answers could be valuable in providing program support. "We want them to have all the information they need..." "...to understand what we are about..." "...to keep them apprised of findings so there are no surprises in the end..." "...so they will buy into the evaluation..." and "...support program changes which follow." When findings are generated, they are brought back to stakeholder committees "first orally, and then in preliminary reports..." "...to give the stakeholders a chance to see whether the findings made sense and to determine if the recommendations are feasible."

In addition to process evaluation, the district employs outcome indicators. "The school board pays some attention to achievement data." "Recently, we conducted a panel study comparing test data for all students. In our self-contained program, all scores went up, which is amazing given the likelihood of regression to the mean. There was also strong evidence that these programs were benefiting minority achievement." "We have begun using portfolios as a means of assessing the impact of the critical and creative thinking components in our program." From time to time, external evaluations of the program are conducted. "There is a built-in suspicion that if the gifted/talented staff is conducting all the evaluations, they can't be really legitimate." "A few years ago there was a huge external evaluation with university support to set a future direction for our gifted programs. The process was useful and we have built steadily on its findings."

Data analysis is done with appropriate technical support and with qualitative and/or quantitative methods appropriate to the questions asked and the evaluation formats used. A final, formal report is released, on a preset timeline, to appropriate groups including stakeholders, school board, and staff, frequently with report summaries available for news media and parent groups. The formal report is written in a format similar to that of a research study, with appropriate data tables and accompanying explanations. A standard part of the report is an implementation section "outlining what is to be done as a result of the evaluation findings, who has oversight responsibility for the new plans, and a timeline for completion." There is also a plan in place "to monitor next year how we've done with our commitment."

Program changes which result from evaluation are tied specifically to evaluation goals. A goal to examine a perception that a high dropout rate existed among gifted high school students led to data analysis indicating that the perception was inaccurate. As a result of the evaluation and data analysis, "information was provided to stakeholders to address the misperception." Resulting from an evaluation goal to examine identification and participation of culturally diverse students in the program, "identification procedures were revised to facilitate proportional representation of minorities in programs for the gifted, and strategies were implemented to improve achievement for culturally diverse students." An evaluation goal of assessing instructional effectiveness indicated a need to provide support for teachers attempting to meet the needs of identified gifted learners. Thus, "a position was created to provide assistance to teachers."
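The informant's point about regression to the mean rewards a brief unpacking: students selected partly on the basis of high test scores will, on average, score somewhat lower on a retest even when nothing about them has changed, so group-wide gains in a selected population are stronger evidence than they might first appear. A minimal illustrative simulation (hypothetical numbers, not data from the study) shows the effect:

    import random

    random.seed(1)

    # Simulate test scores as true ability plus measurement noise.
    abilities = [random.gauss(100, 15) for _ in range(10000)]
    test1 = [a + random.gauss(0, 10) for a in abilities]
    test2 = [a + random.gauss(0, 10) for a in abilities]

    # Select the top scorers on test 1 (analogous to identifying
    # students for a gifted program partly by test score).
    pairs = sorted(zip(test1, test2), reverse=True)[:500]

    mean1 = sum(t1 for t1, _ in pairs) / len(pairs)
    mean2 = sum(t2 for _, t2 in pairs) / len(pairs)

    # The selected group's mean drops on the retest even though the
    # underlying abilities are unchanged: pure regression to the mean.
    print(f"Selected group, test 1 mean: {mean1:.1f}")
    print(f"Selected group, test 2 mean: {mean2:.1f}")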


While both "strong" and "weak" evaluations produced findings which were translated into positive program change, in the stronger districts the way in which findings were generated and acted upon tended to be more linear or systematic than random. That is, a question is asked, an answer found, plans made for appropriate changes, policy makers lobbied, staff training provided, and so forth. In the following evaluation, there is followup to determine the effectiveness of the change, and new questions are generated as appropriate.

A Tale from the Blind Group

It was at least symbolic that the two districts directly in the middle of the ranking of the 12 "changed places" as the study unfolded. The one which had been ranked "strongest among the weak" had clearly moved up in the world since its original materials had been received. A new coordinator had come aboard, one who used terms like "portfolio assessment" and "outcome-based evaluation." She was moving away from sole use of attitude surveys. "We need to look at performance and program benefits in achievement instead of just whether parents, students, and teachers like the program." She has used the drawings of primary students to study attitude changes about science and scientists in youngsters who have participated in a magnet program where they work directly with scientists, compared with youngsters who have not had that opportunity. She is working to integrate some evaluation components of services for gifted learners into the evaluation processes of individual schools, and she talks about working with other administrators and board members, as well as about using evaluation data showing a gap "between predicted and actual test scores of gifted students for action at both local and state levels."

In the district which was initially classified as "weakest among the strong," there was a clear backslide. In this setting, there had once been a coordinator of gifted programs who worked with a strong and knowledgeable planning committee on the district-mandated evaluation process. "Two people (building-level administrators) who worked on the committee had Ph.D.s (in evaluation), and the other (a teacher) was working on one." "There were also consultants involved in developing the evaluation processes and procedures." From both oral reports and evaluation documents, the evaluation system was judged to be effective.

At some point, staff assignments changed, and the new coordinator inherited and elected to maintain the previous evaluation design. Talking about the evaluation plan, she explained that she "wasn't quite sure how decisions were made regarding questions to be asked in the evaluation process." "The chief audience for the evaluation findings was the Gifted and Talented Planning Committee." "Principals were also given results of the evaluation by school and helped to analyze them." "Principals who had preconceptions probably didn't change as a result of the meetings, but those who were open to suggestions and wanted to listen were helped to make changes." "Ultimately these meetings were instrumental in leading to a shift in the program model used for the district's gifted program." "There was no systematic followup on these meetings to see whether plans had been executed."

At this point, the "new" coordinator has moved on. A new program has been put in place "based on evaluation findings." The "school board has adopted the new program, but not funded it." "There is no evaluation procedure in place for the new program," "...and there is no staff to work on evaluation."


"Regular classroom teachers are supposed to assume responsibility for the (new) model as well as their own assignments. It makes their attitude toward the program negative. There is no acknowledgement of what they are doing." Interviewees from this district suggested that the evaluation procedure had become institutionalized and potent during the tenure of the administrator who led in its development. Left to the supervision of someone lacking skill in programming for gifted learners, in evaluation, and in the politics of the district, the evaluation process continued somewhat like a driverless vehicle. It amassed information, but without leadership in developing implementation plans and in interpreting the need for proposed changes to varied stakeholder groups, the information was used to make decisions detrimental to participants in the very program which the evaluation was designed to strengthen.

A Cross-Group Comparison

It appears that the great difference between those school districts categorized as having weaker evaluation reports and those having stronger ones lies in sharply contrasting levels of awareness of the need for evaluation and of support for the evaluation process. The intent to evaluate, and to do so to the best of one's capacity, exists in both settings, and, in fact, there are indications of success in both groups as measured by positive program changes which arise from evaluation findings. In the stronger settings, those in charge of the evaluation process understand the value of evaluation as a field of study, and evaluation is a prescribed feature of the planning cycle for all programs. Leaders may use vocabulary like "stakeholders," "formative evaluation," "outcome indicators," and "chi-square." They understand the peculiar pitfalls of measuring academic growth in students who top out on tests, and they can discuss the use of portfolios, comparison of achievement and aptitude scores, and regression to the mean. They have a level of political sophistication which helps them to see both a need and a means for building networks of support, through evaluation processes, for the programs which they administer. Further, they seek out technical and collegial support in the evaluation process, a reality which further enhances the range and potency of the evaluation.

By contrast, coordinators in the districts categorized as weaker sense a need to know "how things are going," and they use the only tool at their disposal: common sense. They work alone (or perceive that they do), then join forces with others via committee, gaining a sense of partnership and feeling reinforced in their common-sense strategies. Evaluation is seen as worthwhile, but proceeds more as a reactive than a proactive process.

The study does not indicate that the number of personnel or the level of fiscal resources is the sole determiner of the difference between "weak" and "strong" districts in regard to use of evaluation findings. In fact, both the "strong" and "weak" groups contained small school districts (with fewer administrative staff members) and larger ones (with more administrative staff members). Both categories also contained urban (relatively poorer) and suburban (relatively more affluent) districts. Stronger school districts tended to be those in which a leader in the evaluation process understood the need to involve persons with evaluation expertise and drew on that expertise by establishing partnerships with other administrators when possible, and by placing teachers and/or community members knowledgeable about evaluation on the steering committee when administrative partnerships were unavailable. It was not the case that stronger districts were strong in evaluation by virtue of fiscal superiority, but rather because a decision was made to allocate available fiscal and personnel resources to evaluation.


In these districts, evaluation was an integral component of the job descriptions of program coordinators, upper-level administrators conveyed both expectations for and support of evaluation to program coordinators, office support was provided for conducting evaluations and disseminating findings, and there was an institutional expectation that evaluation would result in positive program development. In districts using weaker evaluation practices, there was little or no apparent institutional valuing of evaluation. Rather, evaluation was more likely perceived as a process to be feared because it might cause problems for the programs being evaluated. Not surprisingly, therefore, substantial support for evaluation (in financing, encouragement, time, and personnel) was lacking in the weaker districts.

Decisive Factors in Use of Findings

This study indicates two key factors which promote use of evaluation findings in the districts studied: will and skill. It appears that the will to evaluate on the part of some key personnel in a district, supplemented with systematic procedures and resources for doing so, results in the generation of evaluation findings and the translation of those findings into program change. This will to evaluate existed in both the weaker and stronger districts studied. The second factor, skill in evaluation and related processes, appears to be the demarcation between the two categories of districts, and it affects the robustness of program change stemming from evaluation findings. Utilization appeared more likely, and changes from the findings more potent and systemic, in direct relationship to the following conditions:

1. Evaluation of programs for the gifted was a part of a district-wide policy requiring routine evaluation for all program areas.
2. Systematic written plans were in place delineating steps and procedures for ensuring implementation of findings.
3. Multiple stakeholders were consistently involved in planning, monitoring, and reviewing the evaluation process and its findings.
4. Stakeholders played an active role in planning for, and advocating before policy makers for, program change based on evaluation findings.
5. Key program personnel were knowledgeable about gifted education, evaluation, the political processes in their districts, and the interconnectedness of the three. In instances where program administrators with expertise in both gifted education and evaluation were not available, leaders involved volunteer steering committee members with such expertise.

DISCUSSION AND RECOMMENDATIONS

The study reported here is qualitative and thus does not claim broad generalizability. Nonetheless, findings from these cases should be useful for extrapolation, both in informing further research about evaluation utility and in examining evaluation utility in a variety of settings, including but not limited to programs for gifted learners (Cronbach, Ambron, Dornbusch, Hess, Hornik, Phillips, Walker, & Weiner, 1980). The cases studied support key assumptions in the theoretical literature regarding factors affecting evaluation utilization, in that the Joint Committee Standards (1981) were more closely adhered to in the school districts with stronger evaluation reports.


The study does not support an earlier finding noted in our literature review (Tomlinson et al., 1993) that stakeholders are less likely to agree with evaluation reports which they believe were written by female evaluators, or by researchers as opposed to evaluators or content specialists (Braskamp, Brown, & Newman, 1981). Evaluations in both stronger and weaker districts were conducted, written, and presented by both males and females.

The cases also indicate that where the intent to evaluate programs for the gifted exists, some form of evaluation is likely to evolve. Even when such evaluation schemes are relatively weak, at least in comparison to evaluation reports that closely follow utility standards such as those developed by the Joint Committee (1981), utilization of evaluation findings can and does occur in ways that result in positive program change, at least in the short term. It appears a reasonable hypothesis that, over time, program quality and support would be more positively affected through use of systematic evaluation processes and procedures than through use of random ones. Additional study is necessary to make that determination.

Generally more robust evaluation designs and procedures appear to evolve when responsible personnel have specific training in evaluation, in gifted education, and in the problems of evaluating programs for the gifted (e.g., the evaluation challenges created by programs which are long-term, individualized, complex, and poorly measured by traditional standardized means (Tomlinson et al., 1993)), and/or when they have support in the way of policy expectations and well-trained colleagues or volunteers. Such program personnel thus have access to vocabulary, procedures, and a level of political sophistication which enable them to maximize the capacity of evaluation both to chart program growth and to amass program support, including economic support for the gifted program. Datta (1989) points to lack of economic resources as a major factor in reduced evaluation efforts in education. The current study indicates that while such resources facilitate evaluation of programs for the gifted, they are not a sole determiner of evaluation effectiveness or evaluation utility in a district.

The clearest need emerging from the study is for training of program personnel in evaluation. Further, there is a need to apply appropriately a full range of evaluation methodology to the problems presented in assessment of student growth in programs for the gifted. Even many of the "strong" districts showed only fledgling movement in the direction of experimental design to demonstrate student growth (Beggs, Mouw, & Barton, 1989; Callahan, 1983; Carter, 1986; Payne & Brown, 1982), and few appear to have tapped the range of possibilities of qualitative design for evaluating programs for the gifted (Janesick, 1989; Lundsteen, 1987). Certainly the weaker districts need personnel or volunteers with knowledge of the value of evaluation and of how to employ varied data collection modes (Gilberg, 1983; Janesick, 1989; Rimm, 1982); how to address the concerns of both internal and external audiences by asking questions which are relevant, useful, and important, and which will thus directly facilitate positive and powerful decision making (Callahan, 1986); how to identify decision makers at various levels, as well as the actions over which they have control (Callahan, 1986; Dettmer, 1985; Gilberg, 1983; Renzulli, 1984; Rimm, 1982); and how to find out what course of action will result from the data supplied, as well as how to make recommendations with an eye toward program improvement (Gilberg, 1983). To function at a lesser state is to compromise the positive possibilities of evaluation.

ACKNOWLEDGMENTS

The work reported herein was sponsored by The National Research Center on the Gifted and Talented under the Jacob K. Javits Gifted and Talented Students Education Act (Grant No. R206R00001) and administered by the Office of Educational Research and Improvement and the United States Department of Education.

REFERENCES

Alkin, M. (1980). Uses and users of evaluation. In E. L. Baker (Ed.), Evaluating federal education programs (Report No. CSE-R-153; ERIC No. ED 205 599) (pp. 39-52). Los Angeles, CA: University of California at Los Angeles, Center for the Study of Evaluation.

Beggs, D., Mouw, J., & Barton, J. (1989). Evaluating gifted programs: Documenting individual and programmatic outcomes. Roeper Review, 12, 73-76.

Bissell, J. (1979). Program impact evaluations: An introduction for managers of Title VII projects. A draft guidebook (ERIC No. ED 209 301). Los Alamitos, CA: Southwest Regional Laboratory for Educational Research and Development.

Braskamp, L., Brown, R., & Newman, D. (1981). Studying evaluation utilization through simulations. Evaluation Review, 6, 114-126.

Callahan, C. (1983). Issues in evaluating programs for the gifted. Gifted Child Quarterly, 27, 3-7.

Callahan, C. (1986). Asking the right questions: The central issue in evaluating programs for the gifted and talented. Gifted Child Quarterly, 30, 38-42.

Carter, K. (1986). Evaluation design: Issues confronting evaluators of gifted programs. Gifted Child Quarterly, 30, 88-92.

Cronbach, L., Ambron, S., Dornbusch, S., Hess, R., Hornik, R., Phillips, D., Walker, D., & Weiner, S. (1980). Toward reform of program evaluation. San Francisco, CA: Jossey-Bass.

Datta, L. (1979). O thou that bringest the tidings to lions: Reporting the findings of educational evaluations. Paper presented at the annual Johns Hopkins University National Symposium on Educational Research, Baltimore, MD.

Datta, L. (1989). Education information: Production and quality deserve increased attention. Statement before the Subcommittee on Government Information and Regulation, Committee on Government Affairs, U.S. Senate. Washington, DC: General Accounting Office.

Dawson, J., & D'Amico, J. (1985). Involving program staff in evaluation studies: A strategy for increasing information use and enriching the data base. Evaluation Review, 9, 173-188.

Dettmer, P. (1985). Gifted program scope, structure and evaluation. Roeper Review, 7, 146-152.

Gilberg, J. (1983). Formative evaluation of gifted and talented programs. Roeper Review, 6, 43-44.

Hunsaker, S., & Callahan, C. (1993). Evaluation of gifted programs: Current practice. Journal for the Education of the Gifted, 16, 190-200.

Janesick, V. (1989). Stages of developing a qualitative evaluation plan for a regional high school of excellence in upstate New York. Paper presented at the American Evaluation Association, San Francisco, CA.

Joint Committee on Standards for Educational Evaluation. (1981). Standards for evaluations of educational programs, projects, and materials. New York: McGraw-Hill.

King, J., Thompson, B., & Pechman, E. (1981). Improving evaluation use in local school settings (NIE-G-80-0082; ERIC No. ED 214 998). Washington, DC: National Institute of Education.

Lundsteen, S. (1987). Qualitative assessment of gifted education. Gifted Child Quarterly, 31, 25-29.

Mathis, W. (1980, April). Evaluating: The policy implications. Paper presented at the American Educational Research Association, Boston, MA.

Mutschler, E. (1984). Evaluating practice: A study of research utilization by practitioners. Social Work, 29, 332-337.

Ostrander, S., Goldstein, P., & Hull, D. (1978). Toward overcoming problems in evaluation research: A beginning perspective on power. Evaluation and Program Planning, 1, 187-193.

Payne, D., & Brown, C. (1982). The use and abuse of control groups in program evaluation. Roeper Review, 5, 11-14.

Raizen, S., & Rossi, P. (Eds.). (1981). Program evaluation in education: When? How? To what ends? (Report No. ISBN-0-309-03143-5; ERIC No. ED 205 614). Washington, DC: National Academy Press.

Renzulli, J. (1984). Evaluating programs for the gifted: Four questions about the larger issues. Gifted Education International, 2, 83-87.

Rimm, S. (1982). Evaluation of gifted programs: As easy as ABC. Roeper Review, 5, 8-11.

Smith, N. (1981). Evaluation studies: Evaluating evaluation methods. Studies in Educational Evaluation, 7, 173-181.

Tomlinson, C., Bland, L., & Callahan, C. (1992, November). Use of evaluation findings in programs for the gifted. Paper presented at the meeting of the National Association for Gifted Children, Los Angeles, CA.

Tomlinson, C., Bland, L., & Moon, T. (1993). Evaluation utilization: A review of the literature with implications for gifted education. Journal for the Education of the Gifted, 16, 171-189.