Some Observations About the Institutionalization of Evaluation

BLAINE R. WORTHEN

INTRODUCTION

The thesis of this paper is straightforward. Evaluation has become institutionalized at state and local levels to a much greater degree than is commonly understood. To support that assertion, two interrelated topics are discussed in this paper. First, trends across the last 30 years in state and local level program evaluation are discussed, with special attention to (a) their ebb and flow as they relate to federal evaluation mandates, and (b) whether these trends have indeed resulted in the locus for initiating evaluations shifting away from federal agencies to state and local levels. Second, evidence is presented that suggests that evaluation plays an increasingly important informational role as the level of the evaluation becomes more local, while evaluations at national levels typically continue to serve symbolic, noninformational functions.

The conclusions offered depend as much on experience and observation over several decades of evaluation practice as they do on empirical evidence. Also, the inferences and data provided are drawn primarily from education, the field I know best. From frequent forays further afield to do evaluations or meta-evaluations for social service agencies, municipalities, foundations, corporations, and even state highway departments, I expect the trends and conclusions discussed below will generalize beyond education to many other state and local agencies, both public and private. But it is left to the reader to make such extrapolations, to correct any erroneous conclusions, or to provide additional data on how well those conclusions apply in other contexts.

TRENDS IN THE LEVEL AT WHICH EVALUATION IS INITIATED

When I made my first foray into educational evaluation some 26 years ago, that three-year-old field was growing with all the energy (and innocence) of a rambunctious toddler.


Though destined to be shaped and reshaped by its environment over the next decades, its original parentage was clear. Although a few thoughtful analysts (e.g., Cronbach, 1963) had written about evaluation before 1965, it still existed, at best, as a gleam in the eye of a small number of educators. Evaluation as a field was spawned and nurtured during its youth by the evaluation mandates built into the Elementary and Secondary Education Act (ESEA) of 1965. Those old enough to remember that period will recall when the term "educational evaluation" was synonymous with the requirement that educators be accountable for federal funds received under the ESEA's Title I and Title III.

Although theorists began almost immediately to suggest approaches for evaluating general school curricula and programs, the actual practice of program evaluation remained for at least a decade focused on evaluating discrete federally funded projects or federal programs consisting of aggregations of local projects. During this period (1965-1975), in which evaluation was growing to adolescence, both its conceptual and structural development were influenced by its origins. Its theories and philosophy, its proposed designs and instruments, and even its training of personnel were tilted more toward evaluating discrete, specially funded projects than toward evaluation of broad outcomes of curriculum and instruction (Worthen & Sanders, 1991). Given such origins, it is not surprising that most evaluators practicing between 1965 and 1975 accumulated more experience in evaluating specially funded education projects than any other type of entity.

During the early 1970s, however, evaluation problems and needs quickly outstripped the somewhat simplistic and monolithic solutions that had emerged during evaluation's childhood. The later 1970s saw a field in which diversification, pluralism, and multiple conceptualizations of evaluation proliferated. Increased involvement of evaluators from fields outside of education produced a wider variety of evaluation methods and techniques, many of which were better suited to evaluating educational systems and programs more broadly than had been the case when evaluation methods appeared to be tailored for Title I or Title IV¹ projects. Evaluation was a hustling, thriving enterprise, and both programs for training evaluators and positions for evaluators were burgeoning. New professional associations for evaluation (including those that were precursors of AEA) arose, and the field of evaluation appeared to be on its way to becoming a full-blown profession, complete with its own journals, standards, and professional reference groups (House, 1990; Patton, 1988; Worthen, 1994; Worthen & Sanders, 1991).

Then Ronald Reagan's shadow fell over the field. Federal evaluation mandates were quietly remanded, as the "new federalism" resulted in simultaneous reductions in federal funding for education and in federal control over how states and school districts expended the federal funds they received. Titles I and IV were replaced by the 1981 Education Consolidation and Improvement Act's Chapter 1 and other block grants to states. These grants were largely devoid of evaluation requirements.

The common wisdom among most analysts during the early 1980s was that state and local education agencies, hard pressed for operational funds for their schools, would use the monies from these grants to buy chalk, patch broken windows, or put another person on drug patrol in the schoolyard, rather than to continue the potentially prickly practice of evaluation now that federal mandates had been lifted. Many evaluators privately predicted that evaluation would be one of the major casualties of the Reagan administration, for the thought that school districts and state offices of education would continue evaluations without federal mandates seemed highly improbable to those who had seen little or no prior evidence of such a tendency.


By 1982, these pessimistic prophecies seemed to many observers to be proving accurate. Governmental monitoring of categorical funding programs was drastically reduced. Individual evaluators and evaluation job shops accustomed to depending on contracted evaluations of federal programs for their salaries or consulting income found this source drying up. Before long, despondency and pessimism began to spread across evaluation's landscape, and the early and middle 1980s were marked by a series of depressing and introspective evaluation conference sessions that focused on the perceived decline of evaluation. Some evaluation trainers at universities began to question the ethics of training evaluation neophytes for roles thought to be in diminishing demand. For a time, it appeared to many observers that evaluation's bubble had burst, and program evaluation seemed destined to be relegated to the graveyard of promising societal endeavors that had failed to fulfill their potential.

Strangely, this pessimistic view was slow to reach the Rocky Mountain region where several of my evaluation colleagues and I most frequently ply our trade. Indeed, the early 1980s had brought us a stronger surge of evaluation business than ever before. School districts and state agencies faced with shrinking resources seemed to be asking for more, not less, evaluation to help them find answers. Moreover, we discovered that grass-roots evaluation was also expanding in many other local and state education agencies across the country. Against that backdrop, I found it difficult to fathom why so many of our colleagues were uttering pessimistic prognostications that were funereal in tone. In my mind, evaluators were in a situation similar to that of Mark Twain, who, upon reading his own erroneously printed obituary, sent the editor his famous retort, "The rumors of my death are greatly exaggerated."

Perhaps recounting one event may better illustrate the point. In 1985, I was invited to a "woe is me" evaluation meeting at which all the presentations were to address issues surrounding "The Death of Evaluation."

My colleague, Jim Sanders, and I had both wondered why some well-known evaluation practitioners were lamenting the fact that their evaluation contracts had declined dramatically, even while our contracts and those of other evaluators were doubling or tripling each year. Then, with this conference's pessimistic presentations, ... the obvious answer came. Our disgruntled colleagues had been existing primarily on federal evaluation contracts, which had been cut drastically. We, however, had for years served primarily local and state education agencies, universities, foundations, and an occasional business or industrial concern. Two realizations occurred to us. First, our colleagues had viewed evaluation as drying up only because they had thrust their snouts so deeply in the federal trough that they had come to view it as the only watering hole around, overlooking all the other educational streams and ponds that provided sustenance for evaluators. And that was the second realization: the federal cutback in evaluation had little effect on the appetite for evaluation in local and state educational agencies, foundations, and the like. When we counted, we found that nearly 90 percent of the evaluation work we were doing wasn't mandated by anyone; the agencies had simply recognized the need and built evaluation into their budgets without outside compulsion. They had come to believe in evaluation, even without federal evaluation mandates. Evaluation had become institutionalized. (Worthen & Sanders, 1991, p. 7)

What we said then seems patently more true today.

TREND DATA FROM ONE EVALUATION CONTRACTING AGENCY

TABLE 1
Externally Required vs. Self-Initiated Educational Evaluations Conducted by One Evaluation Contracting Agency: By Time Periods

Locus of Decision      1973-1982      1982-85       1985-94        Totals
to Evaluate             N     %        N     %       N     %        N     %
External               29    74*       3    23       3     5       35    31
Self-Initiated         10    26       10    77      57    95       77    69
Totals                 39   100       13   100      60   100      112   100

Note: * Percentages in this table total by columns only.

So far the discussion has been more impressionistic than empirical, but those impressions have been forged in the crucible of conducting contracted evaluations across nearly three decades. Recently I examined data drawn from the institutional experiences of the evaluation contract agency with which I have been associated for the past 21 years. Although the experience of one "evaluation shop" is not a very broad sample of agencies, it does reflect nearly 200 evaluation contracts in 38 states, 112 of which were evaluations of educational programs.

In the analysis summarized in Table 1, all the educational evaluation contracts conducted by the Western Institute for Research and Evaluation (WIRE)² during the past 21 years were categorized on two dimensions. The first was whether the evaluation was (a) required as a condition of funding received by the agency operating the evaluand,³ or (b) initiated by that agency without any external evaluation mandate.⁴ The second dimension was whether the evaluation was conducted (a) before 1982 (the year when the federal shift away from educational program evaluation mandates began to impact the field), (b) between 1982 and 1985 (the year by which most federal evaluation mandates had been phased out), or (c) during the relatively mandate-free period after 1985.

The results are instructive. Obviously one could argue that the divisions between time periods could be adjusted by a year or so, but doing so would not significantly alter the overwhelming trend in this summary of WIRE's evaluations. While the great majority of the evaluation studies conducted prior to 1982 were required to comply with external mandates, since that time the great majority of evaluation studies have been self-initiated by local or state agencies without any external mandate or requirement. In short, the withdrawal of federal evaluation mandates seems not to have reduced the average number of evaluation studies WIRE has conducted each year so much as it has significantly shifted the locus of initiating such studies to the state and local levels. Obviously, some federal evaluation requirements remain, but the focus of federal evaluation expenditures has largely shifted away from conducting evaluations and toward maintaining or expanding mechanisms for providing technical assistance aimed at helping local and state agencies carry out evaluation functions (e.g., the federally funded centers that exist to provide state and local education agencies with technical assistance in Chapter 1 evaluation).


TABLE 2
Relationship of Externally Required and Self-Initiated Educational Evaluations to the Type of Funding for the Program Evaluated

                            Types of Funding
Locus of Decision     Special Purpose   General Budget      Totals
to Evaluate             N      %          N      %         N      %
External               35     31*         0      0        35     31
Self-Initiated         32     29         45     40        77     69
Totals                 67     60         45     40       112    100

Note: * All percentages in this table are percentages of the 112 total studies.

It is also revealing to reclassify these same 112 educational evaluation studies in terms of whether the program or project evaluated was supported by (a) special-purpose, targeted, categorical funding (either federal or state), or (b) general, regular funding for educational programs. An example of special-purpose funding would be a state legislature's allocation of funds to enable individual school districts to experiment with year-round school programs, following prescribed guidelines as to how those funds could be used. In contrast, an example of general, regular funding would be the general operational budget of a public school district, provided by the state on the basis of the district's average daily attendance (or other formula). For this analysis, as for that shown in Table 1, any study required by any state or federal legislation, or prompted by any mandate from a higher level of governance, was counted as "externally required."

All the externally required evaluations conducted by WIRE dealt with special-purpose programs, while no general education programs were evaluated because of an external accountability requirement (Table 2). The state and local education agencies initiated evaluations of programs supported by general operational budgets at least as often as they initiated evaluations of special-purpose programs supported by categorical funding. This suggests that these agencies are increasingly using evaluations for their own purposes, without needing external mandates or even special funding allocations to prompt them to do so. It also points to a trend toward institutionalizing the evaluation function in local and state agencies, a trend so strong that it seems unlikely to be reversed in the foreseeable future.

TRENDS IN THE ROLES EVALUATION PLAYS AT VARIOUS GOVERNMENTAL LEVELS

Few evaluations escape the influence of political forces. As Suarez (1990) has noted, few federal or national evaluations have as their central goal the provision of impartial information that will inform and shape national policies and decisions. The political stakes are simply so high at the federal level that advocacy eventually inundates those who retreat to the high ground in search of valid information for use in rational decision making. Nowhere are the conclusions of Cronbach, Ambron, Dornbusch, Hess, Hornik, Phillips, Walker, and Weiner (1980), that evaluation is essentially a political activity, more germane than with evaluations of federal or national programs.


When one examines the trends in the use of evaluation at state and local levels, however, there is much more room for optimism. Not that politics is absent from local and state governmental processes. Such a claim would be naive, especially when high-stakes decisions are in the offing. But despite the pervasive presence of politics at every level, political processes at state and local levels often seem to permit more demonstrable, and more rational, use of evaluation data than is true at the national level.

In one sense, that may seem surprising, for state and local evaluation began largely as responses to federal mandates. As such, most state and local evaluation studies of Title I and III programs, for example, began as naked examples of compliance monitoring, with little or no real use made of the individual project data or of data accumulated across projects. Using Cronbach et al.'s (1980) distinction, such evaluations primarily served noninformational purposes, analogous to the impact that police officers who patrol the highway in plainly marked patrol cars have on drivers' observance of speed limits. Informational use of data from these early evaluations, that is, use to shape local or state decisions about their Title I or III projects, was relatively infrequent (Worthen, 1967).

Somewhere, somehow (and perhaps only the most sagacious historical analyst could determine all the causes), the situation changed, and state and local decision makers began to use evaluation to provide information they wanted, not because they were forced to, but apparently because they believed the evaluation data would be helpful. Although the federal threat of accountability no longer hung over them, it was as if the inquiry processes used in federal compliance evaluations had left a legacy of introspection and interest in data, quite unlike the data-free decision making that had typified so many prior decades of educational decision making.

Again, an analysis of the WIRE educational evaluation studies along these dimensions is instructive. All 112 educational evaluations were recategorized on two dimensions: (1) on the basis of whether the evaluand was a local, state, or federal program (note that this pertains only to the level of program administration, not funding); and (2) by whether the primary use made of the evaluation findings was (a) to provide information intended to improve the program (informational), or (b) to satisfy some formal requirement that an evaluation be conducted, with no real use ever made of the data (noninformational). The results are shown in Table 3.

Although political forces will forever create waves in education's federal waterways, these data suggest that the political sailing is smoother on education's state and local tributaries, where the waters are seldom stirred up by unwanted evaluation mandates. While politically roiled programs can occur at any level, at least large portions of the local and state programs are free enough of political undercurrents to allow policy makers to use program evaluation to gather information they need to guide future program development. Some 85 percent of WIRE's studies of national educational programs ended up producing only noninformational results (e.g., they were used only to satisfy an evaluation mandate), rather than informational results (e.g., "The evaluation has revealed the following program deficiencies..."). Conversely, the results of only 31 percent of WIRE's local and 22 percent of their state-level evaluations of education programs were used for noninformational purposes. Even though potentially useful information was produced in all these evaluations, often in response to similar evaluation questions, the information was used to modify the program under study in dramatically more state and local studies (78% and 69%, respectively) than was true for federally or nationally initiated evaluations, where only 15 percent of the study results were actually used in program improvement efforts.


TABLE 3
Relationship of Federal/National, State, and Locally Initiated Evaluations to Informational and Noninformational Uses of Evaluation

                            Uses of Evaluation Data
Level of Program      Informational   Noninformational     Totals
Administration          N      %         N      %         N      %
Federal/National        2     15*       11     85*       13    100*
State                  28     78         8     22        36    100
Local                  41     69        18     31        59    100
Totals                 71     66        37     34       108**  100

Notes: * Percentages in this table total by rows only.
** Of the 112 studies used in earlier tables, the use made of the evaluation data could not be determined for four studies.

THE INSTITUTIONALIZATION OF EDUCATIONAL EVALUATION: IMPLICATIONS FOR PRACTICE

Far from lamenting the (nonexistent) decline of educational program evaluation, evaluators should be celebrating the healthy trend toward evaluation being used for genuine informational purposes. Increasingly, local and state agencies are initiating evaluations of programs without any external compulsion to do so. While strong local political forces may well result in evaluation playing only a symbolic role in particular instances, enough local and state programs appear to be sufficiently insulated from politics to allow evaluation to play an informational role seldom possible at federal levels. If these conclusions are supported by observations and analyses made by fellow evaluation practitioners, then there are several obvious implications for practice.

1. Evaluators need not pursue the relatively few federal evaluation contracts that remain in order to have challenging and fulfilling evaluation careers. The number of evaluation consultants and contracting firms that primarily serve local and/or state agencies underscores the need for competent practitioners at this level.

2. When evaluations are initiated by an agency's need for information, rather than by external mandate, the evaluator is better able to tailor the study to answer the evaluation questions of the client and other relevant stakeholders, rather than pursuing those specified by legislatures and governing boards. Greater challenge and creativity exist in crafting an evaluation to be useful to a program's stakeholders than in stamping out the "evaluation compliance" clones so often found in evaluations carried out to meet mandates.

3. Local and state level evaluations are generally closer to the heart of educational program staff and stakeholders than is true of evaluation that educators believe has been foisted off on them by higher-level mandates. For the practicing evaluator, this means that raw technical competence must be buffered by sensitivity to the fact that truth must be told with tenderness and in a context of trust. While obviously important at all evaluation levels, the ability to build interpersonal rapport with clients is crucial in local and state evaluation studies.


If evaluators can successfully meet challenges such as these, the institutionalization of evaluation, and its practice, will be enhanced even more. Even so, there is still a long way to go if evaluation is to fulfill its potential of providing valid information to support the major educational decisions confronting local and state decision makers. Evaluation must become a still more durable and permanent function in schools and state education agencies. The result of full acceptance and ongoing use of evaluation (that is, full institutionalization of evaluation) would seem an essential precursor to the meaningful educational reforms that are so widely sought today.

NOTES

1. By this time, Title III had been renamed Title IV.
2. WIRE began in 1969 under the name of Evaluation Associates, but its name was changed in 1979 to avoid confusion with another agency discovered to have the same name.
3. "Evaluand" refers to whatever is being evaluated.
4. Programs that received no special evaluation funding but were required to do an evaluation for accountability to some external governing body (e.g., a state legislature) were counted in the first category, as a required evaluation.

REFERENCES

Cronbach, L. J. (1963). Course improvement through evaluation. Teachers College Record, 64, 672-683.
Cronbach, L. J., Ambron, S. R., Dornbusch, S. M., Hess, R. D., Hornik, R. C., Phillips, D. C., Walker, D. F., & Weiner, S. S. (1980). Toward reform of program evaluation. San Francisco: Jossey-Bass.
House, E. R. (1990). Trends in evaluation. Educational Researcher, 19(3), 24-28.
Patton, M. Q. (1988). The evaluator's responsibility for utilization. Evaluation Practice, 9(2), 5-24.
Suarez, T. (1990, October). Living with the mixed message: The effect of government-sponsored evaluation requirements on the practice of evaluation. In B. R. Worthen (Chair), Informational and non-informational uses of evaluation by governmental agencies. Symposium conducted at the annual meeting of the American Evaluation Association, Washington, DC.
Worthen, B. R. (1967). The evolution of Title I-III: A study in change. Theory Into Practice, 6(3), 104-111.
Worthen, B. R. (1994). Is evaluation a mature profession that warrants the preparation of evaluation professionals? In J. W. Altschuld & M. Engle (Eds.), The preparation of professional evaluators: Issues, perspectives, and programs. San Francisco, CA: Jossey-Bass.
Worthen, B. R., & Sanders, J. R. (1991). The changing face of educational evaluation. Theory Into Practice, 30(1), 3-12.