Current Literature
Evaluating Federal Social Programs: An Uncertain Art. By Sar A. Levitan and Gregory K. Wurzburg. Kalamazoo, Michigan: The W.E. Upjohn Institute for Employment Research, 1979. 145 pp. (with index)

This monograph sharply highlights the confusion that exists in the minds of some social scientists, government administrators and legislators, and the public at large, concerning the role, methods and limitations of the evaluation of social programs.* Evaluating Federal Social Programs, by Sar Levitan and Gregory Wurzburg, contains five chapters. The first two and the last offer a critique of social science methods per se; Chapters 3 and 4 deal with the actual practice of evaluation in government. Taken together, Chapters 1, 2 and 5 comprise an evaluation of the practical relevance of social science as an aid to social planning and decision making in the public sector. These three chapters will be briefly described first. In them, the prognosis as to the utility of such science is generally not positive.

*This review has benefited from the criticism and suggestions of several of my colleagues, both at the Department of Labor and elsewhere. However, all opinions and any errors of fact are my own.

The first chapter, "The Aspirations and Limitations of Evaluators," argues forcefully that the evaluation of social programs is flawed and distorted by a slavish devotion to methodological rigor and by narrowness of approach. Instead, the authors assert, "a holistic and judgmental approach" should guide the evaluation of social programs. A new balance should be struck in social science evaluation between subjective judgment and empirical observation on the one hand and modeling on the other, such that more emphasis is placed on the former.

Chapter 2, "The Tools of the Trade," is the conceptual core of the book. It discusses the relationship between the analysis of a program's operational structure and its delivery of services to clients (process analysis) and the analysis of the ultimate effects of the services on the clients (impact analysis). The necessity of properly identifying program objectives is also discussed, as well as the need to devise accurate indicators to measure these objectives and the need for proper control groups. The chapter ends with a general criticism of classical experimentation, drawing upon the New Jersey Income Maintenance Experiment, among others, to demonstrate the difficulties of operating and evaluating a social experiment in a real-world context. The tone of this chapter is cautionary and critical. Chapter 5, "Can Evaluation Make a Difference?", focuses on the need for program evaluation but again restates its potential dangers and limitations.
The limited effects of evaluation in influencing program design and revision are noted, as well as the perverse way in which different policy conclusions can be drawn from a given set of data by persons of divergent political focus.

Chapter 3 deals with "Evaluation in the Legislative Branch." It discusses the rationale and focus of evaluation in the legislative branch through the general institution of congressional oversight and notes the strengths and weaknesses of this institutional approach. The Congressional Budget Office, the General Accounting Office and the Congressional Research Service are then discussed in terms of their legislative origins and mandates, administrative structures, the nature of the evaluation and research they perform, and their relative strengths and weaknesses. As such, this chapter is relatively complete in its broad outline of evaluation as done by the legislative branch. The criticisms and discussion are generally well taken.

Chapter 4, "Evaluation in the Administrative Branch," is much less complete, since it effectively deals with evaluation only in the Department of Labor and the Department of Health, Education, and Welfare. Other social agencies, such as the Department of Housing and Urban Development, are not considered, even though major evaluations, such as the Housing Allowance Demand Experiment, have been performed there. Indeed, the authors concentrate mainly on two major agencies in the two departments studied: the Office of the Assistant Secretary for Policy, Evaluation, and Research (ASPER) in the Department of Labor and the Office of the Assistant Secretary for Planning and Evaluation (ASPE) in the Department of Health, Education, and Welfare. Special sections in the chapter discuss the problems of competitive bidding and the issue of the intellectual and scientific independence of evaluations and evaluators, given the political and institutional environment surrounding social program evaluation.

General Critique

This study is useful in that it gives interested students a particular view of the problems and issues surrounding the use of social science in analyzing and informing social policy. However, the book has several limitations.

The first limitation is partly a matter of style. The authors have a penchant for polemics and hyperbole. For instance, academics and technicians "search for the Holy Grail;" program planning and budgeting systems were "foisted" on federal agencies; some methodological tools border on "witchcraft;" choosing key variables for analysis is like "shooting in the dark;" and scholarly criticism of the authors' work is characterized as "attacks."
Such expression detracts from scholarly purpose.

More important is the apprehension one gets that the authors do not understand or appreciate the shortcomings to which social science methods are generally subject. They confuse appearance with substance. They castigate "meddling model manipulators" but fail to understand that models and assumptions, whether implicit or explicit, underlie all social science, including the "holistic and judgmental" approach they prefer. Throughout the book the authors reserve considerable criticism for "modeling." Models are asserted to provide only partial and often disconnected views of complex, multidimensional programs. Of course, in any given situation, such criticisms can be true. But they are not generally true of the method of modeling, that is, of abstracting and theorizing, which characterizes all analysis and is part of the basis of the scientific method. As presented, the authors' criticisms pertain mainly to the relative skills of the scientist: poor analysts do poor science, whether mathematical, judgmental or otherwise.

A model is simply a way of organizing the salient facts about some important aspect of human behavior. It is best that the model (or models) be made explicit. Modeling is an intellectual method; the fact that mathematics can be used to express the structure of a model is incidental. Done correctly, modeling the operation of a program and its mutual influences with human behavior is a very effective way to arrive at relevant, usable, and verifiable knowledge. It is this aspect of independent verification which distinguishes objective scientific knowledge. However useful and necessary at times, direct judgments about social program outcomes are particularly prone to leaving crucial underlying assumptions implicit or unspecified. Of course, mathematical models which are poorly specified can suffer from the same flaw.

Regardless of the quality and apparent relevance of social science research, one still needs to distinguish between the political limitations and the methodological limitations on such research. The following quotation suggests an imprecision on the part of the authors in making such distinctions:

    The conditions under which evaluators operate make even the semblance of a scientific methodology impractical. Evaluators are rarely in the position to design programs in such a way as to provide adequate tests for important elements. Claims about the ability to observe results, directly or through proxy measures, are frequently pretentious. The notion of achieving experimental conditions, so central to scientific inquiry, is a virtually unattainable ideal. An enormous number of variables influencing any program are beyond the control of even the most imaginative and powerful policymaker, to say nothing of a lowly evaluator. Yet, in spite of the difficulties of even crude scientific inquiry, policymakers demand assessments of social programs, and evaluators respond, cranking them out with a vengeance. Hoping to bring to their professions a degree of credibility that the substance of their work sometimes fails to provide, evaluators have collected an impressive array of tools for their trade. Although the collection may seem to border on witchcraft, it is intended to impart a sense of rigorous and systematic inquiry. (pp. 9-10)
Furthermore, the authors do not clearly explain the nature of important tools of social science methods. There is one major and one minor example of this that bear discussion. The major problem surrounds their discussion of experimental design, partially suggested by the quotation above. The authors first acknowledge that experiments, presumably with random assignment of program clients to a treatment and a control group, allow one to identify cause and effect between program treatments and program outcomes. In fact, in the absence of strong prior evidence this is the only way to identify cause and effect. In subsequent discussion, however, they allude to the "simplistic notions about cause and effect that underlie experimentation strategies [which] fail to capture either the full effect of program intervention or to satisfactorily document the influence of nonexperimental variables" (p. 26). The authors do not clarify why experimentation fails to capture the full effect of a program. They may be referring to measures of outcome, any one of which is only a partial index of program outcomes, but this issue is divorced from the method of classical experimental design and from the issue of identifying cause and effect. The latter half of their statement can be construed as correct, though, if random assignment is made on one variable or set of variables and subsequent analysis is performed on another. For instance, a sample can be randomly assigned to a program treatment on the basis of ethnic origin. The effect of ethnic origin on program outcomes is then random, and the expected value of its net effect on program outcomes is unbiased. However, the resulting assignment of the same sample by sex may be non-random, so that sex interacts with program outcomes and biases those estimates. Even then, analysis of variance, regression analysis or related techniques can eliminate much of this confounding due to non-randomness. So the problem is not severe and is amenable to standard statistical technique.
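A minimal sketch, using entirely hypothetical data and present-day statistical software (neither drawn from the book under review), illustrates the point: treatment is randomized, the realized analysis sample happens to be imbalanced by sex, and a regression that includes sex as a covariate largely removes the resulting bias in the estimated treatment effect.

    # Hypothetical illustration: randomized treatment, incidental imbalance on a
    # second variable (sex), and regression adjustment to remove the confounding.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 2000

    sex = rng.integers(0, 2, n)        # 0/1 covariate
    treat = rng.integers(0, 2, n)      # randomized program assignment
    # Mimic non-random attrition: some treated members of one sex drop out,
    # leaving the analysis sample imbalanced by sex across treatment groups.
    keep = ~((treat == 1) & (sex == 1) & (rng.random(n) < 0.4))
    sex, treat = sex[keep], treat[keep]

    true_effect = 2.0
    outcome = 1.0 + true_effect * treat + 3.0 * sex + rng.normal(0.0, 1.0, treat.size)

    # A naive difference in treatment/control means is biased by the imbalance...
    naive = outcome[treat == 1].mean() - outcome[treat == 0].mean()

    # ...but regression with sex as an additional regressor recovers an estimate
    # close to the true effect.
    X = sm.add_constant(np.column_stack([treat, sex]))
    adjusted = sm.OLS(outcome, X).fit().params[1]

    print(f"naive estimate:    {naive:.2f}")
    print(f"adjusted estimate: {adjusted:.2f}  (true effect = {true_effect})")

The point of the sketch is simply that imbalance on an incidental variable is a routine, correctable nuisance rather than a fatal flaw of experimental design.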
A less serious problem concerns the dichotomy the authors assert between process and impact evaluation. They argue that a process evaluation assesses whether a program is a workable tool for change, while an impact analysis assesses the effects of a program in achieving the desired change. The distinction between the two definitions seems clear until one considers the execution of each type of analysis. For instance, the authors pose as an example of a process question the following: "Will unemployed veterans enroll in training courses for auto mechanics?" This is, of course, a process question in the sense that the program administrator and Congress wish to know who demands a given program's services. But it is also an impact question, in that the composition of clients and their number relative to the intended composition and number is a program outcome. If Congress "throws a party and no one attends," or the wrong people crash it, that is an outcome, an impact. In short, it is best to view program analysis as a continuum in which process and impact are logically linked. Artificial separation of the two program aspects is sometimes a convenient approach to evaluating a program; however, it risks a misunderstanding or misspecification of how a program is designed, operates and affects outcomes. Of course, a careful modeling of a program should immediately reveal the process-impact link. Much of what is wrong with analysis of social programs can be traced to the artificial separation of process from impact. In this reviewer's judgment, millions of dollars of research funds have been wasted thereby or, what is worse, have been used to provide misleading information.

Finally, there are a number of instances of obscure or incorrect reasoning. Two examples are illustrative, though others exist:

    To determine if education finance programs had equalized per pupil expenditures, educators might look for increases in academic achievement among students in poorer schools. (p. 13)

    Even if indicators [of program performance] are valid, accurate, and reliable proxies for the variables that need to be examined, correlations do not necessarily imply causation because indicators also must have a finite number of components if they are to remain consistent and useful over time. (p. 21)
As for the two descriptive chapters, they are generally informative to the layman who is first introduced to the issues. Their tone is generally factual and, while one can disagree with their judgments (for instance, their statement that GAO "has a persistent tendency to confine its [evaluation] recommendations to minor issues involving incremental changes"), the intellectual confusion that the methodology chapters exhibit is relatively absent. The chapter on evaluation in the executive branch, however, exhibits some political naiveté and historical misinterpretation.
For instance, the authors' understanding of the relationship between ASPER and the operating agencies, such as the Occupational Safety and Health Administration (OSHA), is faulty (p. 96 ff.). The authors discuss at some length the interaction between ASPER and OSHA, especially as OSHA attempted to execute research which would measure the economic impact of desired health and safety standards. ASPER is pictured as obstructionist and as opposing the spirit of the law by withholding approval of inflationary impact statements (IIS's) and otherwise specifying incorrect standards of program impact. Among other things, the authors ignore the fact that the IIS's and ASPER's role in the approval process were the direct result, and had the full knowledge, of the Ford Administration at the time; they were not generic to the role of ASPER. We should note, though, that cost assessments of social regulation have received support from virtually every part of the political spectrum. Although the Secretary of Labor could have stopped the process of "obstruction" by rescinding the secretarial order which created the IIS review process, it is incorrect to suggest that ASPER, or the Secretary of Labor, past or present, had great autonomy of action in this matter. But, given the responsibility to review and pass on the technical quality of the IIS's, ASPER executed that responsibility as directed.

This institutional practice of overview of one agency by another leads us to one final consideration: What is the correct structure for evaluation, and what is its proper locus? The authors are correct that the ASPER evaluation of the IIS's impeded the OSHA program in the short run, yet would it be socially useful to have OSHA, in effect, evaluate itself? That procedure has generally been rejected in our society. On the other hand, is it socially useful for an agency such as the GAO, which generally uses a case-study monitoring method of evaluation, sometimes of "worst" cases, to perform evaluations from which, by definition, insufficient evidence is presented to generalize to the program as a whole? This approach, too, is misleading and risks serious errors of commission. There are no immediate solutions to these types of conflicts. The authors do discuss these matters, but the topic is a difficult one, and additional attention could have been devoted to it in their study.

In conclusion, this monograph is perplexing to one who wishes to draw an overall judgment of its utility. It provides some valuable insights concerning the problems which surround the institutional and political context of social program evaluation. But it is exaggerated in places and exhibits a misperceived role for social science methods.