Interviews and Panel Discussions
The Fine Line between Evaluation and Explanation

MICHAEL SCRIVEN
Evaluation & Development Group, Box 69, Point Reyes, CA 94956
In program evaluation, we are concerned to establish the merit, worth, quality, or value of programs, in whole or in part, at the request of some client or clients, and for the benefit of some audience. To do this we do not need to know how the programs work or why they fail to work, or even what their components are. Black box evaluation is not a contradiction in terms. The situation is no different in principle from that in product evaluation, where we have to use professional engineers with elaborate testing machinery to check whether an elevated highway is deflecting excessively after it has been in place for many years. We can decide on that evaluative conclusion without knowing whether this is an after-effect of an earthquake or due to gradual deterioration from fatigue, and without having any idea how, or if, the problem can be fixed.

Now of course it's an advantage if we can do more. We hope to be able to diagnose the cause of the trouble (or the secret of success) and even perhaps suggest a cure when it's required. But that's the dream of matching the physician, the Good Doctor dream, and one should remember that it is simply an ideal, not a necessity. The necessity is to get the evaluation right, and jeopardizing that by diverting effort toward explanation, diagnosis, and remediation is all too common.

Sometimes the request for evaluation is for what might be called an analytic rather than a global evaluation. There are two kinds of analytic evaluation, one involving study of the merit of the components of the program, the other requiring study of certain dimensions or aspects of the program's performance. Notice that even a component evaluation does not require any understanding of the relation between the components. It is essentially a multiple evaluation. One can evaluate a college in terms of the quality of its departments and not just in terms of global parameters, but not understand anything about how the departments relate to each other, or the management of the system. It is even more obvious that dimensional evaluation does not require understanding of the workings of the program. You can evaluate a program in terms of its effectiveness without looking at cost; its cost without looking at its effectiveness; and so on with other dimensions of merit such as exportability, ease of staffing, etc.
The professional imperative of the evaluator is to evaluate; anything else is icing on the cake. If that principle is kept in mind, then we can make some concessions. For example, there are many situations where a recommendation "falls out" of the investigation, as when a weakness in the admissions system for a clinical program becomes obvious as soon as we get into consumer interviewing. Of course, the evaluator must be on the lookout for this serendipitous fall-out.

Should one alter the evaluation design to increase the likelihood of it happening? Not unless the budget and timeline afford room for this expansion as well as a cushion for unforeseen contingencies. And not unless the client is made well aware of what risks are involved.

Of course, the client is usually as much at fault in bundling a request for explanations and recommendations into the contract package as evaluators who have poorly conceptualized their obligations and abilities. Clients think they get these as part of the package, and are amazed to hear that is not so. Where one has or can add the 'local' expertise, e.g., in management consulting, to undertake to provide explanations and recommendations, then of course one should price it out as a further service and package it with the evaluation. But that situation is rare. It's more common that the evaluator is tempted to take on the recommendation and explanation tasks, and then tempted to make what are in fact amateurish suggestions along with the evaluative conclusions. One should keep in mind that the science of management does not exist, that much of what drives or crashes a program is organizational psychology, and that the degree of agreement about this is nearly negligible. Which of these program theories will you hitch your star to? Why? Is there something so simple about program evaluation that you need something else to do?

One trap that gets people into the explanation game concerns causation. It's clear that a substantial part of program evaluation concerns the identification of outcomes. Outcomes are phenomena that are caused by the program. In a loose sense, the program explains their occurrence. So, one often hears it said, evaluation is impossible without explanation, without (in this sense) a theory about how the program operates. But this is a theory about how the program operates externally, about the kind of effects it produces. Knowing something about that is helpful in several respects, e.g., in designing the outcome investigation, and in estimating the likelihood of successful replications. The internal operation of the program is the chapter of the book you don't want to say you know. And without knowing that, there are severe, often total, limits on what you can say to explain which components are doing the good work and which are causing trouble.

Nevertheless, if all one is claiming about the need for program theory is that one needs to know something about the kind of effects a program can have, that's a legitimate claim. One can go a little further, to what might be called gray box evaluation. In black box evaluation, one knows nothing about the inner workings of the program. In clear box evaluation, the inner workings are fully revealed. In gray box evaluation, one can simply discern the components, although not their principles of operation. There are many cases where it is a reasonable expectation on the part of the client that the evaluator does know or can reliably determine what the components are.
However, it does not follow from knowing the components, and even from knowing how to evaluate each of them for quality and functionality, that one can equate the evaluation of the program with the evaluation of all its components. For there is the matter of the way in which they are connected, and evaluating that is a hard task. There one needs program theory of a kind which few evaluators have, and, more seriously, there are many cases where no one has it. Thus the abilities to do black box and gray box evaluation are distinct; neither does what the other does, and either the first or both are necessary, whereas clear box evaluation is rarely possible, although frequently assumed to be an essential part of evaluation.

It is worth mentioning one other trap. This trap involves being seduced by the suggestion that one can't be helpful unless one can make recommendations, and of course making those requires some kind of theory about what would happen if they were followed. This point is sometimes put by saying that obviously formative evaluation has to be based on program theory, or else how could it diagnose the problems and generate solutions? Again we can see past this trap if we keep the logical nature of evaluation firmly in mind. It is extremely helpful for someone to know how well they are doing, without any trappings or recommendations; this is why runners use stopwatches, shooters use spotting scopes, and professionals use the results of performance reviews. Informing the client of that, in the complex cases where professional help is needed, and perhaps with some relevant comparisons to provide more meaning to the bare bones, is what the evaluator is absolutely responsible for doing.

There may be a local expert who can suggest what the evaluee should do if the news from the evaluator is not the right kind of news; it may even happen that the evaluator is qualified to add that remediation advice to the evaluation itself. But let us be quite clear: the evaluator can be extremely helpful without providing explanations of the shortfall against the ideal, or recommendations as to how it should be remedied. Doing this is often very difficult; undertaking more is far riskier. This is the message to keep in mind: formative evaluation is not only possible without explanations and recommendations, it is essentially distinct from providing them, and it can have great utility without them.