Evaluation and Program Planning, Vol. 10, pp. 27-33, 1987. Printed in the USA. All rights reserved.
0149-7189/87 $3.00 + .00. Copyright © 1987 Pergamon Journals Ltd.
CONCEPTUAL AND METHODOLOGICAL ISSUES IN EVALUATING EMERGENT PROGRAMS
PATRICIA KLOBUS EDWARDS Virginia Polytechnic Institute and State University
ABSTRACT

This paper identifies and discusses conceptual and methodological issues related to five attributes of emergent programs that frequently apply across the spectrum of human services: undefined client populations, inadequate causal evidence relating inputs to outcomes, shifting objectives, identification of criteria for standardizing treatments, and temporal constraints on program development. A case study of the American Red Cross nutrition program is used to illuminate the evaluation problems associated with these issues and the strategies employed to mitigate them. Strategies found to be useful in the evaluation of emergent programs include the examination of the relationship between observed client characteristics and need, external reviews of objectives, identification of client preconditions that support or inhibit performance, examination of normal treatment variation, and a multi-stage process.
There is a dearth of literature focusing specifically on evaluations of emergent programs, those still in the development stage. Yet the initial period of planning harbors an ideal environment for substantial utilization of evaluation findings. Cronbach et al. (1980, pp. 235-248) maintain that the intellectual milieu of a new program is quite different from one operating on a sustained or permanent basis. This is particularly significant in terms of administrators' willingness to make changes in program objectives or strategies (Attkisson, Brown, & Hargreaves, 1978, pp. 88-91). The differences between emerging and established programs also pertain to their environmental context and informational needs (Cronbach et al., 1980; Rossi & Freeman, 1982). Evaluations implemented concurrently with development require flexible models that are adaptive to the dynamic environment of an emergent program. Furthermore, evaluations instituted in the planning stages should be designed to explore the range of variables relevant to the effective operation of the program.

It is important, therefore, to identify common characteristics of programs in the developmental stage and anticipate conceptual and methodological problems that may arise during evaluations of such programs. This paper discusses five attributes that are typically found in emergent programs: (a) undefined client populations; (b) inadequate empirical evidence confirming the relationship of intended program inputs to expected outcomes; (c) ambiguous or shifting program objectives; (d) difficulties in identifying criteria for standardizing treatments; and (e) organizational constraints limiting the period consigned to program development. A case study of the development and formative evaluation of a nutrition program, sponsored by the American Red Cross, is used to illustrate techniques designed to mitigate problems inherent in evaluating programs that possess these five attributes.
The evaluation project discussed in this article was funded by the American Red Cross and the Human Nutrition Information Service, U.S. Department of Agriculture. I would like to thank members of the program development team (Barbara Clarke, Kristen DeMicco, Mary Ann Hankin, Luise Light, Patricia Marsland, Rebecca Mullis, Anne Shaw, and Fred Troutman) for their many contributions to the evaluation design. Peter Miller, Peter Rossi, and Harriet Talmage, consultants to the project, provided valuable suggestions regarding the conceptualization and implementation of the study. My appreciation is also extended to Fran Cronin, Anne Shaw, and Francis Ventre, who provided insightful comments on an earlier draft of the paper. Requests for reprints should be sent to Patricia K. Edwards, Urban Studies, Virginia Tech, Blacksburg, VA 24061.
THE ARC-USDA NUTRITION PROGRAM EVALUATION

As a consequence of the federal government's encouragement of public/private cooperation in the delivery of human services and market research conducted by the American Red Cross (ARC), in 1981 the United States Department of Agriculture (USDA) and ARC entered into an agreement to develop a nutrition program which would be offered in the more than 3,000 Red Cross chapters nationwide. The premise of the program, subsequently called "Better Eating for Better Health," is to promote knowledge, beliefs, and behaviors which will enhance the public's nutrition status. The program introduces a new food guidance system unlike the system traditionally used in nutrition education. Nutrition experts from ARC, USDA, and a state university cooperated in designing the program. This program development team, cognizant of the potential benefits of evaluation, recommended that an evaluation be initiated concurrently with the program's development and that the program be tested in several stages before it was offered nationally. Formative evaluations had not been employed frequently in the field of nutrition education, and the program development team was eager to learn more about how best to provide nutrition information to the public, as well as to examine the basic assumptions underlying nutrition education. Consequently, a team of evaluators was engaged to work cooperatively with the program development team during the design phases.

The conceptual model for the evaluation, formulated through the joint efforts of the program development team and evaluators, focused on an array of delivery and performance variables: (a) the sociodemographic characteristics and information needs of participants; (b) the delivered treatment variables (i.e., course content, instruction methods, visual materials, course activities); (c) the implementation system (i.e., chapter attributes, qualifications and experience of instructors, course revenues and expenditures, and promotional strategies); and (d) the relationship between the first three sets of variables and changes in participants' nutrition knowledge, beliefs, and behaviors.

Table 1 shows the five stages of data collection and analysis undertaken during the nutrition program evaluation. The first three stages relied heavily on qualitative information derived from expert analyses of program content and design, as well as written commentary obtained through semi-structured questionnaires completed by instructors, participants, and observers. Data for the two final stages of the evaluation were collected by means of self-administered surveys of participants and controls prior to and after the intervention, instructors after the end of their course, and ARC chapter administrators upon completion of courses at their site. A telephone survey of a sample of participants was also conducted two months after Pilot Test III, and again at the same interval after the Field Test, to study the intervention's sustained effects. Subsequent to each stage, data were analyzed and the program was modified accordingly.
TABLE 1
PROCESS MODEL OF THE ARC-USDA NUTRITION PROGRAM EVALUATION

Stage 1: Technical Review and Curriculum Design Analysis. Expert analysis of the appropriateness of program objectives for the projected client population; relationship of content to program objectives; technical accuracy of content; sequencing of materials; teaching unit size; accuracy and currency of vocabulary. While a formal analysis was conducted during Stage 1, the technical review was an ongoing process undertaken by USDA throughout the period of program development.

Stage 2: Pilot Test I (Instructor Assessment and Learner Verification). Critical examination of program materials and design by potential instructors and representative course participants.

Stage 3: Pilot Test II. Pre-test of program materials and teaching strategies observed by the program development and evaluation teams in six sites.

Stage 4: Pilot Test III. Pre-test of the overall program and survey instruments in ten sites, without supervision from the development and evaluation teams.

Stage 5: Field Test. National test of the program (51 sites), designed to provide comprehensive data for analysis of the relationship between participant preconditions, delivered treatment variables, the implementation system, and program outcomes.
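The process model in Table 1 is, in essence, an iterative collect-analyze-revise cycle: each stage produces findings that are folded back into the program before the next stage begins. The following minimal Python sketch is illustrative only; it is not part of the original study, and the stage labels, function names, and program structure are paraphrases and placeholders, not the study's instruments.

```python
# Hypothetical sketch of the five-stage evaluation cycle summarized in Table 1.
# Each stage yields findings that feed program revisions before the next stage.

STAGES = [
    "Technical review and curriculum design analysis",
    "Pilot Test I: instructor assessment and learner verification",
    "Pilot Test II: pre-test of materials and teaching strategies (6 sites)",
    "Pilot Test III: unsupervised pre-test of program and instruments (10 sites)",
    "Field Test: national test of the full program (51 sites)",
]

def collect_data(stage: str) -> dict:
    """Placeholder for stage-specific data collection (expert review, surveys, observation)."""
    return {"stage": stage, "findings": []}

def analyze(data: dict) -> list:
    """Placeholder for the analysis carried out after each stage."""
    return [f"recommended revision based on {data['stage']}"]

def revise_program(program: dict, revisions: list) -> dict:
    """Fold the recommended revisions into the program before the next stage."""
    program["revisions"].extend(revisions)
    return program

program = {"name": "Better Eating for Better Health", "revisions": []}
for stage in STAGES:
    findings = collect_data(stage)
    program = revise_program(program, analyze(findings))

print(len(program["revisions"]), "rounds of revision applied")
```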
PROBLEMS OF CONCURRENT PROGRAM DEVELOPMENT AND EVALUATION

Undefined Client Populations

Knowledge of client characteristics that are relevant to program delivery is a fundamental requisite for planning (Landsberg, 1983). However, many human service programs, such as those provided by recreation, health, and housing agencies, are aimed at self-selected client groups. Even when eligibility guidelines are established by legislation and the target population has been identified, administrators of new programs frequently do not know who will actually make use of available services. In addition, it cannot be assumed that the characteristics of local clients will be representative of the overall target population. A study of small farm programs in 14 southern states, for example, indicated that the average income of farmers differed substantially from state to state, although all participants fell within the national guidelines. Not surprisingly, the effectiveness of various program strategies varied among the states studied (Edwards, Orden, & Buccola, 1980). Furthermore, the criteria for determining eligibility may not be appropriate and/or sufficient for identifying client attributes that should be considered in defining objectives and formulating effective service delivery strategies. Thus, while emergent programs may be aimed at specific target populations, the characteristics of clients (facts that bear on how a program can most effectively operate) are often unknown.

This unknown element of the potential client population presents a problem for evaluators. Evaluation designs frequently incorporate analyses of the relationships between client attributes and outcomes for the purpose of specifying program effects. Moreover, evaluation data are often elicited from the clients themselves. The format and style of survey instruments, for example, may be based on an erroneous understanding of the abilities of respondents. Consequently, lack of pertinent information on the client population is not only a serious problem in planning a program, but it also has an effect on the conceptualization and design of evaluations that are implemented concurrently with program development.

Educators have long held that curricula and teaching strategies ought to be geared to the abilities, needs, and expectations of prospective clients. Recognizing this, the nutrition program development team based the course on two assumptions about potential participants. First, the team assumed that participants would attend to improve their knowledge of nutrition. Second, the team predicted that the course would draw from the same pool of participants that took other courses offered by the organization: middle-class, female high school graduates, between the ages of 25 and 55, who did not work full-time. The evaluators discovered that these assumptions were based on intuition, as opposed to data collected systematically from participants in other courses.

The first assumption, concerning the reasons for participation, if erroneous, could have an effect on attaining the goals of the program. Observation in Pilot Test II and data obtained from unstructured questions in Pilot Test III suggested that some participants, particularly older individuals, attended simply for the social activity the program afforded. Others indicated that they participated to receive certification that would enhance their ability to obtain jobs in health-related fields. Because differences in the motivations and needs of clients may have an effect on anticipated outcomes, these factors should be included in a conceptual model.
The evaluators developed a profile of participants, drawn from data obtained in Pilot Test II, to test the program development team's second assumption. These data indicated that although the attendees were predominantly female, the distribution according to educational attainment and family income was at a lower level than anticipated. This posed several problems in terms of data collection and analysis, since survey instruments were based on a level of difficulty appropriate for high school graduates. As a consequence, the questionnaires and survey procedures were modified for a more generalized audience. The five-stage evaluation process used in the Red Cross study provided an opportunity for both the program development and evaluation teams to address the programmatic and evaluative consequences of unanticipated client characteristics.
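A comparison of this kind, observed participant characteristics set against the planning assumptions, can be expressed very simply. The sketch below is a hedged illustration only: the counts, category labels, and assumed shares are invented for the example and are not the ARC pilot data; the chi-square statistic is used here as a generic goodness-of-fit check rather than as the analysis reported in the study.

```python
# Hypothetical comparison of assumed vs. observed participant characteristics.
# All counts and proportions are illustrative, not data from the ARC pilot tests.

observed = {
    "less than high school": 34,
    "high school graduate": 51,
    "some college or more": 15,
}
assumed_shares = {
    "less than high school": 0.10,
    "high school graduate": 0.60,
    "some college or more": 0.30,
}

n = sum(observed.values())
chi_square = 0.0
for category, obs_count in observed.items():
    expected = assumed_shares[category] * n      # expected count under the planning assumption
    chi_square += (obs_count - expected) ** 2 / expected
    print(f"{category:>24}: observed {obs_count / n:5.1%}, assumed {assumed_shares[category]:5.1%}")

# Compare the statistic against a chi-square critical value with (k - 1) degrees of
# freedom to judge whether the observed profile departs from the assumed one.
print(f"chi-square statistic = {chi_square:.2f} (df = {len(observed) - 1})")
```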
Inadequate Causal Evidence Relating Program Inputs to Outcomes

Emergent human service programs are generally based on theoretical notions, rather than empirical evidence concerning the effects of an intervention (Neighbor & Metlay, 1983). While this may be due, in part, to the absence of validated theories linking projected strategies and outcomes, Cronbach et al. (1980) aptly argue the limitations of experimental research techniques in providing reliable causal statements that can account for the range of possible outcomes resulting from a social program. In the field of nutrition, the evidence is not conclusive that individuals who improve their nutrition knowledge and develop appropriate dietary beliefs will indeed modify their eating habits accordingly (cf. St. Pierre, Cook, & Straw, 1981; Rosander & Sims, 1981). An array of constraints impinges upon nutrition behavior change: family influences, economics, and the ability to control food decisions are examples.

Cognizant of these problems, the program development team was reluctant to tie the success or failure of the program to behavioral objectives which could only be measured on a short-term basis in our study. Although the evaluators agreed that it would be unwise to focus on behavioral outcomes, they supported the inclusion of behavior measures in the conceptual design of the study for two reasons: baseline behavioral measures could serve as needs assessment data that would provide useful information for refinement of objectives and strategies; and, while the performance of the program should not be tied to behavioral objectives, this study provided an exceptional opportunity to articulate and test theories linking nutrition knowledge, beliefs, and behaviors of a heterogeneous population. Consequently, the program development and evaluation teams jointly decided to identify a broad range of performance variables that could potentially impinge upon long-range behavioral changes or indicate participant interest and satisfaction with various aspects of program delivery.
An array of outcomes (i.e., stimulation of the use of nutrition information resources, level of attendance, participants' satisfaction with the relevance of the program, and the amount of course materials participants read) were considered as performance measures, along with changes in nutrition knowledge, beliefs, and behaviors. The team members further acknowledged that other potentially positive effects, beyond the scope of the study, might be attributed to the program (e.g., enhancement of job opportunities). Additionally, an effort was made to include in the conceptual model measures of conditions which could support or inhibit the learner's propensity to achieve desired outcomes. Finally, the evaluators stressed that the examination of outcomes should constitute a minor role in the total evaluation effort. Emphasis was placed on obtaining data that would provide information to improve the technical accuracy, organization of materials, clarity, relevance, and enjoyment of the nutrition program.

An experimental design to test causal linkages was not feasible methodologically. Randomly assigned experimental and control groups are virtually impossible to construct for emergent programs based on self-selected participants, as was the case with the ARC nutrition course. Moreover, ARC administrators opposed a delayed control group design. The team's recourse was to use non-equivalent controls obtained from participants simultaneously attending other ARC courses during the test periods. In this way, specific segments of the anticipated client population that were relevant to the program development team's concerns (for example, senior citizens, singles, and low-income families) could be compared in terms of knowledge, belief, and behavioral change. This quasi-experimental research design allowed the team to further its understanding of the relationships between nutrition knowledge, beliefs, and behaviors (see Edwards, Acock, & Johnston, 1985).
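The arithmetic of such a non-equivalent control group comparison is straightforward. The sketch below assumes invented pre/post knowledge scores for course participants and for controls drawn from other ARC courses; the numbers, group labels, and the simple difference-in-differences summary are hypothetical illustrations, not results or procedures from the study itself.

```python
from statistics import mean

# Hypothetical (pre, post) knowledge scores for nutrition-course participants and for
# non-equivalent controls enrolled in other ARC courses during the same test period.
participants = [(12, 18), (10, 15), (14, 19), (9, 14), (11, 16)]
controls     = [(11, 12), (13, 13), (10, 11), (12, 13), (9, 10)]

def mean_gain(pairs):
    """Average post-minus-pre change score for one group."""
    return mean(post - pre for pre, post in pairs)

gain_participants = mean_gain(participants)
gain_controls = mean_gain(controls)

# Difference in mean change scores between the treated group and the
# non-equivalent comparison group (a simple difference-in-differences estimate).
print(f"participant gain: {gain_participants:.2f}")
print(f"control gain:     {gain_controls:.2f}")
print(f"adjusted estimate of program effect: {gain_participants - gain_controls:.2f}")
```

The same comparison can be repeated within subgroups of interest (for example, older participants or low-income families) to see whether apparent effects are concentrated in particular segments of the client population.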
Ambiguous or Shifting Program Objectives

A third characteristic of emergent human service programs, closely related to the first two attributes, concerns the uncertainty of objectives. Many programs are based on preconceptions about specific client needs which have not been empirically validated. Often the initial stages of development constitute a test of program planners' ideas. As a result, objectives and priorities may be in a state of continual revision. Although this can be a positive characteristic of emerging programs, it nonetheless poses a challenge for evaluators.

The objectives of nutrition education are particularly dynamic for several reasons. New research findings often advance and even contradict the current state of knowledge. In addition, nutritionists are not entirely knowledgeable about what potential clients already know. Misinformation on diet regimens and the efficacy of megadose vitamin supplements in the popular media further confounds the problem. Moreover, nutritionists are divided on approaches to teaching nutrition: some emphasize practical food selection, others focus on nutrients. In addition, the level of technical knowledge that can be absorbed by learners is problematic. Although members of the nutrition program development team had essentially similar perspectives on the thrust of the course, each of these issues arose during the course of our study. As a consequence, program objectives, and even strategies, were in a state of constant change.

Conceptually, the evaluators' concerns centered on the possible reactive effects of the evaluation on the changes made by the program development team. While formative evaluation can foster the clarification and specificity of objectives, and may function to mediate differences among staff, the team was aware that the cooperative milieu in which they were working could bias the formulation of objectives. Were the team's attempts to measure these phenomena influencing program objectives? Certainly there was evidence that the reactive effects between planning and evaluation which numerous evaluators (e.g., Cronbach et al., 1980; Glenwick, Stephens, & Maher, 1984) warn about were occurring, since the program development team maintained that their active involvement in designing the evaluation helped them to more clearly articulate their objectives. Yet, the evaluation team did not want to forego the cooperative process intended to promote the program development team's effective utilization of evaluation results (see Dawson & D'Amico, 1985). One solution was to incorporate into the process external reviewers who were not involved in the evaluation design. Consequently, in Stage 1, experts in nutrition education and curriculum design provided critical feedback on the initial objectives. The next three stages called for experienced Red Cross instructors and representative course participants to react to the appropriateness and clarity of the objectives using semi-structured questionnaires. Empirical data obtained from structured questions measuring the baseline level of participants' nutrition knowledge, beliefs, and behaviors were also used to test assumptions regarding participant needs.

Shifting program objectives produced a number of methodological problems. Stages 3 and 4 of the process were viewed by the evaluators as an opportunity to test the reliability and validity of various scales constructed to measure performance in the Field Test. The team was often unable to use the same set of questions successively as a consequence of the revised objectives. Yet, it was important to stress the team's willingness to maintain flexibility in the face of program modifications, even at the risk of sacrificing methodological rigor.
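One routine element of such reliability testing is an internal-consistency estimate for each scale. The sketch below is illustrative only: the item responses are invented, and Cronbach's alpha is named here as a standard internal-consistency technique rather than as the specific statistic reported in the study.

```python
from statistics import pvariance

# Hypothetical item-level responses (rows = respondents, columns = knowledge-scale items).
# These are invented scores, not data from the ARC pilot tests.
responses = [
    [1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 0, 1, 1, 1],
    [1, 1, 1, 0, 1],
]

k = len(responses[0])                      # number of items in the scale
item_variances = [pvariance([row[i] for row in responses]) for i in range(k)]
total_scores = [sum(row) for row in responses]

# Cronbach's alpha: (k / (k - 1)) * (1 - sum of item variances / variance of total score)
alpha = (k / (k - 1)) * (1 - sum(item_variances) / pvariance(total_scores))
print(f"Cronbach's alpha = {alpha:.2f}")
```

When items are dropped or reworded between pilot stages, as happened here, the statistic has to be recomputed for each revised scale, which is part of the methodological cost of shifting objectives.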
Difficulties in Standardizing Treatments

In the development stages of a human service program, there is generally very little attention paid to the potential effects of treatment variation. To some extent, this is due to limited knowledge of those effects or to a perceived inability to control treatments. Evaluation undertaken concurrently with program development should address these issues, because it is difficult to change local patterns of operation after a program has been implemented over a period of time.

Health education courses offered by local ARC chapters are expected to comply with certain guidelines established by the National Headquarters. Two sets of training modules are offered for all instructors: a core training module, providing general information about ARC operations and approved instruction techniques, and a more specific training module addressing the content and teaching strategies of the particular course an instructor will teach. Because ARC course participants receive certification for various levels of achievement, the National Headquarters stresses maintenance of standards in these instructor training sessions. The mechanisms for monitoring local chapters or controlling quality, though, are limited.

Both the evaluation and program development teams felt the issue of standardized treatments should be incorporated into the conceptual design of the study. The evaluation team decided to test the program in as naturalistic a setting as possible in the last Pilot Test and the Field Test. By identifying and measuring program variation variables that might theoretically influence performance, we could examine their effects and identify appropriate standards. Empirical evidence regarding specific program implementation standards would then be conveyed to local chapter administrators and instructors. It was hoped that such information might persuade local personnel to comply with these national guidelines. The variables jointly identified by the teams included: the degree to which instructors followed each segment of the course design and their use (or non-use) of the instructional aids provided, the training (nutritionist/dietitian versus other health-related or home economics fields) and experience of instructors, team teaching versus individual instruction, the number of instructional hours, and the number of weeks a course was offered. Considerable variation was found. For example, although the course was intended to be taught in six two-hour segments, some instructors taught their course in one or two days. Others deleted segments integral to the objectives of the program and often added their own material. The effects of these variations are reported elsewhere (Edwards, 1984).
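To illustrate how observed delivery variation might be related to outcomes, the following sketch groups participants' knowledge gains by whether their instructor delivered the course in the intended six-session format or in a compressed format. The records, variable names, and gains are hypothetical; this is not the analysis reported in Edwards (1984), only a minimal example of the general approach.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical participant records: delivery format actually received and knowledge gain.
# The formats, gains, and grouping are illustrative only.
records = [
    {"format": "six weekly sessions",   "knowledge_gain": 6},
    {"format": "six weekly sessions",   "knowledge_gain": 5},
    {"format": "compressed (1-2 days)", "knowledge_gain": 3},
    {"format": "six weekly sessions",   "knowledge_gain": 7},
    {"format": "compressed (1-2 days)", "knowledge_gain": 2},
    {"format": "compressed (1-2 days)", "knowledge_gain": 4},
]

# Group gains by the delivered-treatment variable and compare group means.
gains_by_format = defaultdict(list)
for record in records:
    gains_by_format[record["format"]].append(record["knowledge_gain"])

for course_format, gains in gains_by_format.items():
    print(f"{course_format:>22}: mean gain {mean(gains):.2f} (n = {len(gains)})")
```

The same grouping logic extends to the other delivered-treatment variables named above (instructor training, team versus individual teaching, instructional hours), and the resulting contrasts are the empirical basis for recommending implementation standards.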
Differences in the length of time a course was conducted led to another methodological problem: the evaluation team could not examine behavior changes among participants who completed their course in one or two days. After consultation with the program development team, guidelines for program implementation that specified the number of weeks over which a course would be offered and the number of instructional hours were instituted in the Field Test. These parameters were consistent with the standards for other ARC courses and would constitute appropriate procedures for the nutrition program when it was in full operation. While many human service programs cannot and, perhaps, should not completely control program conformity on a local level, analysis of deviation during the developmental stages of a program can prove to be efficacious for establishing a monitoring system.

Temporal Constraints on Program Development

Program administrators are well aware that the front-end time designated for development is costly, and few human service agencies can afford to allocate sufficient staff and resources, particularly if an extended period of time is required. In the case of the nutrition program, the sponsoring agencies, ARC and USDA, provided what initially seemed to be an appropriate amount of time for design and testing: approximately three years. But even this time frame posed problems for the development and evaluation teams. First, many actors were involved. They were geographically separated and operated with varying organizational constraints on their time. Second, both the program development and evaluation teams were overly ambitious in what they wished to accomplish.

Delays that constrained the evaluation process were encountered throughout each stage (e.g., the inability of local ARC chapters to initiate courses on schedule, or slippage in the printing of materials). These delays meant that feedback from the evaluation did not always provide sufficient time for the program development team to make appropriate modifications before the next phase of the evaluation was due to commence. ARC hired a project coordinator who tactfully and successfully prompted both teams to meet deadlines. Moreover, both teams learned to anticipate delays. But because they underestimated the time needed between stages, it was sometimes impossible to consider the breadth of program changes in the subsequent evaluation design. In a project of this type, which requires intensive team interaction and feedback, sufficient lead time must be allocated to digest the findings and modify the program after each successive stage of the process (Patton, 1980, p. 187).
DISCUSSION

Initiation of evaluation concurrently with program development can maximize the potential of human service delivery. Evaluators, however, must be wary of the pitfalls inherent in such endeavors. By identifying common characteristics of emergent programs that relate to the choice of evaluation techniques, conceptual and methodological errors may be avoided or at least minimized. The characteristics shared by emergent programs discussed here do not constitute the totality of traits impinging on evaluations in the developmental stages of a human service program. But the experience of evaluating the ARC nutrition program provided several insights regarding the conceptualization and design of evaluations for emergent programs that may prove instructive.

• Assumptions about the characteristics of intended target populations cannot be taken for granted, especially in emergent programs where clients are self-selected. Examination of client attributes relevant to program planning should be incorporated into the conceptualization of most formative evaluations.

• Clients may come into the service environment with life experiences or conditions that support or inhibit their ability to achieve desired outcomes. Identification and inclusion of these factors in the conceptual design of an evaluation is integral to understanding how program strategies work.

• The performance of emergent human service programs should not be tied to untested assumptions about potential outcomes. Recognition of the broad aims of a program is essential, even if the evaluation is unable to address all possible consequences.

• When program staff members are involved in the conceptualization of an evaluation, it is important to incorporate external reviewers in the process to ensure that the evaluation does not dictate the shifting of program objectives.

• Normal treatment variation should be permitted during program development. Examination of observed variation provides valuable information regarding formulation of criteria for standardizing treatments.

• Restriction of a program in order to maintain an experimental design is often not acceptable to administrators. Thus, tests of emergent programs must be sufficiently flexible, methodologically, to operate in an environment analogous to the actual setting anticipated.

• Evaluation of emergent programs is most viable when the process is divided into several stages. Such a division provides the program staff with an opportunity to make changes and allows evaluators to address unanticipated situations in the conceptual and methodological design.

• Evaluators of emergent programs need to carefully estimate the amount of time necessary for changes and revisions in the evaluation design between each stage of the evaluative process.
This paper has attempted to develop a more systematic understanding of how the nature of emergent programs may affect evaluations, conceptually and methodologically. Although each human service program has unique attributes that must be considered in an evaluation, a body of knowledge relating common characteristics among classes of human services to evaluation strategies will facilitate and, it is hoped, improve the work of evaluators. In this manner we can more fully develop a theory of evaluation research.
REFERENCES

ATTKISSON, C. C., BROWN, T. R., & HARGREAVES, W. A. (1978). Roles and functions of evaluation in human programs. In C. C. Attkisson, W. A. Hargreaves, M. J. Horowitz, & J. E. Sorensen (Eds.), Evaluation of human service programs (pp. 59-95). New York: Academic Press.

CRONBACH, L. J., et al. (1980). Toward reform of program evaluation. San Francisco: Jossey-Bass.

DAWSON, J. A., & D'AMICO, J. J. (1985). Involving program staff in evaluation studies: A strategy for increasing information use and enriching the data base. Evaluation Review, 9, 173-188.

EDWARDS, P. K. (1984). The American Red Cross nutrition course: Findings from the field test. In Agriculture Outlook 85 (pp. 5X-586). Washington, DC: U.S. Department of Agriculture.

EDWARDS, P. K., ACOCK, A. S., & JOHNSTON, R. L. (1985). Nutrition behavior change: Outcomes of an educational approach. Evaluation Review, 9, 441-459.

EDWARDS, P. K., ORDEN, D., & BUCCOLA, S. T. (1980). Evaluating human service programs with differentiated constituencies. Journal of Applied Behavioral Science, 16, 13-27.

GLENWICK, D. S., STEPHENS, M. A. P., & MAHER, C. A. (1984). On considering the unintended impact of evaluation: Reactive distortions in program goals and activities. Evaluation and Program Planning, 7, 321-327.

LANDSBERG, G. (1983). Program utilization and service utilization studies: A key tool for evaluation. In A. J. Love (Ed.), Developing effective internal evaluation (New Directions for Program Evaluation, no. 20, pp. 93-102). San Francisco: Jossey-Bass.

NEIGHBOR, W. D., & METLAY, W. (1983). Values and methods: Evaluation and management perspectives. In A. J. Love (Ed.), Developing effective internal evaluation (New Directions for Program Evaluation, no. 20, pp. 575-586). San Francisco: Jossey-Bass.

PATTON, M. Q. (1980). Qualitative evaluation methods. Beverly Hills, CA: Sage Publications.

ROSANDER, K., & SIMS, L. S. (1981). Measuring effects of an affective-based nutrition education intervention. Journal of Nutrition Education, 13, 102-105.

ROSSI, P. H., & FREEMAN, H. E. (1982). Evaluation: A systematic approach. Beverly Hills, CA: Sage Publications.

ST. PIERRE, R. G., COOK, T. D., & STRAW, R. B. (1981). An evaluation of the nutrition education and training program. Evaluation and Program Planning, 4, 335-344.