Dialogue with Marsha Mueller

JODY L. FITZPATRICK

Editor: Marsha, the evaluation you and Betty Cooke conducted won the AEA Award for Exemplary Evaluations in 1996. Obviously, there are many things which make an evaluation exemplary. What do you see as the most exemplary aspect of your study?

Mueller: As I understand it, we received the award because of the involvement of staff in the evaluation process. I like involving users in evaluations, but I get really excited when users expect to be involved. This study is worth taking a look at because it shows what can happen when motivated evaluation users work collaboratively with evaluators. We created a demanding plan, given our budget. The only way you could do what we did was to work as partners.

Editor: Your use of staff in the evaluation was a primary piece our readers would be interested in learning about. So, let's spend a little bit of time talking about that. Your use of on-site evaluators, people who also have program delivery responsibilities at each site, draws on participatory evaluation and empowerment evaluation concepts. What do you see as the primary advantages of using program personnel as on-site evaluators?

Mueller: Our site evaluators were ECFE staff and members of Family Education Resources of Minnesota (FERM), the nonprofit organization created to support training and evaluation of ECFE programs. Staff involvement was essential since one of the goals of the study was to build organizational capacity. Involving staff as evaluators also has benefits for families, the program participants. Staff participation in the evaluation helps family educators think critically about families and about what they're doing. Another advantage is an economic one. Our grant was for only $150,000. We also had $20,000 from the Legislature. Without staff participation this project would not have happened; it would have been impossible. Because staff did all the data collection we were able to make very strategic use of special help, the independent raters, for example, and nationally recognized experts, when we needed them. We were able to maximize the use of the dollars we had available for the evaluation. The skill of staff in working with families was another advantage. We needed data collectors who could establish rapport with potential study participants immediately, within a week after they enrolled in ECFE. In other words, we needed people who were used to interacting with families with young children. Finally, staff were a source of program intelligence. They helped us understand the intent of the program, its context and rationale, how staff make judgments about families and what they are trying to accomplish. Their insight helped make the analysis relevant to the program. The program staff contributed heavily to design decisions.

Editor: We talk a lot today about building organizational capacity and thinking critically. Tell me more about what these terms mean in your project.

Mueller: We wanted to do two things. First, we wanted to incrementally expand the pockets of evaluation expertise that are evolving in ECFE. Second, we wanted to enhance staff understanding of the families. We involved 50 ECFE staff; 28 served as site evaluators and 22 were members of FERM. Most are family educators, people who work directly with parents and young children. Yet, in their day-to-day work, they do not always get to know families as well as they did during the evaluation. In typical, day-to-day work with families, they usually work with groups of parents and children in an educational setting. During data collection, they were with individual families in their homes; they had extended, focused conversations with parents; and they took a very systematic look at those conversations when they analyzed transcripts. The data collection and analysis process helped staff understand families in new ways. In looking systematically, staff viewed things differently. For example, they were often surprised at the range of skills and diversity of characteristics we found within our sample of low-income families. For many staff, what we learned about the characteristics and skills of families challenged their mental models of what low-income families are like. Staff were also intrigued by what we learned from examining different perspectives of change. A common observation of site evaluators following completion of the final interview with parents was that the parents didn't talk about all the ways their parenting had changed, things the staff had observed when parents interacted with their child at the program site. Parent survey results, however, showed that many more parents felt their behavior had changed than was apparent from staff or independent rater assessments. Recognition of different kinds of perspectives and discrepancies in findings provided opportunities for staff to discuss their practice in new ways. So, when I talk about evaluation as capacity building I mean thinking about and using the evaluation process as a learning opportunity, a form of staff development or organizational investment, in tandem with conventional purposes.

Editor: Many evaluators, however, are concerned about making extensive use of program personnel to collect data. What concerns did you have? What steps did you take to avoid problems?

Mueller: Of course we had concerns about data quality: how people go about collecting the data, how they handle it once they've got it, how they relate to study participants. I try to handle these concerns by understanding what people (FERM members and site evaluators in this case) are bringing to the table: their expectations for the evaluation, their skills or experience, and most importantly, their commitment to the task. My job is to get to know an evaluation team very thoroughly. I've been involved in probably 50 evaluations, most of which used staff in one way or another. My first task is to get to know staff so I can plan appropriate training or other support into the evaluation process. In this case, we provided training at four intervals and allowed for a lot of practice up-front. We had a real technical mix of things staff needed to do. It was essential that they be comfortable with the technologies they were going to use, the tape recorder, video camera, approaching families and engaging them in the study as well as understanding the study and evaluation logic. We had a six-month pilot phase which was very important for learning and fine-tuning skills, whether it was interviewing, managing the data, organizing data files, taking care of transcripts, or whatever. I prepared two very detailed handbooks which gave staff suggestions, down to specifics on coding, the structure for transcripts, timelines, etc. The pilot also gave me an idea, based on my review of pilot data and conversations with site evaluators, about the quality of the information we would get in the study. And that was really important. We had a complex plan; we were taking some big risks. If staff could not meet quality standards or figure out how to implement the process in their district, I wanted to find that out during the pilot. Betty and I also built in an auditing process. Each family file (this included audio tapes, transcripts, survey responses, video tapes and site evaluator process notes) was reviewed by at least two different people (another site evaluator and a graduate research assistant). Betty and I also reviewed a sample of family files.

Editor: Who were the on-site evaluators? What other roles and responsibilities did they have? What was their relationship to you? To their organization? How were they selected?

Mueller: Site evaluators were family educators and school district employees. We were partners in this endeavor; neither Betty nor I was their supervisor. Site evaluators were selected by their district coordinator. All had programming or administrative responsibilities in addition to their participation in the evaluation.

Editor: Did you develop any guidelines for the district coordinator to use in selecting site evaluators?

Mueller: We asked the coordinator to select people who were interested and wanted to learn more about their families and evaluation. We wanted people who were committed to the idea. Obviously, we also needed people who were willing to commit the time to do it.

Editor: You mentioned that some of the site evaluators were administrators. Did they see families?

Mueller: Yes, ECFE is hierarchically flat; their people perform a variety of tasks and most administrators see families.

Editor: What was the educational background of the evaluators? And what were their expectations of evaluation?

Mueller: We had a mix of bachelor's and master's level people, even some Ph.D. folks. We were able to do some sophisticated stuff; by that I mean having staff take care of a full range of evaluation tasks and using multiple methods to collect data, not so much due to their educational level as their interest and commitment to the evaluation. Some had been involved in prior ECFE evaluations. Some had not. It was a purposeful mix of expertise. We wanted to expand the pockets of evaluation expertise that are evolving in ECFE, thoughtfully. In many cases more experienced site evaluators were paired with novices.

Editor: I noticed the site evaluators used the Stimulated Response Interviews (SRIs) as an opportunity for teaching. Tell us about that.

Mueller: This is something we added based on staff experience in the pilot phase. The SRI involves parents viewing segments of the videotapes and then responding to questions about what they observed. The questions posed to parents are very open-ended. For example, "What's going on here? What are your reactions to what you saw?" Parent responses are tape recorded and transcribed. The intent of the SRI is to understand how parents talk about their behavior with their child when confronted with it on a videotape. In the pilot, site evaluators realized there were a lot of good things happening in the videotape that parents didn't talk about, but the evaluators couldn't say anything about them without influencing parent responses. As parent educators, however, they felt they were missing an opportunity to reinforce good parenting skills. We decided to build in a teaching opportunity. After the interview was completed, the site evaluators, if they wanted to, could turn off the tape recorder and say, "There are a couple of things I really want to show you." This became a very powerful teaching tool and a piece of the evaluation methodology that districts are modifying and incorporating in their work. When we analyzed information from parents about outcomes, study families reported more changes in their children than did other low-income program families. Our hunch is that this was due to participation in the study, particularly the learning aspect of the SRI which stressed observation and involved discussion of parent behavior with the parent.

(Editor's Note: We see here a phenomenon not unusual in evaluation and research, that of changes in study participants being due, somewhat, to the study itself, i.e., the Hawthorne effect. What is noteworthy here is that the program deliverers, seeing the beneficial effects of some components of the research, incorporated these into their future programs. While the SRI was an additional component that the study families experienced, they also received more individual attention through home visits and interviews. Note Mueller's observation that site evaluators learned more because they usually dealt with families in groups on-site; the study gave them an opportunity to make individual home visits.)

Editor: Let me move away from staff involvement for a minute and ask you about the evaluation overall. What was the impetus for your study? Who initiated it and why?

Mueller: This was the second outcome study of ECFE. The first study, a $20,000 grant from the legislature to look at outcomes, made use of staff to interview parents about outcomes. The sample for that study was mainly middle-income mothers. The results were positive, but questions emerged: What can we learn about low-income families in ECFE? What kinds of outcomes do low-income families demonstrate? So, for the second study the focus was on what can we learn rather than responding to a legislative mandate. It was a discovery rather than an application to a direct policy decision. At the request of FERM, Betty Cooke submitted a proposal to the McKnight Foundation for evaluation funds. McKnight awarded FERM $150,000 for the study. The same year, the Minnesota legislature earmarked $10,000 per year for evaluation of ECFE.

(Editor's Note: This use of evaluation for discovery rather than for a direct application to a decision is one that has become more widely recognized in recent years. Direct, instrumental use occurs and, as Cook (1997) has recently argued, may be more frequent than we have suspected. However, the use of evaluation for enlightenment is an accepted and valued purpose.)

Editor: Who did you see as the primary audiences for information? The legislature?

Mueller: I make a clear distinction between primary intended users and evaluation audiences. The primary intended users of the evaluation information and process were ECFE staff, the approximately 2,000 family educators working in Minnesota school districts. They had questions about outcomes and wanted information about outcomes to help them make decisions about how to improve their work with families. Staff also wanted information about outcomes to share with legislators. The second purpose of the evaluation, as I mentioned before, was building organizational capacity, so ECFE staff were also the primary intended users of the evaluation process, certainly not the legislature, for they could care less about process use. Audiences for the evaluation, those who may need or want information about the program and program outcomes, included legislators and other professionals around the country involved in family support and education programs.

Editor: You note that a 60-member team participated in planning of all phases of the study. How did that work? That's a large group to process.

Mueller: This team never met as an entire group. The nine group meetings held during the 30-month project included three planning sessions with FERM, five one- or two-day workshops for site evaluators and one meeting with FERM members and site evaluators to review findings and formulate recommendations. Group meetings and workshops were carefully planned. Each meeting was designed to accomplish evaluation tasks and to provide training and opportunities for staff to discuss what they were learning about families and evaluation. The questions for the study were really established with the FERM group, though there's overlap between the site evaluators and FERM. So FERM members were the ones involved in the major planning sessions of the study.

Editor: Most of your interactions with on-site evaluators were in training for data collection, though they were involved in a final meeting with FERM to review findings?

Mueller: Keep in mind that almost all FERM members are ECFE staff and some FERM members served as site evaluators. Evaluation purposes and questions and design decisions were established with FERM. We did some revision of our design based on staff experiences with the pilot. Our training sessions, as I just mentioned, went well beyond skill development. Many FERM members (ECFE staff) participated in the workshops even though they had no data collection responsibilities. Our review meeting was a day-long, very intensive and active working session. Staff discussed and debated findings and formulated their recommendations for inclusion in the report.

Editor: Let's discuss your methodology briefly. You used a purposeful sampling strategy. Tell us about why you chose this strategy.

Mueller: Random sampling did not fit our situation, and we also didn't have the budget to implement random sampling effectively. We wanted to learn things about lower-income families so we purposefully involved districts that had higher numbers of lower-income participants. We also had to work around very practical constraints. We wanted to study families new to the program, and we wanted to involve them within a couple of days after they enrolled to limit the amount of exposure they had to the program before we collected baseline information. This is another example of how staff's program intelligence really helped. Let me add a side bar here. Apparently it's common for parents to pick up "ECFE language" pretty quickly; parents pick up on how parent educators talk with them and how they talk about children. So we had to move fast to get the baseline data. We also wanted at least 100 families to complete the study and felt, given attrition, that we should begin with 150. We wanted families with incomes of less than $30,000 but fairly evenly distributed between three income sub-groups ($0-$9,999, $10,000-$19,999, and $20,000-$30,000). We also hoped our study families would mirror the ethnic diversity of low-income families in ECFE. To identify potential study families we asked all new ECFE families enrolling during the first six weeks of the fall to complete a survey. Site evaluators recruited four to six of the earliest survey respondents meeting study group criteria to participate.

Editor: Why did you define "low income" as those with annual family incomes below $30,000? That's pretty high. Median family incomes are close to that.

Mueller: A couple of points. We wanted to look at a range of low-income families. Part of it was practical. Income was broken down into six or seven categories on the enrollment form; we wanted the three lowest. Also, I should point out that we're looking at a specific program in a particular state. I think the 1990 Census statewide median family income was around $37,000. The 7-county metro area median family income was around $41,000. When we look at study families, 72% had incomes less than $20,000.

Editor: What was your refusal rate?

Mueller: Very few families refused. Some site evaluators had to work harder than others. For the most part, staff skills in working with families really helped here. In some cases, they had to look at the first ten families instead of the first four to six that met the selection criteria. For people who were reluctant to participate, the reason given was often not wanting someone to come to their home. So for some families, we did the interviews and videotaping at the center. We had a 79% completion rate. Families not completing the study typically dropped out because they moved or went back to work and were not participating in the program.

Editor: You used a mix of surveys, interviews, and videotapes of parents and children to collect information. How did you decide to select these particular methods?

Mueller: The strategies we selected depended on the evaluation questions themselves and our desire to understand what happens to families from three perspectives: parents, staff and independent raters. For example, the survey gave us descriptive information on families and their enrollment patterns. We used interviews because we learned in the earlier study that interviews provided in-depth information about how parents talk about their children and their behavior; they were very useful in helping staff understand what families know. Staff also learn a lot about families by interviewing them. We used observations with notes and the videotapes as another means of collecting information about behavior. I was skeptical about using video cameras initially. I felt it would be intrusive. Betty used videotaping in her dissertation research, but I wasn't convinced. Before we made the decision I visited a program where videotaping is used extensively with teen parents. The videotaping was an accepted part of that program's environment. It didn't appear intrusive at all. As a matter of fact, it struck me as a great way to help parents understand their own parent-child interaction. When we started the study, very few districts were using videotaping or had video cameras, so we spent time during the pilot learning how to use the technology and figuring out how and where to borrow cameras.

Editor: What role did the FERM and others play in these decisions?

Mueller: FERM members were very involved in making decisions, and their reflections on past evaluations were helpful in shaping what we did. I would present different options to the group and FERM members would choose what we did. Staff comments and feedback from some academic reviewers also influenced how we structured the analysis. The first outcome study had been criticized because staff collected and analyzed data. Some academic reviewers for this study were extremely skeptical about staff involvement in data collection. To address these concerns, we built in the third perspective about parent behavior: independent ratings. We used Ph.D. candidates from the Institute of Child Development at the University of Minnesota, recommended by one of the reviewers, to rate the videotapes. These independent raters were experts in analyzing videotapes of parent-child interaction.

Editor: In regard to your results, you note that you found that correlations between the staff's ratings of parents' behaviors and the independent raters' ratings of videotapes were statistically significant (r = .33 for fall, .27 for spring, .23 for change scores). Were these correlations as high as you expected?

Mueller: We hadn't done correlations between ECFE staff and independent raters before so we had no idea of what to expect. I have a lot of confidence in staff assessments. However, it was important to prove to ourselves that they're doing a quality job so we wanted to compare them with independent ratings. The staff and independent raters were using different measures. The staff assessments were based on their analysis of parent interviews, and the raters assessed videotapes of parent-child interaction, but they were measuring the same concept. It was of interest to us that the scores were in the same direction. The correlations were positive and significant. This suggests that although staff and independent raters used different scales, they rated parents' behavior in similar ways.

Editor: Which of your results do you think show the strongest support for the program?

Mueller: I think what we learned points to both strengths and weaknesses of the program. In terms of strengths, I would point to a mix of things rather than just one finding. First, low-income families voluntarily participate in this non-targeted program; 58% of new enrollees at study sites were low-income. Second, ECFE attracts a diverse group of low-income families in terms of prior knowledge of child development and parenting skills, demographic characteristics, number of risk factors and social support. Third, parents are very satisfied with the program and report changes in their parenting skills and in their children. For example, 92% of parents reported positive differences in their awareness and understanding of children and child development, in their confidence as a parent, and in social support. Seventy-two percent reported improvements in how they relate to their child. Changes in children's behavior reported by parents included increased independence, improved language and communication skills, improved relationships with other children and more self-confidence. Finally, ECFE is helping parents learn more about child development and parenting; according to staff assessments, 34% demonstrated positive change in their knowledge and awareness of their own child.

(Editor's Note: Low-income families certainly do participate in the program; however, sites that were more likely to have low-income families were selected in order to examine program effects on this group. So, the 58% low-income is probably not representative of all sites. And, some of those "low income" families have incomes quite close to the median.)
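(An illustrative aside on the staff versus independent-rater comparison discussed a few exchanges above: the sketch below shows how paired ratings made on different scales can be compared with a Pearson correlation. The ratings are invented for illustration, and the use of scipy is my assumption; the study's actual instruments, samples, and procedures are described in the full report.)

# Illustrative sketch only: invented paired ratings, not study data.
from scipy.stats import pearsonr

# One score per parent from each perspective, on different scales:
# staff codes derived from interview transcripts, raters scoring videotapes.
staff_scores = [1, 2, 2, 3, 1, 3, 2, 2, 3, 1]
independent_scores = [2.0, 3.5, 2.5, 4.0, 2.5, 4.5, 3.0, 2.0, 4.0, 1.5]

r, p = pearsonr(staff_scores, independent_scores)
print(f"r = {r:.2f}, p = {p:.3f}")
# A positive, significant r says the two perspectives order parents similarly
# even though the instruments differ, which is the point Mueller makes above.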

Editor: Let me ask you, which results do you feel are the weakest regarding program outcomes?

Mueller: The piece that provides the most pause for thought is that knowledge change exceeds behavior change. That's an important piece, and the piece people talk about.

Editor: What's your view on this?

Mueller: We spent a lot of time talking about this, and staff recommended that districts review all educational methods and look for ways to help parents observe, practice and reflect on parenting behavior and their parent-child relationships. They are acting on this recommendation. What's interesting about this finding are the different interpretations made by different audiences. Legislators typically feel the percentage of families changing their behavior (as assessed by staff or independent raters) should be much higher. Academics and professionals working in family support and education programs are more apt to understand the tenacity of parenting behaviors and the difficulty of changing patterns that are formed over many years. One of our reviewers, an expert in child development, predicted we would find limited, if any, change in parenting behavior after 6-8 months of program exposure.

Editor: Let me ask a little bit about how you analyzed the knowledge and behavior changes. You report most of your results in terms of percentage changes. How did you judge your results? That is, how did you determine whether the amount of change was sufficient? Did you consider using statistical tests or setting explicit standards?

Mueller: Let me take your last question first. No, we did not consider statistical tests of significance to assess change findings. Given our design it would have been inappropriate to do so. How we analyzed knowledge and behavior change is briefly described in the summary and explained in detail in the full report. Look carefully at the statement of themes and findings. They are descriptive rather than evaluative, which is appropriate.

Editor: But, you do talk about change. You describe changes that occurred. The implication is that these are meaningful. Did you set some criteria for the types of changes you expected?

Mueller: We were looking to see what would happen. The evaluation was for program staff and program improvement. It was sufficient in this case to look and be fairly descriptive about it.

Editor: One of your issues of concern was the appropriateness of universal access programs for low-income families. You conclude that universal access is effective. On what basis did you draw that conclusion?

Mueller: Let me clarify the universal access statement. The theme actually reads "ECFE's universal access approach is effective with many different low-income families." In essence, we're saying this non-targeted approach is robust. One of our analysis questions was, who benefits? We wanted to see if we could identify a set of characteristics common to those parents demonstrating change. We couldn't. Families that demonstrated change are diverse. Characteristics of parents we know something about accounted for very little of the variance in parent score change for either staff or independent ratings.

Editor: So, you're basing your conclusion on the effectiveness of universal access on the fact that the demographic variables in the regression were not significant. But, the regression also showed that hours of participation was not significant. So, how can you conclude it works?

Mueller: A couple of points on demographics and participation: As I note in the report, the finding of no difference by demographic sub-group is not unique to this evaluation. Other evaluations have found that although parents demonstrate change, outcomes could not be predicted by demographic attributes of participants. The participation finding is consistent with what we've learned about adults in other learning programs that are voluntary and flexible; people learn at different rates and have different reasons for coming, staying, or leaving. In this study, the distribution for hours of exposure was quite flat. The average was 42 hours, but the range was 8 to 126 hours. This is not unusual in a program that is voluntary and flexible.

(Editor's Note: The lack of effect for participation may be due more to the difficulty in changing parenting behavior with a relatively limited amount of contact, e.g., two hours a week. As Mueller notes, one of their reviewers predicted limited, if any, change in behavior. The large range of participation would, if anything, make it easier to see an effect for participation.)

Editor: On a related issue, you note that for families with moderate and high level skills the program maintained their skills. Is maintaining their skills sufficient? Should the program focus only on families with low-level skills?

Mueller: On a simple level, it is helpful to know that a program doesn't harm people; nobody wants to fund, administer or participate in a program that demonstrates negative change. So, is maintaining skills sufficient? First let me explain what we mean by low, moderate and high. For staff assessments, low scores reflect developmentally inappropriate knowledge of child development or uncertainty about what one does as a parent or uncertainty about their child's behavior. Moderate scores were assigned to parent responses demonstrating basic and appropriate child development knowledge or parenting behavior. A high score was assigned when the parent response reflected a relatively sophisticated understanding of child development and parenting skill. High scores reflect integration of developmentally appropriate knowledge and awareness of their child's behavior in relation to themselves and others. ECFE programs work toward and support parent development at the medium and high levels. In our analysis we were most concerned about the lowest group and whether positive movement was demonstrated. OK, should the program focus only on families with low-skill levels? Segregation by skill level is not part of ECFE's philosophy. One reason for using non-targeted approaches is to avoid the stigma attached to labels. ECFE's philosophy includes the notion that all families with young children can benefit. A cornerstone of ECFE's approach is based, in part, on the assumption that parent-child development is supported not only by staff (professional family educators) but by other parents with young children. Parents coming with limited or inappropriate child development knowledge and low levels of parenting behavior interact with parents exhibiting moderate or high skill levels. Parents not only hear from family educators about appropriate practices but interact with and learn from their peers, parents who demonstrate those skills.

Editor: Is this evaluation advocating ECFE's universal access approach over targeted strategies?

Mueller: No. What this evaluation does do is provide some information about what happens to low-income families in a non-targeted family support and education program. This is useful information. Most family support and early childhood programs are targeted to low-income families. Universal access approaches are not typically considered; they are not an obvious choice.

Editor: Let me move to the dissemination of your results. One of the advantages of on-site evaluators concerns the ease in communicating results. How were your results communicated and used on-site? Did you train the evaluators in means for communicating results? Do you think they were more or less successful than an external evaluator would have been?

Mueller: Let's look at communication in two ways: dissemination externally and communication within ECFE related to application. I mentioned dissemination in my summary. We started dissemination efforts during the pilot phase. With site evaluators and FERM members I made a conscious effort to provide them with different ways of talking about the study every time we met. For example, I had a brief 1-2 page update on the study that we talked about at each workshop. I also included a two-page summary of the evaluation plan in each handbook. At every meeting we talked about what we were doing and why. Our workshop sessions were communication sessions. There was a lot of internal communication there. But, we didn't specifically talk about how to communicate on-site. They had a lot of materials and had to talk to other staff in their districts as well as state and local decision-makers about what they were doing. What's most important I think, given the capacity-building purpose of the study, is what staff are doing to share and apply what they learned with other districts not in the study. I talked about some of that in my summary. Site evaluators took the initiative to plan and conduct workshops and are serving as consultants to their colleagues (in other districts). There is a real emphasis on observation skills for parents and staff; pieces of the evaluation methodology are making their way into programming. Betty and I did not plan this and I'm not involved. It's neat; it's a great example of process use. I'm really excited about it.

Editor: Would this kind of application happen if the evaluation was conducted entirely by external evaluators?

Mueller: I doubt it, no.

Editor: Did you visit the sites?

Mueller: A few in the metro area. Our budget did not allow for extensive travel.

Editor: What prompted your visits?

Mueller: To some extent, curiosity. I like to have a sense for what's going on. One visit occurred during the planning phase, and I wanted to observe how that site used video cameras.

Editor: Could you describe to us the site which made the "best" use of the evaluation? How did they use it? What enabled them to be the "best" site in this regard?

Mueller: All sites were active. I honestly cannot tell you specifically who has used it more than another. From what I've seen and heard, they're making exceptional use of the information.

Editor: How did you monitor what went on on-site?

Mueller: Our budget didn't allow for on-site monitoring. What we did do was rely on the workshop process to make sure staff members were clear about what needed doing and why, and our auditing process to monitor data quality. We also had the site evaluators keep process notes, filling out their reactions to issues: how parents responded, what worked well, what didn't. They kept track of their emerging conclusions and their initial hunches. After they finished data collection, we had them complete a survey which asked for their evaluation of the process. I asked what they would recommend doing differently. They indicated they would all do it again, that they had learned more about the families and about themselves. The down side they cited was the time factor. We pushed the limits of what could be done.

Editor: There were obviously many strengths to your study, but all evaluations have weaknesses. If you were to do this evaluation again, what would you do differently?

Mueller: Oh my, there are always things you would like to change or fine tune. One weakness, which was brought out in the recommendations, ultimately comes back to sampling. The recommendation noted that ECFE needs to learn more about how they can better serve families of color. In the study, the ethnic mix was not what we expected. We needed a different way of engaging families of color in the study, particularly African American families. Also, there are questions we wanted to get at but couldn't. The most important of those was, "What do we know about child outcomes?" We are trying some interesting approaches to child assessment (the Work Sampling System developed by Samuel Meisels from the University of Michigan) in Minnesota's Learning Readiness and First Grade Preparedness Programs, but there is no way we could have incorporated work sampling into the evaluation plan since it is as labor intensive as the work we did. With regard to the capacity-building purpose of this evaluation, it is interesting to think about what could be done differently. This project required a major, major investment of staff time. Site evaluators received a nominal honorarium, but it in no way covered their contributions. The evaluation process and findings are being used and applied, the payback is pretty good, but I think there may be smarter ways (more efficient ways) to combine evaluation and capacity-building purposes. You know, if ECFE got a million dollar grant for evaluation tomorrow I probably would not recommend that they do a million dollar study. I would probably suggest they consider multiple projects, involving more staff but in less complex projects. They have an extensive evaluation agenda, and the iterative, single project approach seems really slow.

Editor’s Commentary:

The Minnesota Early Childhood Family Education program evaluation presents a nice contrast to the first evaluation reviewed in this series. This evaluation is formative in intent and actively involves staff in planning and data collection. Exemplary elements include the active involvement of FERM members in the evaluation, the use of site personnel for data collection and exploration of results, the development of and experimentation with innovative outcome measures (the SRI), and the obvious impact this evaluation, and others in the series, has had on capacity-building in the organization. It appears that the evaluation and the organizational environment have encouraged staff to contribute to a true learning environment, looking at their processes and outcomes and experimenting with different approaches.

While this evaluation has many noteworthy qualities, I wanted to comment on two concerns. First, I am confused about the discussion of changes and outcomes in both our interview and the report without reference to any standards, criteria, or statistical tests. This study relies, instead, on percentages to report the observed changes, often in ways that suggest rather large effects. Thus, Mueller notes that "the number of parents receiving low ratings on the Parent Behavior Rating Scale decreased by 27%." This finding is also noted prominently in the summary report. Further reading of the full report shows that the 33% of parents who scored low on the pre-program measure was reduced to 25% on the post-program measure. This is a 27% reduction; however, in all, only 8% of parents showed a change in behavior from pre-program to post-program measures. This latter finding might have been a more appropriate summary statistic to emphasize if one is relying on percentages. Whether these changes are meaningful, in terms of program standards, or due to more than chance cannot be determined. While the author discusses the purpose as descriptive, I am concerned that many readers would not come to that conclusion when reading about "outcome themes." While the author notes a different design would have been required to use statistics, statistics and/or standards are frequently used in the literature to examine the import of observed changes. Randomized designs can be used to establish causality, but statistical significance is frequently explored to identify relationships. Further, the authors go on to use regression analyses and statistical significance of those analyses to assess the universal access component. A combination of standards and tests of statistical significance would have been useful to help readers judge the merits of the described changes.
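(A brief numeric aside on the two ways of framing that change. The 33%, 25%, 8%, and 27% figures are the ones quoted above; everything else in this sketch, including the variable names, is mine and purely illustrative.)

# Illustrative arithmetic only, using the rounded percentages quoted above.
low_pre = 0.33   # share of parents rated "low" on the pre-program measure
low_post = 0.25  # share rated "low" on the post-program measure

absolute_drop = low_pre - low_post       # 0.08 -> an 8 percentage-point change
relative_drop = absolute_drop / low_pre  # ~0.24 with these rounded figures;
                                         # the report's 27% presumably reflects unrounded counts

print(f"absolute drop: {absolute_drop * 100:.0f} percentage points")
print(f"relative drop: {relative_drop:.0%} of the pre-program 'low' group")
# The same movement reads as "an 8-point change" or as "roughly a quarter fewer
# low ratings" depending on the denominator, which is the distinction drawn above.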

My second concern relates to the extent of interaction between the author and the site evaluators. As Mueller states, they were certainly able to do much more in this study due to the staff's involvement in data collection. And, I have no doubt that their involvement in the data collection itself contributed to a change in the organization and their learning. Further, the FERM members, some of whom were site evaluators, were actively involved in the planning of the study. These elements should be actively considered by other evaluators. However, I was somewhat surprised at the author's limited awareness of variation across sites and their (possibly related) lack of opportunity to visit the sites, nine of which were in the metropolitan area. There seemed to be a distance between the program and data collection, on the one hand, and the administration, interpretation, and reporting of the evaluation, on the other, that I had assumed was not typical of participatory evaluations. It is critical to observe the programs we evaluate in action and, particularly when using on-site data collectors, to observe staff engaged in data collection in order to have a full understanding of both the programs of interest and the evaluation itself.

Nevertheless, Mueller's efforts have enlightened us and the child development community. She and her colleagues have greatly enriched the organizational capacity of ECFE sites through their work. In this way, the evaluation will have a far more long-lasting effect than the results of this one study.

ACKNOWLEDGMENTS

This evaluation was supported by a grant from The McKnight Foundation to Family Education Resources of Minnesota (FERM) and funds appropriated by the Minnesota legislature. I would like to commend the fifty ECFE professionals involved in the study: Dr. Betty Cooke; Lois Engstrom with the Minnesota Department of Children, Families and Learning; members of FERM; and the 28 district staff. This group demonstrated the important contributions users bring to the evaluation enterprise.

NOTE

1. For copies of the evaluation report or more information about ECFE, contact Betty Cooke, Early Childhood and Family Initiatives Specialist, or Lois Engstrom, Supervisor of Early Childhood and Family Initiatives, Minnesota Department of Children, Families and Learning, 992 Capitol Square Building, 550 Cedar Street, St. Paul, MN 55101. Tel: (612) 296-8414, E-mail: [email protected].

REFERENCES

Cook, T. D. (1997). Lessons learned in evaluation over the past 25 years. In E. Chelimsky & W. R. Shadish (Eds.), Evaluation for the 21st century. Thousand Oaks, CA: Sage.

Cooke, B. (1992). Changing times, changing families: Minnesota early childhood family education parent outcome interview study. St. Paul, MN: Minnesota Department of Education.

Minnesota Department of Education. (1986, March 1). Evaluation study of early childhood family education: Report to the legislature. St. Paul, MN: Minnesota Department of Education.

Mueller, M. R. (1996a). Immediate outcomes of lower income participants in Minnesota's universal access early childhood family education. St. Paul, MN: Department of Children, Families and Learning.

Mueller, M. R. (1996b, August). Observations: Second year follow-up, changing times, changing families, Phase II. Unpublished briefing report to Family Education Resources of Minnesota, Minneapolis, Minnesota.

Mueller, M. R. (1995). Changing times, changing families II: Spring evaluation guide. St. Paul, MN: Department of Children, Families and Learning.

Mueller, M. R. (1994). Changing times, changing families II: Fall evaluation guide. St. Paul, MN: Department of Children, Families and Learning.

Patton, M. Q. (1997). Utilization-focused evaluation: The new century text. Thousand Oaks, CA: Sage.

Weiss, H., & Halpern, R. (1990). Community-based family support and education programs: Something old or something new? New York: National Center for Children in Poverty, Columbia University.