Evaluation and Program Planning, Vol. 17, No. 4, pp. 381-390, 1994
Copyright © 1994 Elsevier Science Ltd
Printed in the USA. All rights reserved
0149-7189/94 $6.00 + .00

Pergamon

0149-7189(94)00033-6
EVALUATING SERVICES DEMONSTRATION PROGRAMS: A Multistage Approach

WILLIAM E. SCHLENGER, E. JOYCE ROLAND, LARRY A. KROUTIL, and MICHAEL L. DENNIS
Center for Social Research and Policy Analysis, Research Triangle Institute

KATHRYN M. MAGRUDER
Services Research Branch, National Institute of Mental Health

BARBARA A. RAY
Office of Applied Studies, Substance Abuse and Mental Health Services Administration
ABSTRACT

This paper describes the rationale for and an example of a multistage approach to evaluating services demonstration programs that takes account of the important design limitations often found in such programs. These limitations may include one or more of the following: (a) the lack of control or comparison groups - i.e., there is no requirement for experimental or quasi-experimental design; (b) the lack of a common intervention - i.e., the demonstrations involve multiple projects, each of which is implementing a different intervention; and (c) the lack of a common data collection structure across projects. We view the evaluation of services demonstration programs as one step in the broader process through which new interventions are conceived, tested, and ultimately disseminated. We describe an approach to the evaluation of services demonstrations that provides an empirical basis for identifying those projects that appear “promising” (i.e., appear to be fulfilling the demonstration's objectives), and then describes a subset of promising projects in detail. Findings from such an evaluation can provide the basis for moving into the next phase, in which the effectiveness of one or more promising models is tested experimentally. As an example of how this approach can be applied, we describe the design of the National Evaluation of Models for Linking Drug Abuse Treatment and Primary Care, the evaluation of a federally-funded services demonstration that was aimed at examining alternative strategies for improving the linkage between the drug abuse treatment and primary care systems.
The work described in this manuscript was supported in part by Contract No. 282-88-0019/3 from the Public Health Service (PHS), Contract No. 283-90-0001 from the National Institute on Drug Abuse (NIDA), and Grant No. P50-DA06990 from NIDA. The authors gratefully acknowledge the important contributions of Sander Genser, MD, the NIDA project officer for the evaluation; Howard Lerner, the Health Resources and Services Administration (HRSA) liaison with the evaluation; and the directors, staffs, and clients of the 21 Linkage Demonstration projects that participated in the National Evaluation. We also appreciate the thoughtful critiques by two anonymous peer reviewers of an earlier draft of this article. Requests for reprints should be sent to Dr. William E. Schlenger at Research Triangle Institute, P.O. Box 12194, Research Triangle Park, NC 27709-2194.
With the adoption of the “New Federalism” by the Reagan Administration in the early 1980s, the Federal role in the delivery of a variety of social services to disadvantaged individuals changed dramatically. Prior to that time, the Federal Government supported the provision of a wide variety of health, mental health, and substance abuse treatment services directly, through grants and contracts with providers across the country. The New Federalism, however, transferred responsibility for decisions about service provision to the states, with funds provided by the Federal Government through the block grant mechanism. During the 1980s, however, many of the laws enacted by Congress aimed at addressing the Nation's social problems included language that emphasized an alternative role for the Federal Government in the services arena. This role involved support of demonstration programs aimed at testing innovative ways of improving or enhancing the delivery of services. In these programs, which we will refer to as services demonstration programs, the Government issues a Request for Applications (RFA) that invites proposals for projects that demonstrate novel approaches to some specified problem (e.g., improved drug abuse treatment). Services demonstration programs are intended as vehicles through which innovative approaches to psychosocial treatment, case management, and/or services integration can be tried out in real-world settings, and are expected to provide an empirical basis for motivating the dissemination of successful enhancements and innovations across the Nation.

With the new emphasis on services demonstration programs, the Federal Government's role changed from one of providing services to one of generating information that would help improve the provision of services. The growth of this new role is exemplified by the Substance Abuse and Mental Health Services Administration (SAMHSA), a new agency within the Public Health Service (PHS) that was created on October 1, 1992, by the legislation (Public Law 102-321) that separated the service delivery and research functions of the former Alcohol, Drug Abuse, and Mental Health Administration (ADAMHA). One important purpose of this separation was to enhance the Federal leadership role in improving the delivery of mental health and substance abuse services. The separation was accomplished by creating SAMHSA to administer the service delivery functions related to mental health and substance abuse (demonstration programs and block grants), and by transferring into the National Institutes of Health (NIH) the three ADAMHA research institutes - the National Institute of Mental Health (NIMH), the National Institute on Drug Abuse (NIDA), and the National Institute on Alcohol Abuse and Alcoholism (NIAAA). During Fiscal Year 1993, SAMHSA provided grants through more than a dozen major services demonstration programs in the
mental health and substance abuse fields, with total support of over $360 million per year (Federal Budget, 1993). Thus in the mental health and substance abuse fields alone, Federal support of services demonstrations is big business. Given that substantial resources are being devoted to services demonstration programs, it is important to maximize what is learned from the experiences of the demonstration grantees and to disseminate the lessons learned to those who may be able to profit from them. Consequently, it is clear that the return-on-investment from such programs depends on well-designed, well-executed, and well-disseminated evaluations.
BACKGROUND: COMPLICATIONS IN EVALUATING DEMONSTRATION PROGRAMS

In this paper, we wish to focus on one of these important elements: evaluation design. We focus on the design element because many of the services demonstration programs that have been funded by SAMHSA and other agencies in recent years have characteristics that limit substantially the conclusions that can be drawn from the evaluations of them. More general reviews and descriptions of approaches to implementation evaluation have been provided elsewhere (Brekke, 1987; Dennis, 1990; Gray, 1986; Hall & Loucks, 1977; Lebow, 1982; Madaus, Scriven, & Stufflebeam, 1983; Rossi & Freeman, 1985; Scheirer, 1986; Scheirer & Rezmovic, 1983; Scheirer, 1994). Our purpose here is to focus on the specific issues in evaluating and benefiting from the implementation of Federally funded services demonstration programs. Although the constraints may vary from demonstration to demonstration, many (though not all) of the current services demonstration programs funded through SAMHSA share one or more of three characteristics that have important implications for evaluation design. These are:
• Lack of control or comparison groups - i.e., there is no specific requirement in the demonstration for experimental or quasi-experimental design. Rather, each grantee in the demonstration simply implements whatever intervention it proposed to demonstrate, without regard to a “comparison” or “control” condition.
• Lack of a common intervention or protocol - i.e., each grantee in the demonstration implements whatever intervention it proposed in whatever way it proposed, so a demonstration with N grantees usually involves at least that many different interventions. This is in contrast to a multisite collaborative study, in which each participating site implements one or more common interventions using a standard protocol.
• Lack of a common data collection structure. Although each grantee typically collects data about its demonstration project, there is no mechanism to assure consistency in the nature of the data collected or the procedures by which they are collected.
The lack of a requirement for experimental or quasi-experimental design often forces the evaluator into a pre/post or time series design, and limits the extent to which causal attributions can be made. Lack of a common intervention means that the demonstration is in essence a set of independent, N = 1 intervention studies, reducing the evaluator's ability to make generalizations about any specific intervention. Consequently, analytic interest is shifted away from the mean (e.g., what is the average effect?) and toward separate examination of each individual site's performance. Lack of common data collection procedures makes it difficult to combine data across demonstration sites, limiting external validity, and may also force the evaluator into a post-only design.
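To make that analytic shift concrete, consider the following minimal sketch (our illustration, not part of the original evaluation; the site names and pre/post scores are invented). With each site implementing a different intervention, the defensible summaries are site-by-site; a pooled mean is computable but averages over distinct interventions:

```python
# Illustrative sketch only: invented pre/post scores for three sites,
# each implementing a *different* intervention (no common protocol).

sites = {
    # site name: list of (pre_score, post_score) pairs for its clients
    "Site A": [(10, 14), (12, 15), (9, 13)],
    "Site B": [(11, 11), (10, 12), (13, 12)],
    "Site C": [(8, 15), (9, 16), (10, 18)],
}

def mean_change(pairs):
    """Average post-minus-pre change across a set of clients."""
    return sum(post - pre for pre, post in pairs) / len(pairs)

# Site-by-site summaries: each is, in effect, an N = 1 intervention study.
for name, pairs in sites.items():
    print(f"{name}: mean pre/post change = {mean_change(pairs):+.1f}")

# A pooled mean is computable but conflates three distinct interventions,
# so it answers no question about any one of them.
pooled = [pair for pairs in sites.values() for pair in pairs]
print(f"Pooled mean change = {mean_change(pooled):+.1f}")
```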
As a further complication, these kinds of services demonstration programs have often included multiple evaluation requirements. Grantees are typically responsible for conducting an evaluation of their own project and also for participating in a “national evaluation” whose design is not specified in advance for prospective applicants. When these and other constraints are present, it is clear that evaluations of the demonstration programs will not provide definitive assessments of the effectiveness of the interventions that are tested. Nevertheless, they can produce useful information that advances the state of knowledge about those interventions. The challenge to the evaluator is to design an evaluation that: (a) addresses the substantive objectives of the demonstration, and (b) maximizes the knowledge that can be extracted from the experiences of demonstration grantees, given the constraints.

In our view, the evaluation of services demonstration programs should be seen as one step in the broader process through which new interventions are conceived, assessed for efficacy (i.e., does the intervention produce the desired outcome under carefully controlled conditions?), assessed for effectiveness (i.e., does it work under real-world conditions?), and ultimately disseminated. Although they will not provide definitive evidence about intervention efficacy or effectiveness, carefully designed evaluations of services demonstration programs that have one or more of the limitations described above can provide useful descriptive information about new psychosocial interventions and how they can be applied in real-world settings. We view the demonstration program mechanism as a fertile field for determining the feasibility of new interventions, for providing descriptive information about their implementation in specific settings, and for providing preliminary, descriptive evidence about potential effectiveness.

In the remainder of this paper, we describe the evaluation design for a major Federal service delivery demonstration that had all three of the problematic characteristics cited above. We offer the design of this evaluation as an example of an evaluation that takes account of both the objectives of the demonstration and the constraints imposed by the demonstration's characteristics. We begin by providing some descriptive information about the demonstration and its objectives and then turn to the design.
THE ADAMHA/HRSA LINKAGE DEMONSTRATION

The continuing acquired immune deficiency syndrome (AIDS) epidemic has made painfully apparent the separation of the existing systems through which drug abuse treatment and primary care services are delivered in the United States. Because they engage in a variety of high-risk behaviors, drug users put themselves, their sexual partners, and their children at risk for exposure to infection with the human immunodeficiency virus (HIV) and therefore, ultimately, to the development of AIDS and to a variety of other infectious diseases (e.g., other sexually transmitted diseases, hepatitis B, tuberculosis). Drug users are at higher risk of being infected with these diseases and passing them on to other people through a variety of mechanisms, including: needle sharing, trading sex for drugs, having sex with multiple partners, and perinatal transmission. Drug users are therefore a group that is both in need of direct treatment for substance abuse and health care problems and an ideal target for preventive public health efforts. Recognition of the important role of drug use in the spread of AIDS has therefore focused attention on the relationship between the drug abuse treatment and primary care delivery systems.

Groups representing both the drug abuse treatment and primary care communities have recently advocated a more integrated approach to meeting the multiple service needs of drug users (Association of State and Territorial Health Officers, 1988; Health Resources and Services Administration Consultant Workgroup, 1989; Matheny, 1989; Ray, 1988). Professionals representing both perspectives agree that people should be assessed and treated for the full range of their problems regardless of where they “enter” the treatment system. In addition, many have advocated expanding the focus of care to include the sexual partners and families of drug users, since doing so might represent an effective form of outreach to persons who may not be receiving care through either system.
There are several reasons why the need for improved linkage between substance abuse treatment and primary care has recently become more widely recognized. First, awareness of the fact that substance abuse problems and physical health problems frequently co-occur (Haverkos & Lange, 1990) has grown. This awareness has been facilitated by the recognition of the role of intravenous drug use in the spread of HIV infection, and consequently in the development of AIDS (Turner, Miller, & Moses, 1989). Second, the increased recognition of the co-occurrence of substance abuse and physical health problems has served to highlight the relative independence of the systems of care through which these problems are treated. Thus the changing clinical demands of practice in both substance abuse treatment and primary care have made the problems caused by separate systems of care painfully acute. Third, epidemiologic evidence suggesting that many substance abusers do not seek treatment for their substance abuse problems (Shapiro et al., 1984) has underscored the need for active efforts to identify substance abusers and recruit them into treatment. This has led to the emphasis on the potential role of primary care providers in the identification and treatment of substance abuse. Thus, improved linkage has been identified as a potential mechanism through which: (a) the service needs of today's substance abuse treatment clients can be more comprehensively addressed - a direct treatment objective, and (b) untreated substance abusers can be identified and treated early - a secondary prevention objective.

To foster the development of models for improving the linkage between drug abuse treatment and primary care, an interagency agreement was established in 1989 between the Federal agencies that had primary responsibility for substance abuse and primary care service delivery - the Alcohol, Drug Abuse, and Mental Health Administration (ADAMHA) and the Health Resources and Services Administration (HRSA). Under this agreement, ADAMHA transferred funds to HRSA to underwrite a demonstration program in which alternative approaches to “linked” drug abuse treatment and primary care could be tried out. The demonstration, which began in 1989, was to run for 3 years, at a cost of $9 million per year. Additionally, ADAMHA agreed to fund an evaluation of the demonstration, which would focus on identifying and describing promising linkage models. The goals of the ADAMHA/HRSA linkage demonstration program, as specified in the RFA, were to implement service systems that could:
• Recognize and treat substance abusers in the primary care system.
• Recognize and treat substance abuse treatment clients with health care problems.
• Develop and demonstrate the feasibility of establishing linked systems of care in communities with demonstrated service needs.

The RFA did not require applicants to include experimental or quasi-experimental comparisons, did not specify a common intervention, and included no specified data collection mechanisms. It did require applicants to describe their plans for evaluating their own projects, and to “cooperate” with a “national” (i.e., cross-site) evaluation.

A total of 101 applications were received in response to the RFA. These applications were reviewed by four panels of peers, and 21 applications were selected to receive demonstration grants. Although the projects selected were diverse, all were focused on the public health system (e.g., publicly-funded drug treatment programs, community health centers, county departments of health) and included no private sector programs.

Thus the rationale underlying the demonstration involved recognition that the independence of the drug treatment and primary care systems was problematic and that there was to date little experience in testing ways to overcome that independence. Therefore, it seemed clear that what was needed was a mechanism through which grantees could try out a variety of approaches to improving the linkage between substance abuse treatment and primary care. The program was conceived of as a demonstration, rather than as a multisite effectiveness study, because: (a) it seemed clear that there would be more than one way of achieving improved linkage, and (b) there was no single approach that was a candidate for testing in a controlled, multisite trial. Consequently, the primary objective of the demonstration was to support implementation of a variety of approaches to linkage.
EVALUATION DESIGN
The primary objective of the evaluation, on the other hand, was to identify and describe one or more “promising” models of linkage. These models could then be disseminated to others who were trying to address the problem of improving the delivery of drug abuse treatment and primary care and would be candidates for further testing under more controlled conditions in subsequent studies. Thus the evaluation design team set out to develop a design through which we could identify and describe promising models from among the approaches taken by the 21 demonstration grantees. The basic design (Schlenger, Dennis, & Magruder-Habib, 1990) called for a two-phase approach. The first phase focused on identifying from among the 21 grantees a subset that represented “promising models,” and the second phase involved
developing a detailed description of each of the promising models. Therefore, we would first need a mechanism for identifying promising models and then a mechanism for describing them.

Phase One: Identifying Promising Models

But how would we know a promising model if we saw one? The major elements of our logic were as follows. First, it is clear that before an approach could be judged to be promising, it must first have been adequately implemented. Over the past few decades evaluators and services researchers have learned on the basis of hard experience that empirical assessment of implementation must be the first step in a comprehensive evaluation (Brekke, 1987; Madaus et al., 1983; Scheirer & Rezmovic, 1983). Implementation evaluations provide systematic feedback on the extent to which a proposed program or intervention has in fact been carried out - i.e., whether staff and other necessary resources are in place, whether planned services and activities are being carried out, and whether the intended population is being served. The basic idea is straightforward: before evaluating the “effectiveness” or other aspects of an intervention, it is important to assess whether that intervention has in fact been applied. Thus, our basis for identifying promising approaches was to rest on actual performance, not on conceptualization or assumption.

But how could we recognize that an approach to linkage had been implemented? The underlying idea of “linkage” recognizes two important current realities of human service delivery: (a) that drug abuse treatment and primary care are currently delivered by two largely separate systems, and (b) that many people have both substance abuse problems and health care problems. Linkage projects attempt to overcome this separation of the systems of care so that the full spectrum of service needs of people who need both types of care is recognized and comprehensively treated any time a person comes into contact with either system. In principle, then, an appropriate “gold standard” for judging the implementation of linkage projects might be the following: drug abuse treatment and primary care can be said to be “linked” when all people who seek care in either system - i.e., whether they come in to a neighborhood health center or a community-based drug treatment center - are comprehensively assessed, appropriately served, and conscientiously followed up. By “comprehensively assessed,” we mean that all patients/clients routinely undergo an assessment and screening process that systematically checks both for substance abuse problems and for health problems. By “appropriately served,” we mean that all problems that are identified in the assessment process are addressed using protocols that reflect the current state of the art. Finally, by emphasizing “conscientious follow-up” we acknowledge that substance abuse is a chronic problem (as are many health problems) whose treatment is likely to be episodic, thereby requiring a systematic plan for continuing oversight and reevaluation.
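Stated as a predicate, this gold standard might be sketched as follows (our illustration only; the record fields and their names are invented stand-ins, not the evaluation's actual variables):

```python
# Illustrative sketch: invented field names standing in for the three
# elements of the "gold standard" described above.

def meets_linkage_standard(client: dict) -> bool:
    """True if a client record reflects 'linked' care as defined above."""
    # Comprehensively assessed: screened for both kinds of problems.
    assessed = (client["screened_for_substance_abuse"]
                and client["screened_for_health_problems"])
    # Appropriately served: every identified problem has a matching service.
    served = all(client["services_received"].get(problem, False)
                 for problem in client["problems_identified"])
    # Conscientiously followed up: a continuing-care plan is on record.
    followed_up = client["followup_plan"] is not None
    return assessed and served and followed_up

example_client = {
    "screened_for_substance_abuse": True,
    "screened_for_health_problems": True,
    "problems_identified": ["cocaine dependence", "hepatitis B"],
    "services_received": {"cocaine dependence": True, "hepatitis B": True},
    "followup_plan": "90-day reassessment",
}
print(meets_linkage_standard(example_client))  # True
```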
Thus we opted for an empirically-based definition of “promising” that focused on the delivery of services. The focus on delivery of services, rather than the outcome of those services, was appropriate for two reasons. First, the interventions being tested in the demonstration were system-level interventions, whose hoped-for impact was improved service delivery. That is, grantees were not testing new treatments; rather, they were testing new organizational arrangements that might help them provide existing services more comprehensively. Second, the lack of experimental design made the demonstration unsuitable for evaluations of effectiveness. That is, the demonstration was aimed at trying out ways of improving the delivery of existing services, rather than testing the effectiveness of new types of service.

Given the focus on service delivery as the criterion for identifying promising approaches, our evaluation questions for the first phase were also service focused. They included:
• Who is being served by the demonstration grantees?
• What services are they receiving?
• How many clients are receiving both drug abuse treatment and primary care services?
• What linkage models have been implemented by the demonstration grantees?
Answering these questions, of course, requires data. We used data from two sources. First, all grantees in the demonstration were required to submit quarterly reports of their activities to HRSA so that it could perform its oversight function. We looked at these reports as one source of information but were concerned about the potential for self-report bias. We felt, however, that grantee self-report data could provide a useful basis for classifying grantees in an a priori taxonomy of linkage approaches that the evaluation team developed. This taxonomy was based on the perspective of the service user and identified four different models: (a) centralized, where drug treatment and primary care services are offered at a single location [“one-stop shopping”]; (b) decentralized, where primary care and drug treatment services are offered at different locations; (c) mixed, where a limited number of primary care and drug services are offered at one site, but most services are delivered at other sites; and (d) transitional, in which the location where services are delivered purposefully changes over the user's treatment history.

In addition to the quarterly reports, however, we still needed a source of descriptive information about the clients who were being served by the grantees that could provide an empirical basis for identifying promising approaches. Therefore, in collaboration with the grantees we identified a second data source: a client-level “core data set” (Schlenger et al., 1990). This core data set was conceived of as basic information about client characteristics and services received that is typically maintained in service provider records. The idea was to identify a common set of items that were already included in the records of all demonstration sites and that would provide information relevant to the evaluation questions specified above. If we had such information about all of the clients served by the demonstration grantees, we could answer the major evaluation questions through simple tabulations. Also, since this information was already being collected and recorded by the sites, it could be abstracted directly from grantee records, thereby reducing the data collection burden on the grantees and their patients. Working collaboratively with the grantees, we were able to come to agreement on a core data set that met these criteria. The core data set included information about five major topics of interest to the demonstration: (a) sociodemographic characteristics, (b) substance abuse history, (c) medical history, (d) service needs, and (e) service utilization. We also developed a record abstraction form that covered the full data set so that an abstractor could go to the records of a given grant project and abstract the full core data set. We hired independent abstractors at each of the 21 grantee sites to abstract the information from grantee records for all patients served by the grantee during the first year of the grant.

The abstraction effort yielded core data set information about more than 2,400 clients who were served by the demonstration grantees during the first grant year. These data served as the basis of a report aimed at answering the basic evaluation questions (Schlenger, Kroutil, Roland, & Dennis, 1992a) and as the primary basis for the identification of promising approaches. Findings suggested: (a) that most of the grantees had implemented system changes that improved the linkage between drug abuse treatment and primary care services, although the intervention that was actually implemented was often different in important ways from the “intended” intervention that had been described in the application; (b) that many clients were being served and receiving a variety of services; and (c) that a variety of approaches to linkage were represented in the demonstration.

Table 1 shows an example tabulation of data from the core data set. As noted above, since aggregation across grantees is less meaningful in the context of a demonstration program, findings for each grantee are shown separately.
TABLE 1
LIFETIME USE OF SPECIFIC SUBSTANCES BY SUBSTANCE-ABUSING CLIENTS, BY LINKAGE MODEL AND SITE

                                            Lifetime Use, %
Site                           Alcohol   Cocaine   Marijuana   Heroin   Other
Centralized:
  Dept. of HRS                    82.7      67.6        68.3      8.6    35.3
  Economic Oppor. CHC             83.1      68.5        66.2      4.6    16.9
  Montefiore Med. Ctr.            38.0      83.5        27.8     79.7    51.9
  Multnomah Co.                   90.0      56.3        72.6     43.7    62.1
  Seattle-King Co.                92.6      38.9        35.2      9.3    24.1
  Sunset Park CHC                 86.5      95.5        75.3     46.1    44.9
Decentralized:
  Charles Drew CHC                83.0      57.0        63.0      9.0    35.0
  Children's Hospital             82.2      40.4        80.8      8.5    75.1
  Maricopa Co.                    96.1      98.4        87.5     97.7    89.1
  Samuel U. Rodgers CHC           73.7      52.6        26.3     15.8    26.3
  Southside CHC                   51.1      78.7        40.4      6.4     6.4
Mixed:
  City of Detroit                 21.3      37.5        10.0     96.3    36.3
  Denver City/County              92.0      90.5        86.4    100.0    82.9
  Desire Narcotics                 9.1      57.6        19.2     66.7    27.3
  Erie Family Health CHC          44.6      68.3        23.8     79.2    21.8
  Metro-Dade Co.                  56.2      89.9        36.0     24.7    58.4
  Univ. of Texas                  91.8      51.6        96.7     15.6    91.0
Multiple Models:
  Fenway CHC                      84.8      70.7        33.2     41.8    18.5
  Great Brook Valley CHC          57.1      56.1        26.5     90.8    25.5
  San Francisco City/Co.          76.3      34.2        18.4     55.3    42.1
  State of IL                     75.8      85.3        54.7     73.7    61.1
TOTAL (unweighted N = 2,312)      74.1      66.9        67.2     47.2    49.5

Source: Schlenger et al., 1992a.
Grantees are, however, arranged in the table according to an a priori assessment of the type of linkage model they were attempting to implement. As a result, the table provides site-specific findings but does so in a way that facilitates interpretation (Yin, 1984).
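As a rough illustration of the kind of tabulation and “sifting” this phase involved, the sketch below uses invented client records and arbitrary cutoffs (the actual core data set and selection criteria were considerably richer): it groups abstracted records by site and linkage model, tabulates how many clients received both kinds of service, and flags sites meeting simple service-delivery criteria of the sort used to identify promising projects (described in the next section):

```python
# Illustrative reconstruction only: invented abstraction records and
# arbitrary cutoffs. The real core data set covered sociodemographics,
# substance abuse history, medical history, service needs, and utilization.
from collections import defaultdict

# Each record: (site, linkage_model, got_drug_treatment, got_primary_care)
records = [
    ("Sunset Park CHC",  "centralized",   True,  True),
    ("Sunset Park CHC",  "centralized",   True,  True),
    ("Charles Drew CHC", "decentralized", True,  False),
    ("Charles Drew CHC", "decentralized", True,  True),
    ("City of Detroit",  "mixed",         False, True),
]

by_site = defaultdict(lambda: {"model": None, "n": 0, "both": 0})
for site, model, drug_tx, primary_care in records:
    s = by_site[site]
    s["model"] = model
    s["n"] += 1                                 # clients served at this site
    s["both"] += int(drug_tx and primary_care)  # clients receiving both services

MIN_CLIENTS, MIN_BOTH_PCT = 2, 50.0  # invented implementation-check cutoffs

for site, s in sorted(by_site.items(), key=lambda kv: kv[1]["model"]):
    pct_both = 100.0 * s["both"] / s["n"]
    flagged = s["n"] >= MIN_CLIENTS and pct_both >= MIN_BOTH_PCT
    print(f"{site:17s} {s['model']:13s} n={s['n']:2d} "
          f"both={pct_both:5.1f}%  promising={flagged}")
```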
Phase Two: Describing Promising Models

On the basis of detailed review of the descriptive data about clients and services that we abstracted from program records, we selected a subset of grantees as “promising models” whose approach and activities would be described in more detail. The basic purpose of this phase was to document the experiences of “promising” projects so that others attempting to improve linkage in their own communities could benefit. We limited the sample to nine so that the case study phase would not be subject to regulations of the Office of Management and Budget (OMB) that cover all federally-mandated collections of data from more than nine subjects.

Selecting Projects To Be Studied. But how would we recognize a promising model? We reasoned that, at a minimum, to be considered a “promising” approach to linkage we should have empirical evidence that the project had: (a) served adequate numbers of people who had both substance abuse treatment and health care needs, and (b) actually delivered both kinds of services to clients who needed them. These criteria represent in essence an implementation check. Examination of the abstraction data allowed us to “sift” the 21 grantees and identify those who served more clients and provided more “linked” services to those who needed them.

Because the abstraction data suggested that more than nine of the grantees were both identifying adequate numbers of people with substance abuse treatment and health care needs and actually delivering services to them, we decided to balance a variety of other factors in making our selection of nine to be studied in detail. These included our desire to include:

• A mix of the different types of linkage models;
• Projects that address both sides of the linkage coin - i.e., those that are identifying substance-abusing clients in primary care and those that are providing primary care in the context of drug treatment;
• A mix of different drug treatment modalities (e.g., methadone maintenance, outpatient drug-free, residential);
• Examples of projects delivering or placing a heavy emphasis on “other” services, such as street outreach or case management; and
• At least some programs serving special populations, such as women or high-risk youth.

Based on these and other factors, we selected the nine demonstration grantees that would be studied in more detail. Selection of these specific projects did not mean that we believed them to be necessarily the nine “best” projects in the demonstration, according to whatever criteria could be used to compare programs (e.g., number of clients served, number retained in treatment or successfully completing treatment, number of clients who received a particular service or set of services). Rather, we viewed them as the set that best fit the objectives for the case study phase of the evaluation, which was to describe a variety of “promising models.”

Case Study Methods. We opted to develop the detailed descriptions of the “promising models” using the case study method (Yin, 1984). The case study is a flexible methodology that is well-suited to the assessment of diverse projects that are being implemented in varying, uncontrolled, and changing environments, since no assumption is made about uniformity across projects (Merriam, 1988; Shadish, Cook, & Leviton, 1991). As a result, the case study method has become increasingly popular with investigators attempting to address complex research questions that are not amenable to study via randomized experiments or quasi-experimental designs (Smith & Glass, 1987). The Linkage Demonstration projects had characteristics that are best accommodated in case studies, including: (a) multiple data sources that provide both quantitative and qualitative data; (b) wide variation in site characteristics and in the type and availability of data across projects; (c) wide diversity in grantees' local contexts, resource availability, and subsequent speed of implementation; and (d) changes in projects' structure, content, and context over time.

The case studies were based on two major sources of information: (a) applications, progress reports, and other documentation submitted by grantees during the demonstration, and (b) interviews and observations made by the research team during site visits. In preparation for the site visits, we first reviewed the site's documentation. Then we contacted project directors to inform them of the kinds of information we would be collecting through the site visits and the categories of people we would plan to talk with during the visits. Project directors were informed that the major questions that the site visits and case studies were designed to address included:

• What was the problem or need that the demonstration project was designed to address?
• What were the primary care and drug treatment systems like before the grant?
• What changes were brought about by the demonstration project (i.e., how grant funds were spent and the changes that resulted)?
• How is linkage achieved at a particular project?
• How does the “linked” system that has been created relate to its “environment” (e.g., other service providers, marketing, etc.)? and
• What important lessons have been learned that would be valuable to others who are considering ways to improve linkage in their own communities?
We also specified the kinds of people we wanted to interview during the site visits. These included: the project director (himself or herself); other relevant administrators; clinical directors or medical directors of affiliated drug treatment and primary care providers; and direct service providers, including drug treatment counselors, primary care providers (e.g., physicians, nurse practitioners), social workers, and case managers. We also specified our interest in talking with individuals who were not directly involved with the linkage project, whom we referred to as “stakeholders,” but who nevertheless would be reasonably knowledgeable about the aims of the linkage demonstration and what the project had accomplished, particularly with regard to impacts on the broader service delivery system; such individuals would potentially provide a more detached, “outsider's” perspective on the linkage project. The kinds of “stakeholders” we were interested in talking with included: regulatory officials or other drug treatment and health care system administrators (e.g., state and local substance abuse or health officials who were familiar with the project); other local drug treatment, primary care, and social service providers not directly affiliated with the linkage project; and representatives of community or patient advocacy groups.

Interviews were conducted using topic guides developed for each of the different types of people we would be interviewing (e.g., project director, medical director, service provider, stakeholder) and covering the issues with which we believed they would be most familiar. The topic guides served not as a set of structured questions to ask interviewees, but rather were designed to provide a basic framework or starting point for addressing a specific set of issues during an interview. Beyond that, interviewers were expected to exercise considerable flexibility and discretion in following up on specific points that interviewees made (or based on observations) and in asking relevant probes. Thus, interviews during the site visits were designed to be semistructured.

Each case study resulted in a detailed report describing a specific demonstration project. These reports included: a summary of the drug abuse problem in the grantee's community; a description of the drug treatment and primary care systems in the grantee's community before the demonstration began; the intervention that the grantee actually implemented; the “linked” service system; gaps in the linked system; barriers encountered by the grantee; and lessons learned. The nine case studies, together with a cross-site analysis that focused on the broader lessons learned and the implications for the future, were combined into a “case
book” on linkage (Schlenger et al., 1992b). One of the important lessons learned was the observation that linkage could be achieved in a variety of ways. Colocation of services (a “one-stop shopping” approach) was an important element in many projects’ approach to linkage. Colocation seemed to have at least two kinds of benefits. First, it facilitated client access to services by reducing geographic and bureaucratic barriers to service utilization. Second, it facilitated interaction among providers of different types, which seemed to improve communication about specific cases, awareness of service system resources, and development of a shared understanding among providers of the nature of substance abuse and its relationship to other aspects of clients’ lives. Another important lesson learned through the evaluation was that case management was an important element in virtually all linkage strategies. Although the roles and functions of case managers varied considerably among the projects, most sites adopted a case management role that might be best described as case manager/counselor. That is, in addition to identifying client needs, locating and arranging for appropriate services, and monitoring client progress, most projects expanded the case manager’s role to include the provision of supportive counseling. The rationale for doing so typically involved the perceived need for a continuous, supportive human presence across the client’s treatment episode. A third important lesson learned was that linkage with mental health services was an important element not addressed directly in this demonstration. In many sites it was clear that mental health providers, particularly those whose training and experience may have encompassed both the primary care and substance abuse arenas (e.g., psychiatrists, psychiatric nurses), played a pivotal role in the establishment of linkages. They seemed to have done so by facilitating the development of a common understanding on the part of providers of the nature of clients’ problems and the appropriate treatment of them. The evaluation also noted some implications for future directions. These included the important issue of finding resources to pay for linked services when the demonstration program ends, the importance of planning for continuing care for chronic problems like substance abuse, and the need to develop new, brief interventions for substance users identified via screening in primary care settings.
DISCUSSION

In this paper we have described one approach to the evaluation of services demonstration programs when certain limitations are present. Because substantial resources are currently being devoted to services demonstration programs, it is important to maximize the information that results from them. That is, the lessons that are learned in
these demonstrations should be accessible not only to the participating grantees, but also to the broader scientific and professional communities. Well-designed, cross-site evaluations of such programs are an important vehicle through which lessons can be learned and communicated to others.

There are a number of benefits of the multistage approach we have described. First, it is empirically based - i.e., it is designed to provide empirical answers to the evaluation questions. Second, it uses multiple sources of data and uses each source for purposes to which it is best suited. Third, it takes account of the constraints associated with the demonstration's design and acknowledges the limitations that those constraints place on the definitiveness of the conclusions that can be drawn. Fourth, it provides information that is responsive to the objectives of the demonstration. In doing so, it represents an attempt to produce the most useful information possible under the existing constraints. All of these benefits, in our view, fit under the rubric: “Match the design to the objectives and the constraints.”

From the practical perspective, our experience with this evaluation has also yielded some lessons. First, definition of a core data set, and involvement of the grantees in the definition process at the beginning of the demonstration, was clearly beneficial. In doing so we focused on data that were already being collected for other purposes (e.g., clinical records) but that met the evaluation's information needs as well. Involvement of grantees in the process helped assure that the desired information was in fact in the records when abstractors went looking for it. Second, the multistage design allowed us to use evaluation resources efficiently by collecting progressively more detailed data from progressively fewer projects. That is, at the first stage, we collected basic descriptive data from all participating grantees and used those data to select a subset that would be studied in more detail. This “sifting” process is efficient in its use of resources, reduces the burden of data collection on grantees, and is effective in targeting the evaluation resources on the projects from whose experiences the most can be learned.

The notion of sifting to find promising approaches differs from the more traditional approach to multisite evaluations (MSEs), which emphasize the pooling of data across sites to increase external validity and statistical power (Turpin & Sinacore, 1991) or the use of meta-analysis to assess average effects (Cordray, 1993). All of these approaches are appropriate under certain circumstances. The choice between them should be based on the evaluation's objectives and the design limitations present in the demonstration, though the sifting approach can be applied and may be appropriate even in the context of randomized experiments.

It is important to note that not all services demonstration programs have the limitations that we have identified in this paper.
Many of SAMHSA's demonstration programs include requirements for random assignment, are testing a common intervention, and require collection of common data. Unfortunately, these requirements are not included in all programs, in spite of the fact that the enabling legislation for many services demonstrations includes a mandate for process and outcome evaluation. Federal agencies should strive to eliminate the kinds of design characteristics described in this paper that limit the power of demonstration evaluations and instead include design features that provide the basis for more definitive outcome evaluations. Doing so will increase substantially the return on investment from demonstrations to practitioners, to science, and ultimately to clients and their families. Dennis (1994) describes the scientific basis for a variety of approaches to improving the implementation of randomized field experiments, and Dennis and Boruch (in press) provide a less technical list of practical suggestions for implementing stronger designs.

Our focus in this paper has been on what evaluations of demonstration programs can tell us about the interventions that the grantees are testing, and we have identified some design characteristics that limit the conclusions that can be drawn from such evaluations. We have not focused, however, on another important goal of evaluation, which is to improve program implementation and management. That is, there is clearly an important role for evaluative information in the management of demonstration projects, particularly where new interventions are being implemented. In our view, however, such management functions are best addressed in the “local” evaluations of each specific project, and questions about the interventions are best addressed in a “national” (i.e., cross-site) evaluation conducted by an independent evaluator.

To summarize, we have argued that substantial resources are currently being allocated to services demonstration programs, and therefore it is important to maximize what is learned through such programs. The fundamental goal of service demonstrations is to provide grantees with the opportunity to implement novel approaches to improving service delivery in real-world settings. Consequently, we have recommended an implementation-focused approach to the evaluation of these programs, one that takes account of the important scientific limitations often found in them. This approach is grounded in the view that demonstration programs represent one step in the broader process through which interventions are conceived, tested for feasibility, tested for efficacy, and ultimately disseminated into routine practice. Recognition of design limitations when they are present, and of the role that services demonstrations can play in the broader process, allows the evaluator to match the evaluation design to both the limitations and the objectives of the demonstration, thereby improving both the validity and ultimate utility of its findings.
REFERENCES
ASSOCIATION OF STATE AND TERRITORIAL HEALTH OFFICERS (1988). Intravenous drug use and HIV transmission: Recommendations by the ASTHO Committee on HIV. Washington, DC: Author.

BREKKE, J.S. (1987). The model-guided method of monitoring program implementation. Evaluation Review, 11, 281-300.

CORDRAY, D.S. (1993). Strengthening causal interpretations of nonexperimental data: The role of meta-analysis. New Directions for Program Evaluation, 60, 59-96.

DENNIS, M.L. (1990). Assessing the validity of randomized field experiments: An example from drug abuse treatment research. Evaluation Review, 14, 347-373.

DENNIS, M.L. (1994). Ethical and practical randomized field experiments. In J.S. Wholey, H. Hatry, & K. Newcomer (Eds.), Handbook of practical program evaluation (pp. 155-197). San Francisco: Jossey-Bass.

DENNIS, M.L., & BORUCH, R.F. (in press). Improving the quality of randomized field experiments: Tricks of the trade. New Directions for Program Evaluation.

FEDERAL BUDGET. (1993). Washington, DC: Superintendent of Documents, U.S. Government Printing Office.

GRAY, W. (1986). A role for evaluators for helping new programs succeed. In J.S. Wholey, M.S. Abramson, & C. Bellavita (Eds.), Performance and credibility. Lexington, MA: Lexington Books.

HALL, G.E., & LOUCKS, S.F. (1977). A developmental model for determining whether the treatment is actually implemented. American Educational Research Journal, 14, 263-276.

HAVERKOS, H.W., & LANGE, W.R. (1990). Serious infections other than human immunodeficiency virus among intravenous drug abusers. Journal of Infectious Diseases, 161, 894-902.

HEALTH RESOURCES AND SERVICES ADMINISTRATION CONSULTANT WORKGROUP (1989). Report of the HRSA Consultant Workgroup on the role of primary care in the prevention and treatment of drug abuse and AIDS. Rockville, MD: Health Resources and Services Administration.

LEBOW, J. (1982). Models for evaluating services at community mental health centers. Hospital and Community Psychiatry, 33, 1010-1014.

MADAUS, G.F., SCRIVEN, M., & STUFFLEBEAM, D.L. (1983). Evaluation models. Boston, MA: Kluwer-Nijhoff Publishing.

MATHENY, S.C. (1989). Health Services and Resources Administration AIDS activities. Rockville, MD: Health Resources and Services Administration.

MERRIAM, S.B. (1988). Case study research in education: A qualitative approach. San Francisco, CA: Jossey-Bass.

RAY, B.A. (1988). Linking to primary care. Rockville, MD: Alcohol, Drug Abuse, and Mental Health Administration.

ROSSI, P.H., & FREEMAN, H.E. (1985). Evaluation: A systematic approach (3rd ed.). Beverly Hills, CA: Sage.

SCHEIRER, M.A. (1986). Managing innovation: A framework for measuring implementation. In J.S. Wholey, M.S. Abramson, & C. Bellavita (Eds.), Performance and credibility. Lexington, MA: Lexington Books.

SCHEIRER, M.A. (1994). Designing and using process evaluation. In J.S. Wholey, H.P. Hatry, & K.E. Newcomer (Eds.), Handbook of practical program evaluation (pp. 40-68). San Francisco, CA: Jossey-Bass.

SCHEIRER, M.A., & REZMOVIC, E.L. (1983). Measuring the degree of program implementation: A methodological review. Evaluation Review, 7, 599-633.

SCHLENGER, W.E., DENNIS, M.L., & MAGRUDER-HABIB, K. (1990). Design for a national evaluation of model demonstrations for linking primary care to drug abuse services as a step in reducing the spread of AIDS (Alcohol, Drug Abuse, and Mental Health Administration Contract No. 282-88-0019/3). Research Triangle Park, NC: Research Triangle Institute.

SCHLENGER, W.E., KROUTIL, L.A., ROLAND, E.J., & DENNIS, M.L. (1992a). National evaluation of models for linking drug abuse treatment and primary care: Descriptive report of phase one findings (National Institute on Drug Abuse Contract No. 283-90-0001). Research Triangle Park, NC: Research Triangle Institute.

SCHLENGER, W.E., KROUTIL, L.A., ROLAND, E.J., ETHERIDGE, R.M., COX, J., CADDELL, J.M., WOODS, M.G., FOUNTAIN, D.L., BRINDIS, C., & BLACHMAN, L. (1992b). National evaluation of models for linking drug abuse treatment and primary care: Linking drug treatment and primary care - A casebook (National Institute on Drug Abuse Contract No. 283-90-0001). Research Triangle Park, NC: Research Triangle Institute.

SHADISH, W.R., COOK, T.D., & LEVITON, L.C. (1991). Foundations of program evaluation. Newbury Park, CA: Sage.

SHAPIRO, S., SKINNER, E., KESSLER, L., VON KORFF, M., GERMAN, P., TISCHLER, G., LEAF, P., BENHAM, L., COTTLER, L., & REGIER, D. (1984). Utilization of health and mental health services: Three Epidemiologic Catchment Area sites. Archives of General Psychiatry, 41, 971-978.

SMITH, M.L., & GLASS, G.V. (1987). Research and evaluation in education and the social sciences. Englewood Cliffs, NJ: Prentice-Hall.

TURNER, C.F., MILLER, H.G., & MOSES, L.E. (Eds.). (1989). AIDS: Sexual behavior and intravenous drug use. Washington, DC: National Academy Press.

TURPIN, R.S., & SINACORE, J.M. (1991). Multiple sites in evaluation research: A survey of organizational and methodological issues. New Directions for Program Evaluation, 50, 5-18.

YIN, R.K. (1984). Case study research: Design and methods. London: Sage.