V. EMPIRICAL VALIDITY OF INDIRECT MENTAL HEALTH NEEDS-ASSESSMENT MODELS IN COLORADO

Evaluation and Program Planning, Vol. 15, pp. 181-194, 1992. Printed in the USA. Copyright © 1992 Pergamon Press Ltd. All rights reserved. 0149-7189/92 $5.00 + .00

DAN L. TWEED
Department of Psychiatry, Duke University Medical Center

JAMES A. CIARLO
Mental Health Systems Evaluation Project, Department of Psychology, University of Denver

LEE A. KIRKPATRICK
College of William and Mary

DAVID L. SHERN
Bureau of Survey and Evaluation Research, New York State Office of Mental Health

ABSTRACT

This article presents the results of empirical validation studies of six statistical models currently available for indirectly estimating the need for alcohol, drug abuse, and mental health (ADM) services across a large geographic area such as a state. Initial analyses show that the basic assumptions regarding geographic variations in need which are shared by all such models can be substantiated by data from the Colorado Social Health Survey (CSHS) of diagnosable disorders, dysfunction in everyday living, and demoralization. Next, the performance of the original versions of the six models on three statistical tests of the match between model predictions and surveyed ADM service needs is presented. The tests show how well each model's predictions of need prevalence in 48 small subareas of Colorado compared with surveyed needs in those same areas. Results indicate that in their original form, none of the models were accurate predictors of surveyed need for ADM services in Colorado, but that they did show predictive potential in terms of their epidemiologic relationships with surveyed need. Following "optimization" of the model equations by adjusting their parameters to best fit the CSHS data, repeated analyses indicated that all models except one were considerably more accurate predictors than a flat-rate model based on the assumption of equal need across all subareas. A modified two-variable linear regression model developed by Slem was the best all-around performer on the evaluations of prediction accuracy, followed by the Synthetic Estimation model. Two "experimental" regression models containing the better predictors found among the original models (poverty and divorce rates) also performed very well. The demonstrated performance of these optimized and experimental models provides additional evidence of the utility of selected models as services planning and budgeting tools for states. These models are further evaluated with respect to frequently designated high-priority "target" populations in the next article.

This research was supported by National Institute of Mental Health Grant No. MH37698, James A. Ciarlo, Principal Investigator. Additional funding was provided by the National Institute of Drug Abuse and the Hunt Alternatives Fund, Denver, Colorado. Requests for reprints should be sent to Dr. James A. Ciarlo, Mental Health Systems Evaluation Project, Department of Psychology, University of Denver, Denver, CO 80208.


This article presents concepts and procedures for evaluating the empirical performance of several proposed statistical models for indirectly estimating need for alcohol, drug abuse, and mental health (ADM) services. The models are described in detail in Article IV of this series. Model evaluations are based upon comparisons of the predictions of each model for 48 different Colorado subareas with the results of directly surveyed ADM needs for those same areas derived from the Colorado Social Health Survey (CSHS) of diagnosable ADM disorders, everyday dysfunction, and demoralization. These evaluations thereby also address the scientific viability of the indirect needs-assessment enterprise as a whole, since this type of needs assessment presumes that certain statistical indicators related to need for ADM services (e.g., poverty rates, unemployment rates, divorce or separation rates, and so forth) can be used to draw inferences about the prevalence of mental health and substance abuse problems in a small geographic area. Thus, if all tested models performed poorly in predicting need across the 48 demographically varying subareas of Colorado surveyed in this study, the viability of indirect assessment of need for services by means of these small-area social indicators (or "risk factors" in person-related terminology) could come into serious question. On the other hand, satisfactory performance of even one model would considerably strengthen the scientific and epidemiologic underpinnings of indirect needs-assessment efforts generally. This information has both scientific and practical applications, as many state mental health service systems are already implementing such models as part of their services planning process without prior model validation studies.

Next, a framework is outlined for empirical evaluation of performance of the predictor models against the observed CSHS prevalence rates in the Colorado subareas. This section emphasizes the importance of careful selection of the validating criteria in such model evaluation efforts, since the "match" between predicted and surveyed need can vary dramatically as a function of the particular dependent (or validating) need variable chosen. Three statistical procedures - tests of bias, absolute prediction errors, and correlation - are described to evaluate the match between model predictions and observed need prevalence.

In the latter part of the article, evaluation results in terms of these statistical tests are presented for the six proposed models, both in their original parametric forms and in "optimized" forms incorporating ADM need prevalence parameters derived from the CSHS survey data described earlier in this series. In addition, two "experimental" models developed by this project are described and then evaluated in the same manner. Finally, some initial conclusions are drawn regarding each model's potential utility as a tool in the indirect needs-assessment process. The empirical validation results are integrated with the findings from the a priori review of model characteristics reported in the preceding article, and these combined findings are used to strengthen the case for needs assessment by means of one of the better models (two of the optimized models and the two experimental ones). These four promising models are then further evaluated in Article VI with respect to more complex ADM need categories - specifically, need groups defined using multiple criteria and often termed "target" populations in state ADM service plans and budget documents.

BASIC ASSUMPTIONS OF MODEL-BASED NEEDS ESTIMATION

All model-based needs-estimation techniques rely upon two basic assumptions that are independent of the structural form of any specific model. These are:

1. There is real (nonrandom) variation in the true prevalence of need for services across geographic areas;
2. This variation is related to the characteristics of the people inhabiting these areas and/or the conditions under which they live.

These assumptions can be evaluated empirically using the CSHS data. To the extent that they are supported, one can be optimistic about the predictive potential of such indirect, model-based needs-estimation strategies, regardless of the specific performance of any particular model(s).

Testing the Validity of the First Assumption
The two assumptions are analytically and empirically related. The first assumption - that true need levels will, in general, vary among areas - is critical. If valid, it is incompatible with a simple uniform- or "flat-rate" statistical model positing that the true prevalence of need for ADM services is essentially constant across areas. Evaluating the empirical validity of the flat-rate model is thus a strategic place to begin an assessment of model-based needs-estimation strategies. If the flat-rate model is consistent with surveyed need data, the assumption of real variation in need across areas comes into question. There would thus be little point to testing more complex models that depend upon geographic variation in true need for services.

An initial approach to evaluating the flat-rate model involves examining the variance of the estimated need prevalence rates across the 48 surveyed CSHS subareas. This observed between-areas variation in estimated prevalence rates has two components. The first component is attributable to whatever variation exists in the underlying true prevalence rates for the areas. Under the

TABLE 1
COMPARISON OF THE OBSERVED BETWEEN-AREAS VARIANCE OF PREVALENCE RATES TO THE VARIANCE EXPECTED FROM SAMPLING ERROR ONLY

Need Variable                   State-Wide         Observed Between-   Expected Between-   Observed/Expected
                                Prevalence Ratea   Areas Varianceb     Areas Variancec     Ratio
Diagnosable disorder, 1-month   16.34%             37.86%              16.36%              2.31
Dysfunction, 1-month            11.08              27.07               12.00               2.26
Demoralization, 1-week          11.05              24.48               11.90               2.06

aState-weighted rates (see Article II of this series, section on "Weighting"). Unweighted 48-area mean rates were 17.13%, 11.64%, and 11.41%, respectively.

bThe observed between-areas variance was calculated in the standard way as Σ_{i=1}^{n} (P_i - P̄)²/(n - 1), where P_i is the estimated prevalence rate for the ith subarea, P̄ is the unweighted average of the 48 subarea rates, and n = 48.

cThe expected between-areas variance was defined as (p(100 - p)/100) · d_eff, where p is the estimated prevalence rate (weighted, in percent) for the state as a whole, 100 is the expected sample size for each area, and d_eff is the design effect associated with the cluster-sampling survey design within subareas (see Article II). The design effect was estimated as the average ratio of the 48 subarea variances calculated by Taylor Series Linearization to the variances that would have been calculated for simple random sampling. These area-level design effects for diagnosis, dysfunction, and demoralization were 1.197, 1.218, and 1.211, respectively.

flat-rate assumption, of course, this component should be zero. The second component is due to sampling error that exists whether the flat-rate model is true or not. Hence, if the flat-rate model were correct and the true prevalence rate were constant across the 48 subareas, the between-areas variability displayed by the need rates would consist solely of unreliable statistical "noise." A first step, then, is to compare the observed between-areas variance to the amount of variance expected if sampling variability were the only factor operating. The results of this comparison are shown in Table 1. For each of the three primary need variables assessed in this research, the ratio of observed between-areas variation to expected random variation under a flat-rate, sampling-variability-only statistical model (adjusted for cluster-sampling "design effects") is substantially greater than 1.0. Indeed, for all three need variables assessed, there was more than twice as much between-areas variation as would be expected if sampling error were the only source of variation. Hence, the flat-rate sampling-variability-only model does not appear to be valid, thereby substantiating the first assumption.
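The Table 1 comparison can be reproduced in a few lines. The sketch below (Python, added for illustration) recomputes the expected between-areas variance as the design-effect-inflated binomial sampling variance of a percentage, p(100 - p)/100 · d_eff; the statewide rates, observed variances, and design effects are the Table 1 values, and the per-area sample size of 100 is the CSHS design figure.

```python
# Compare observed between-areas variance of subarea prevalence rates with
# the variance expected under a flat-rate model, where only (design-effect-
# inflated) sampling error operates. All inputs are taken from Table 1.

def expected_variance(p_pct, n_per_area, deff):
    """Binomial sampling variance of a percentage, inflated by the design effect."""
    return p_pct * (100.0 - p_pct) / n_per_area * deff

table1 = {
    # need measure: (statewide rate %, observed between-areas variance, design effect)
    "diagnosable disorder": (16.34, 37.86, 1.197),
    "dysfunction":          (11.08, 27.07, 1.218),
    "demoralization":       (11.05, 24.48, 1.211),
}

for measure, (p, observed, deff) in table1.items():
    expected = expected_variance(p, 100, deff)
    ratio = observed / expected
    print(f"{measure}: expected {expected:.2f}, observed/expected {ratio:.2f}")
```

Ratios near 1.0 would be consistent with the flat-rate model; the values above 2.0 reproduce the table's conclusion that real between-areas variation in need exists.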

A Test of Validity of Both Assumptions
The second assumption - that variation in area need is related to the characteristics of the people residing there and/or to their environment - can be evaluated in a way that also provides another test of the first. If the flat-rate model were tenable, the observed between-area variation would be statistical "noise" that would not correlate with other independently measured phenomena. However, if the subarea rates did correlate with epidemiologically meaningful factors, such as poverty, divorce rates, and unemployment, then the area variations in need are not random and the flat-rate model must be incorrect. Indeed, such correlations would not only warrant dismissal of the flat-rate model as both incorrect and inappropriate for mental health services planning (for at least this set of Colorado subareas), but would also demonstrate the potential of the model-based needs assessment endeavor regardless of whether or not that potential had yet been realized in any existing model.

Table 2 shows statistical data for the 48 Colorado subareas that are relevant to this question. Using data

TABLE 2
DISTRIBUTION OF VALUES FOR SELECTED HDPS SOCIAL INDICATORS ACROSS 48 COLORADO SUBAREAS

Social Indicator                               Average    Standard     Observed    Observed
                                               Value      Deviation    Minimum     Maximum
Median income                                  $17,756    $7,028       $5,970      $39,080
Population in poverty (%)                      12.9       10.3         1.7         49.7
Ethnic minority population (%)                 21.8       25.3         1.9         96.4
Low occupational status - males (%)            32.9       13.9         9.6         74.2
High educational status - completed
  4 years high school (%)                      75.6       14.3         30.3        95.7
Husband/wife households (%)                    57.2       17.3         11.4        87.8


drawn from the Health Demographic Profile System (HDPS) of 1980 decennial census variables, values are presented for several social indicators selected to represent conditions associated with the risk of mental health problems (Goldsmith et al., 1984). The table shows that the amount of variation among these Colorado subareas is considerable - median household income ranges from under $6,000 to over $39,000, the proportion of the population in poverty ranges from almost none to nearly half the residents, and the ethnic minority percentage ranges from under 2% to over 96%. Similar sizeable variations are apparent for the other three indicators listed. Given this diversity among subareas, it would be reasonable to anticipate a higher prevalence of mental health and substance abuse service needs in areas where the typical resident experienced considerable social and economic stress, compared to areas where most persons were far more personally and socially advantaged.

TABLE 3
CORRELATIONS BETWEEN SELECTED SOCIAL INDICATORS FROM THE HDPS SYSTEM AND PREVALENCE RATES FOR THREE ADM SERVICE NEED MEASURES ACROSS 48 COLORADO SUBAREAS

Social Indicator and HDPS Variable Number   Diagnosable   Everyday      Demoralization
                                            Disorder      Dysfunction
Median income (MNS00022)                    -.46          -.61          -.40
Persons below poverty (MNS00029)             .44           .32           .37
Minority population (MNS00008)               .53           .48           .26a
Low occupational status (MNS00064)           .50           .44           .40
High educational status (MNS001…)           -.26a         -.53          -.45
Husband/wife households (MNS00055)          -.65          -.54          -.41

aAll correlations except those superscripted are statistically significant at or below the p = .05 level.

Given the substantial variability in subarea social indicators observed across the 48 Colorado subareas, the correlations shown in Table 3 demonstrate clearly the existence of covariation between these indicators and the CSHS surveyed need prevalence rates for these areas. Sixteen of the 18 correlations are statistically significant and reflect moderately strong relationships. Further, the directionality of each relationship agrees with that expected from previous epidemiologic and social-area research findings. Thus the between-areas variance in need clearly does not behave like statistical noise, and on these grounds the flat-rate model again appears untenable. Equally important, these results strongly support the feasibility of indirect estimation of need using such social indicators. Of course, how successfully current needs-estimation models actually capitalize upon such correlations¹ and how well they can predict to surveyed need across Colorado subareas is another matter, as will be seen later in the presentation of modeling results.

¹In this report only linear correlations are examined. While the possible existence of nonlinear relationships is an important consideration, many such nonlinear relationships are rather well approximated by linear forms. Further, the power of a 48-case data set to discriminate subtle departures from linearity is relatively low. Testing for linear relationships in these data, therefore, is a reasonable starting point.

A FRAMEWORK FOR EMPIRICAL EVALUATION OF INDIRECT NEEDS-ASSESSMENT MODELS

Some Preliminary Considerations
Choice of Need "Standard". The empirical evaluation of a statistical model for estimating needs is an assessment of the degree to which the model's predictions fit the empirical prevalence data estimated by direct survey. Since the survey estimates become the "standard" against which the models are compared, the fit of a model could depend heavily upon the choice of that standard. A model developed or "calibrated"² to predict to one type or component of need for services may not fare well when its predictive performance is measured against another.
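The p = .05 threshold used for the Table 3 correlations can be checked directly. With 48 subareas, a correlation is tested against zero with a t statistic on 46 degrees of freedom; the sketch below (Python; the critical value t ≈ 2.013 is an assumption taken from a standard two-tailed t table) recovers the smallest significant |r|, which is why correlations of .26 fall short while values of .32 and above do not.

```python
import math

# Two-tailed test of r = 0 with n = 48: t = r * sqrt(n - 2) / sqrt(1 - r^2),
# df = n - 2 = 46. Inverting at the tabled critical t gives the minimum
# significant |r|.
n = 48
df = n - 2
t_crit = 2.013  # approximate two-tailed 5% critical value for df = 46 (from a t table)
r_crit = t_crit / math.sqrt(t_crit ** 2 + df)
print(round(r_crit, 3))  # prints 0.285
```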

In Article II of this series it was noted that diagnosable disorders, cases of everyday dysfunction, and cases of demoralization were largely independent criteria of need for ADM services. Their overall prevalence and between-areas variation in prevalence also differed. Hence, a model might predict quite well to one of these measures, but poorly to another. Further, if these three caseness measures are considered separate components of need for ADM services, various combinations of these could also be used as validating criteria for testing the predictor models. For example, if need for services were defined as caseness with respect to any of the components, average CSHS rates of need for ADM services would be nearly double that for the highest-prevalence single caseness measure, and the between-areas variance associated with these different validation criteria would differ dramatically as well. Thus it is highly likely that the fit of the model predictions with surveyed need will vary according to the specific criterion chosen as the "standard" of need; a single model usually cannot be all things to all possible need standards.

In Article VI, various composite indices of need for ADM services will be used to define key "target groups" for receipt of ADM services in planning efforts. Since all will be defined in terms of one or more of the three basic CSHS caseness measures (diagnosis, dysfunction, and demoralization), each model will be tested here for fit against these three basic ADM need components. Those models that perform well with two or all three basic measures should also tend to perform well against the multiple-component or composite need indices. Additional tests of the ability of several high-performing models to predict to various composite need "standards" will also be presented there.

²To "calibrate" a model, in this context, simply means to select values for the parameters included in the model. If, for example, a model specifies that need is a linear function of some indicator X, as in: Need rate = A + BX, with linear parameters A and B, then the calibration of the model simply involves choosing numerical values for the A and B parameters. This can be done in several ways, including "guesstimation" on the basis of epidemiologic research or statistical optimization against a given set of need data.

Measures of Model "Fit". To some extent, the choice of need standard also affects the assessments of empirical fit by different measures. Some measures of fit will be very sensitive to the specific need standard chosen, while others will be less sensitive. Yet both types of measures of fit play an important role in the model evaluation process, particularly since the interest of states in predicting to different aspects of ADM services need can vary enormously. The emphasis here, given the absence of a commonly accepted definition of need for ADM services, will be on both types of measures of fit. This should provide the most general assessment of the models' potential across alternative definitions of need for ADM services.

Criteria for Evaluation of Model Performance
In this research, model performance was evaluated in terms of three statistical procedures. Each of these involves an analysis of the errors in prediction that arise when the model is used with reference to a particular standard of need. Such errors provide critical information about the manner in which need predictions deviate from the direct survey-based ("observed") need estimates. These deviations are termed residuals and are defined as

    e_i = Y_i - Y'_i,    (1)

where e_i is the residual for the ith subarea under investigation, Y_i is the corresponding direct-survey need estimate, and Y'_i is the predicted need value (or indirect need estimate) generated by a model. The first statistic to be examined for each model was the simple mean of the residuals associated with any model, or average error. This is defined as

    ē = Σ_{i=1}^{n} e_i / n,    (2)

where n = 48, the number of areas in the CSHS study. This statistic is generally interpreted as a measure of the "bias" of the model. If the bias is positive (the direct-survey estimates tend to be higher than the model predictions), the model tends to underpredict the need rate in the subareas; if the bias is negative, then the model tends to overpredict the need rate. The ideal is a model whose average residual is zero - that is, an "unbiased" model.

The next statistic evaluated was the average absolute error, which can be understood as a summary measure of the "distance" of the predicted from the observed need estimates across the 48 subareas. It is an index of the average size of a model's prediction errors (ignoring sign) across subareas. This statistic is defined as

    |ē| = Σ_{i=1}^{n} |e_i| / n,    (3)

where |ē| is the average absolute error, |e_i| is the deviation of the predicted need value from the observed need value without regard to sign, and n = 48, the number of subareas studied. For any given choice of need standard, the model with the smaller average absolute error is the better-fitting model.

While each of the above statistics is important in evaluating the fit of any model against a given standard of need, they are particularly sensitive to the metric in which predicted need and surveyed need are defined. Hence, the same model will appear to perform quite differently against different need standards if the latter have different means and variances - even when the differing need standards are only scale transformations of the same basic need variable. Assume, for instance, that the rate of diagnosable disorders is chosen as the standard of need for ADM services. This rate can be expressed in terms of different bases (rate per 100 persons, per 1,000, per 10,000, and so forth). The need rate underlying these alternative expressions is the same, and the different expressions are exactly proportional to one another. Yet, applying the above statistics to evaluate a given model's estimates against these different expressions of the rate of diagnosable disorders would produce sharply different "pictures" of how well the model fits the data. Further, even if the metrics are carefully matched to avoid such problems, models calibrated against epidemiologic data in other states or regions may appear to perform poorly if there is a sizeable difference in the overall prevalence of diagnosable disorders between the previously studied region and that found in the areas under study.

The prime candidate for a measure of fit less sensitive to epidemiologic or scale differences in the need standard is the product-moment correlation between the predicted need values produced by the models and the direct survey-based need estimates. For example, the correlation between a model's prediction and the rate of diagnosable disorders would be the same regardless of whether the rate was expressed per 100 persons or per 10,000 persons. Accordingly, the correlation coefficient is an important tool for evaluating a model's performance against a number of alternative definitions or standards of need for ADM services. High correlation of a model's predictions with directly surveyed need rates indicates that the model has good potential as a device for indirect estimation. On the other hand, low correlations indicate there are problems with model accuracy that may not be amenable to any calibration adjustment. Moreover, this is true regardless of performance using other measures of fit. For example, if model predictions have a high correlation with surveyed need rates, but a large average absolute error, it indicates that the primary problem is one of model calibration; its parameters have probably been inaccurately specified and can be improved through optimization techniques. When the correlation is low, however, the outlook is less optimistic for improvement via parametric calibration, although the form of the model determines whether optimization might have correlational advantages as well.

The general principle adopted in this research is that a model which performs well by the correlational criterion across several need standards would be preferred to models doing well against just a single standard. Under the logic of convergent validity, a model that shows a high correlation with need for ADM services independent of how need was defined would appear to be the more valid and generalizable model.³ Given a more generalizable model, its improvement or optimization would then require (1) adopting one or more need standards for various applications, and (2) selecting optimal parameters for the model relative to those standards.

EVALUATING PROPOSED INDIRECT NEEDS-ASSESSMENT MODELS

The six models presented in Article IV represent different methodologies and variations in the social indicators employed to indirectly assess ADM need. Each is evaluated and compared on prediction accuracy to a flat-rate model using the preceding statistical techniques. The first model listed is the National Institute of Mental Health's (NIMH) Rank-by-Race model (Sobel, Rosen, & Goldsmith, 1978). It contains no prevalence-rate parameters and is thus capable of generating only relative ADM services need information for subareas. The Grosser model (1981) contains prevalence-rate parameters for several age categories, which are then modified by six social indicators combined into a single adjustment factor. Closely related to it is the Prevalence-Variability model (Ciarlo, 1981), a linear equation model using parameters based upon variations in surveyed prevalence of psychiatric symptomatology across small subareas of two Florida counties (Schwab, Bell, Warheit, & Schwab, 1979). This model uses the same predictors as the Grosser model, since it was developed as an alternative to the latter. The Yarvis/Edwards model (1980, 1984) is a five-indicator categorical model that assigns one of three fixed prevalence rates to an area on the basis of the sum of its rankings on the five social indicators. The Slem model is a two-indicator linear regression equation originally derived from studies of services utilization across eastern urban census tracts (Slem, 1975). The last model evaluated here is a Synthetic Estimation procedure proposed by Holzer, Jackson, and Tweed (1981). It employs cross-tabulations of four categorical individual-level demographic variables as predictors of need in a series of subareas, using previous epidemiologic data on the need prevalence rate in each subcategory to generate estimates in the new areas. All the models evaluated here are described and critiqued in Article IV of this series.

The evaluations of model performance involved two steps. First, the three evaluation procedures described above were applied to each of the models using their original parameter values. Each set of original model predictions for 48 Colorado subareas was compared with survey "caseness" rates for diagnosable disorders, everyday dysfunction, and demoralization in the same areas. Next, using appropriate statistical techniques, the parameters for each model were "optimized" - that is, revised to minimize prediction errors with respect to each of the three primary CSHS indices of ADM need. The optimized models were then again compared to surveyed need measures and to each other.
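The synthetic-estimation logic just described can be sketched in a few lines. In the hypothetical example below (Python; the subgroups, their prevalence rates, and the area composition are all invented for illustration, whereas the actual procedure cross-tabulates four demographic variables with nationally derived rates), an area's indirect estimate is simply the composition-weighted sum of subgroup rates.

```python
# Minimal sketch of synthetic estimation: previously published prevalence
# rates for demographic subgroups are weighted by each area's demographic
# composition to produce an indirect area-level need estimate.

subgroup_prevalence = {  # hypothetical % in need, by (sex, income) subgroup
    ("male", "low_income"): 18.0,
    ("male", "high_income"): 9.0,
    ("female", "low_income"): 21.0,
    ("female", "high_income"): 11.0,
}

def synthetic_estimate(area_composition):
    """Weighted sum of subgroup rates; composition shares must sum to 1."""
    return sum(subgroup_prevalence[group] * share
               for group, share in area_composition.items())

# A hypothetical low-income area: mostly low-income residents of both sexes.
poor_area = {("male", "low_income"): 0.35, ("male", "high_income"): 0.15,
             ("female", "low_income"): 0.38, ("female", "high_income"): 0.12}
print(round(synthetic_estimate(poor_area), 2))  # prints 16.95
```

The estimate varies across areas only through demographic composition, which is why the quality of the borrowed subgroup rates (here, the HANES-II/CES-D parameters) governs the model's calibration.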

³The three basic ADM need measures were intercorrelated more highly at the subarea level (rs = .72, .56, and .68) than they were at the individual level (see the correlations presented in Article II). There, the relative individual-level independence of these caseness measures (even within outpatient and inpatient groups) indicated that each of the three was important in defining "need for ADM services." Here, the strong area-level intercorrelations of the three caseness prevalence rates also suggest that any epidemiologic variable that is associated with one will likely be associated with the others, even though they are assessing different symptom clusters or problems. Hence, models that predict well to one area-level need prevalence measure may also predict at least to some extent to the others.
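The three evaluation statistics - the residuals of equation (1), the bias of equation (2), and the average absolute error of equation (3), together with the product-moment correlation - can be sketched as follows (Python; the five surveyed and predicted rates are invented for illustration, whereas the actual evaluations used the 48 CSHS subareas).

```python
import math

# Hypothetical surveyed (Y) and model-predicted (Y') prevalence rates, in %.
surveyed  = [14.2, 18.5, 9.8, 22.1, 16.0]
predicted = [10.0, 12.4, 8.1, 15.3, 11.9]

residuals = [y - yp for y, yp in zip(surveyed, predicted)]      # eq. (1)
bias = sum(residuals) / len(residuals)                          # eq. (2)
abs_error = sum(abs(e) for e in residuals) / len(residuals)     # eq. (3)

def pearson_r(xs, ys):
    """Product-moment correlation; unlike bias and absolute error,
    it is unchanged by rescaling either variable."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sy = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson_r(predicted, surveyed)
# Positive bias: the model underpredicts surveyed need on average.
print(f"bias={bias:.2f}, avg |error|={abs_error:.2f}, r={r:.3f}")
```

Re-expressing the surveyed rates per 1,000 rather than per 100 leaves r unchanged while the bias and absolute-error figures change dramatically, which is the metric sensitivity discussed above.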


Model "Fit" Using Original Parameters
Table 4 shows the results of the statistical procedures used to evaluate the "fit" of each original model vis-à-vis the three primary need variables. The data are presented in three panels, the first assessing the statistical bias of the different models. The figures show that all the models missed the mark considerably in predicting the rate of diagnosable disorder across the 48 subareas; the Yarvis/Edwards procedure substantially overestimated this prevalence rate (about 8% on average), and all the others understated it, from 7% on average for the Prevalence-Variability model to nearly 15% for the Slem regression model. The latter finding was not surprising, since the Slem model had been calibrated to predict to services utilization data across census tracts, which usually averages only a few percent of area residents. Most models performed better in predicting everyday dysfunction and demoralization; however, even an assumed flat prevalence rate of 10% (commonly used as an overall mental health needs estimate by states) was about as accurate as the best-performing model here.

Data in the next panel of Table 4 show the average absolute deviations of model predictions from surveyed values. These figures are sensitive to multiple factors, including bias, a constricted range of model predictions relative to the criterion, and the basic correlation (that is, epidemiologic relationship) of model predictors with surveyed need. The larger prediction errors again occurred for the Yarvis/Edwards and the Slem models. Note that even the best-performing models did not usually perform better than a 10% flat-rate assumption; the only exception was the Synthetic Estimation model with respect to diagnosable disorder. As will be seen below, however, the major problem detected by both the bias and absolute-error tests appears to be miscalibration of basically sound models that show considerable potential for indirect needs estimation.

The data in the third panel of Table 4 are most important since they are least sensitive to errors of mis-

TABLE 4
PREDICTIVE PERFORMANCE OF SIX INDIRECT NEEDS-ASSESSMENT MODELS (ORIGINAL PARAMETERS) AND A FLAT-RATE MODEL VERSUS THREE BASIC CSHS NEED MEASURES (N = 48)

                                        Diagnosable   Everyday      Demoralization
Model                                   Disorder      Dysfunction

Bias (difference between surveyed and predicted mean prevalence rates of 17.13%, 11.64%, and 11.41%, respectively)
NIMH Rank-by-Race                       --a           --a           --a
Grosser                                 9.30%         3.81%         -3.58%
Prevalence-Variability                  7.00          1.51          1.28
Yarvis/Edwards                          -8.15         -13.64        -13.87
Slem                                    14.90         9.41          9.19
Synthetic Estimation/HANES-II CES-D     2.24b         3.24b         -3.47
10% flat-rate assumption                7.13          1.64          1.41

Average absolute deviations (from CSHS-estimated rates)
NIMH Rank-by-Race                       --a           --a           --a
Grosser                                 9.53%         4.69%         4.49%
Prevalence-Variability                  7.84          4.18          4.49
Yarvis/Edwards                          9.11          13.64         13.96
Slem                                    14.90         9.45          9.19
Synthetic Estimation/HANES-II CES-D     4.54b         4.58b         4.75
10% flat-rate assumption                7.78          4.22          3.84

Product-moment correlation with need measure
NIMH Rank-by-Race                       .48           .64c          .40
Grosser                                 .52           .61c          .40
Prevalence-Variability                  .50           .60c          .44
Yarvis/Edwards                          .28           .41c          .38
Slem                                    .64           .58c          .46
Synthetic Estimation/HANES-II CES-D     .56b          .51b,c        .42
10% flat-rate assumption                n/a           n/a           n/a

aNot available; model provides only subarea rank ordering for ADM service needs rather than quantitative prevalence estimates.

bTechnically, the HANES-II survey-based CES-D parameters may not be appropriate for modeling diagnosis and everyday dysfunction. National survey-based parameters for modeling these two ADM need measures were not available for this study.

cCorrelations of model predictions with the alternate CSHS measure of everyday dysfunction (above the 90th percentile of the general population sample on two or more domain dysfunction scales - see Article III) were very similar (.64, .67, .66, .46, .49, and .45, from column top downward).

DAN L. TWEED et al.

188

calibration and can most clearly depict the underlying epidemiologic relationships between model social indicators and the surveyed rates of need for ADM services. Here predictive performance was favorable for all the models, from the best-performing (Slem regression) to the worst-performing (Yarvis/Edwards); all predicted the surveyed need prevalence rates at better-than-chance levels. The correlations for diagnosable disorder ranged from .64 for the Slem model to .28 for the Yarvis/Edwards model. For everyday dysfunction the correlations were generally somewhat stronger, ranging from .64 for the NIMH Rank-by-Race model to .41 for Yarvis/Edwards. In contrast, however, the correlations for demoralization were lower and fell into a narrow range, from .46 for the Slem model to .38 for the Yarvis/Edwards model.4

At this point, if one had to choose a model to work with in planning a state's services with respect to geographic subarea differences, one would face a difficult choice because the models differ in performance according to the particular evaluation technique involved. The Slem model correlated quite well with all three need measures, but showed the poorest performance on the bias and absolute-error tests. The Grosser and Prevalence-Variability models also did quite well in correlational performance. Overall, however, model performances in terms of bias and average-error statistics were poor. Fortunately, this dilemma can be eased by recalibrating model parameters against the Colorado survey data in order to eliminate the bias and to optimize their predictive potential.

Model "Fit" Using Optimized Parameters
For each indirect needs-assessment model, an attempt was made to select optimum parameters with respect to each of the three main need measures. For example, the original Grosser model became three separate models, one each for diagnosable disorder, dysfunction, and demoralization.5 The "fit" statistics discussed below are for the optimum algebraic model for each specific measure of need for ADM services, without changing any of the model's component social indicators or its functional form.
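Each panel of Tables 4 and 5 rests on one of three fit statistics: bias (surveyed minus predicted mean prevalence), the average absolute deviation, and the product-moment correlation. A minimal sketch of these computations follows; the subarea rates are invented for illustration and are not CSHS values.

```python
# Sketch of the three fit statistics reported in Tables 4 and 5: bias
# (surveyed minus predicted mean prevalence), average absolute deviation,
# and the product-moment (Pearson) correlation across subareas.
# All subarea rates below are illustrative, not CSHS values.
import math

def fit_statistics(predicted, surveyed):
    """Return (bias, average absolute deviation, Pearson r) across subareas."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_s = sum(surveyed) / n
    bias = mean_s - mean_p
    avg_abs_dev = sum(abs(s - p) for s, p in zip(surveyed, predicted)) / n
    cov = sum((p - mean_p) * (s - mean_s) for p, s in zip(predicted, surveyed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_s = sum((s - mean_s) ** 2 for s in surveyed)
    r = cov / math.sqrt(var_p * var_s)
    return bias, avg_abs_dev, r

# Five hypothetical subareas: model predictions vs. surveyed prevalence (%)
predicted = [4.0, 5.5, 3.2, 6.1, 4.8]
surveyed = [15.0, 21.5, 12.2, 24.0, 18.3]
bias, aad, r = fit_statistics(predicted, surveyed)
print(f"bias = {bias:.2f}%, avg abs deviation = {aad:.2f}%, r = {r:.2f}")
```

Note how a model can be badly biased (here every prediction falls well below the surveyed rate) while still correlating strongly with the criterion, which is exactly the pattern the original Slem model showed.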

Optimization procedures. The nature of the optimization procedure varied by model. For the Grosser, Prevalence-Variability, and Slem models (all of which are essentially linear regression models), optimal parameters were selected using ordinary least-squares regression techniques. For the Grosser model, the age-specific prevalence-rate assumptions in the model were first replaced with similar estimates derived from the CSHS data set. The Z-score scaling factor was then chosen by regressing the dependent need variable being modeled on the product of the Z-score and the initial rate, constraining the intercept to zero to allow the initial assumptions to set subarea rates. This process was repeated for each dependent need measure vis-a-vis the Grosser model. Similarly, each dependent measure was regressed on the composite Z-score variable of the Prevalence-Variability model. The resultant intercepts and slope parameters were adopted as the new parameters for these model equations. For the Slem model, each dependent need measure was regressed on the two independent variables and the resulting parameters adopted.

4One must consider here the issue of sampling variability among these correlations; most of them would not be significantly different from one another by statistical test. The chief exception is the difference between the Slem and Yarvis/Edwards models for the diagnosable disorders variable, which is statistically significant.
5The optimal calibration of a particular model (that is, the choice of parameter values that optimize the fit of the model with respect to the CSHS direct-survey need estimates) must vary according to the standard of need for ADM services that is used. Thus, until a widely accepted, single definition of need for ADM services is available, one can only optimize a model's parameters with regard to some particular measure of need or need component.

For the Synthetic Estimation CES-D model, the procedure involved replacing the set of HANES-II CES-D caseness rates for the age/sex/ethnic group/marital status categories with the corresponding rates from the entire CSHS sample. These new figures were used to derive alternative predictions for each of the 48 subareas. Similarly, synthetic estimates for diagnosable disorder and dysfunction prevalence rates were calculated, since CSHS-based epidemiologic data for these other need components were available.

Optimizing the categorical Yarvis/Edwards model was more challenging. Ultimately, the decision was made that a reasonable technique would be to replace the three a priori categorical need rates with values more consistent with the CSHS data.
The 48 areas were divided into three groupings on each of the basic ADM need variables: the 10 highest-ranking areas, the 10 lowest-ranking areas, and the 28 remaining areas, giving consideration to the shape of the observed need-measure distributions across the 48 areas in selecting the numbers of subareas falling into the different groups. The mean need caseness rates were calculated within each of these groups, and these rates were adopted as the new predicted values for the high-, intermediate-, and low-need categories of the Yarvis/Edwards model. The NIMH Rank-by-Race model could not be optimized in the same sense as the other models, since it contains no prevalence-rate parameters and thus cannot be optimized with respect to bias or average absolute prediction errors.
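The grouping-and-averaging recalibration of the Yarvis/Edwards model can be sketched in a few lines; the surveyed rates below are simulated for illustration and are not the CSHS values.

```python
# Sketch of the Yarvis/Edwards recalibration described above: rank the 48
# subareas on a surveyed need measure, split them into high-need (top 10),
# intermediate (middle 28), and low-need (bottom 10) groups, and adopt each
# group's mean surveyed rate as its predicted category rate.
# The surveyed rates are simulated for illustration, not CSHS values.
import random

random.seed(0)
surveyed = [random.uniform(8.0, 25.0) for _ in range(48)]  # prevalence (%)

order = sorted(range(48), key=lambda i: surveyed[i], reverse=True)
groups = {"high": order[:10], "intermediate": order[10:38], "low": order[38:]}

# Mean surveyed rate within each group becomes that category's prediction
category_rate = {label: sum(surveyed[i] for i in ids) / len(ids)
                 for label, ids in groups.items()}

# Every subarea is assigned its group's mean rate as its predicted value
predicted = [next(category_rate[g] for g, ids in groups.items() if i in ids)
             for i in range(48)]
print(category_rate)
```

A side effect visible in this sketch is why the recalibrated model is unbiased by construction (the mean of the group-mean predictions equals the overall surveyed mean) yet still loses discrimination, since all 28 intermediate areas receive the same predicted rate.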

Results of Model Optimization. The results of the model-optimization process are shown in Table 5, with the structure of the table paralleling that of the original model evaluations in Table 4. As expected, essentially all of the bias for each model disappeared with the new parameters; none of the differences in observed and

Validity of Needs Assessment Models

189

TABLE 5
STATISTICAL PERFORMANCE OF SIX INDIRECT NEEDS-ASSESSMENT MODELS (OPTIMIZED PARAMETERS) VERSUS THREE BASIC CSHS NEED MEASURES (N = 48)

                                                       Need Measure
Statistical Performance Index / Model      Diagnosable    Everyday       Demoralization
                                           Disorder       Dysfunction

Bias (difference between surveyed and predicted mean prevalence rates)
  NIMH Rank-by-Race                          --a            --a            --a
  Grosser                                   0.74%          0.58%          0.32%
  Prevalence-Variability                     .00            .00            .00
  Yarvis/Edwards                             .03            .06           -.06
  Slem                                       .00            .00            .00
  Synthetic Estimation/CSHS                  .43            .20           -.02
  Optimal flat-rate assumption               .00            .00            .00

Average absolute deviations (from CSHS-estimated rates)
  NIMH Rank-by-Race                          --a            --a            --a
  Grosser                                   4.22%          3.23%          3.32%
  Prevalence-Variability                    4.15           3.27           3.28
  Yarvis/Edwards                            5.30           4.03           3.74
  Slem                                      3.67           3.07           2.91
  Synthetic Estimation/CSHS                 4.33           3.27           3.34
  Optimal flat-rate assumption              4.96           4.07           3.87

Product-moment correlation with need measurec
  NIMH Rank-by-Race                          --b            --b            --b
  Grosser                                    .49            .61            .44
  Prevalence-Variability                     .50            .60            .44
  Yarvis/Edwards                             .28            .41            .38
  Slem                                       .65            .62            .59
  Synthetic Estimation/CSHS                  .56            .61            .48
  Optimal flat-rate assumption               n/a            n/a            n/a

aNot available; model provides only subarea rank ordering for ADM service needs, rather than generating quantitative prevalence estimates.
bCorrelations for this nonoptimized model not shown.
cCorrelations for the alternate CSHS measure of dysfunction were .68, .66, .46, .56, and .55 for the Grosser through Synthetic Estimation models, respectively.

predicted means exceeded 0.75%. More importantly, the average absolute errors were also cut sharply with optimization, even though they were still well above zero. The new errors ranged in size from about 3.7% to 5.3% for diagnosis, from about 3.1% to 4.0% for dysfunction, and from 2.9% to 3.7% for demoralization. While these error sizes indicate that the models' predictions were still far from perfect, almost all are smaller than the absolute-error figures for optimum flat-rate models which used the CSHS sample means as the flat rate for each need variable. The best-performing model (Slem) showed reductions in average prediction error of about 26% for diagnosis, 25% for dysfunction, and 25% for demoralization over those for optimum flat-rate assumptions.

Perhaps the most interesting result of optimization was the improvement in performance for the Slem model, particularly with respect to demoralization. Optimized versions of this model correlated between .59 (demoralization) and .65 (diagnosis) with the need measures; it was the best all-around performer in the group. Also of interest was the increase in predictive accuracy for the Synthetic Estimation model. Its correlation with CES-D-based demoralization improved to .48 when modified to incorporate CSHS parameters, and its correlations with diagnosis and dysfunction increased to .56 and .61, respectively. The relative status of the other models changed only slightly or not at all, since they are mathematically single-variable functional forms that offer no opportunity for adjustment of variable weights.6 No comparative correlations of the optimal flat-rate models with surveyed need are available, since the zero variation in the "predicted" rate precludes calculation of this statistic.

A striking visual example of how optimization impacts the fit of a model to a given criterion measure is

6The Prevalence-Variability model, for example, is a linear function of one independent variable; all possible linear expressions of that one variable will share a common correlation with any third variable. Hence, if new parameters are chosen to optimize its performance, both new and old models will have exactly the same correlation with any given need standard. The Slem model, however, involves two independent variables; thus it may be possible to find alternative parameter values that would correlate more highly with a given need standard.


Figure 1. Predicted versus observed need in 48 Colorado subareas. Slem original regression model estimates with surveyed diagnosable disorders.

shown in the contrast between Figures 1 and 2, depicting scatter diagrams of Slem model predictions with diagnosable disorder before and after parameter optimization. Note that the dashed lines along the diagonals of the plots indicate the points at which the predicted and the observed values would be precisely equal. All points below the line represent underpredictions and all points above the line represent overpredictions. In Figure 1, the original Slem model predictions fall far below the diagonal and appear as almost a flat line, indicating very poor prediction of observed need for ADM services. It is nearly impossible to detect visually any correlation between the model predictions and diagnosis rates, even though this model had the highest correlation with diagnosis. After optimization, however, as shown by the scatter diagram in Figure 2, the prediction-versus-observation points fall impressively along the reference diagonal.

Optimization results in differential parameter values for the model predictor variables according to the need criterion or standard being used.7 For example, the optimized weights of the two predictors for diagnosable disorder are roughly .7 and .1, while for everyday dysfunction they shift to .8 and .01. This change reduces the influence of the second predictor to nearly zero, and also represents a sharp shift in the relative predictor weights (the ratio of the two weights increases more than 100 times from that for the diagnosis model). These regression-based parameter changes allow maximization of the predictive potential of the model with respect to each of the three need variables.

7Parameters for the original Slem model are: Constant (B0), 0.249; One-Person Households (B1), .038; and Divorced or Separated Males (B2), .115. For the optimized model predicting diagnosable disorder, these parameters change to 7.553, .114, and .738, respectively. For the model predicting everyday dysfunction, they become 3.991, .009, and .803, respectively. Finally, for that predicting demoralization, they become 5.187, -.096, and .921, respectively.

Initial Conclusions Regarding Optimized Models
Most of the discrepancies between average predicted need rates and surveyed prevalence for the different models were removed through model optimization. Each optimized model, therefore, is essentially unbiased, and most models also have average absolute errors smaller than those for even optimum flat-rate models. Note that the average absolute errors vary inversely with the correlations between predicted and observed subarea prevalence rates: the higher the correlation, the lower the average prediction error, as would be expected.

In the preceding article in this series, analysis of model characteristics showed that every indirect needs-assessment model under consideration had some a priori problems with respect to its functional form, component indicators, compatibility with previous epidemiological research findings, or ease of implementation. Given the empirical evaluation results described in this article, it is possible to combine both the analytic and the empirical findings into some initial conclusions regarding the attractiveness of specific models for implementation by states or regions. These will be modified somewhat in Article VI on the basis of performance

Figure 2. Predicted versus observed need in 48 Colorado subareas. Slem optimized regression model estimates with surveyed diagnosable disorders.

data vis-a-vis multiple-component "target groups" important to state services planning efforts. At this point, however, some judgments on the specific models evaluated seem warranted; they are given below, in ascending order of overall analytic and empirical appeal.

The NIMH Rank-by-Race Model. Despite a reasonably solid set of correlations of the model's need rankings for subareas with the three measures of surveyed need in the same areas, there is very little to recommend widespread use of this model in its present form. It cannot provide quantitative ADM need prevalence estimates for planning services in subareas, and no prediction-error estimates are available for comparing its performance to that of other models. In addition, its correlational performance is equalled or surpassed by other models which do provide the advantage of numerical need prevalence estimates. After the functional form of this eight-indicator ranking model was experimentally revised to a simple linear-regression format, its correlations with the three main need measures increased to .69, .60, and .70, respectively. However, several predictors in this model have nonsignificant parameters, and others do not predict in the direction expected from prior epidemiologic research. This model must await further conceptual and empirical work; as it stands, it is not recommended.

The Yarvis/Edwards Three-Category Model. The Yarvis/Edwards model, while developed from a basically sound logical and empirical-research framework, appears to be too crude in its current format to be of much use. Contributing to the original model's relatively large prediction errors and lower-than-average correlations with the three CSHS need measures was the fact that only 16 of the 48 subareas were categorized as other than "intermediate-need" areas, thus losing substantial discrimination among the areas by the five social indicators comprising the model. While this was improved somewhat in the optimized version, to 10 high-, 28 intermediate-, and 10 low-need subareas, the loss of information inherent in the categorizing system still caused its correlational performance to be the poorest among the six models tested. The lack of continuous, rather than categorical, prevalence-determining parameters is partially offset by the empirical percentile-scoring procedure employed in determining the sum-of-ranks score that is then used to assign one of the three prevalence-rate categories to a subarea. However, the problem is that ranks, like Z-scores, are normed to particular distributions, and what may be the 75th percentile in one application could be the 25th in another.

The Grosser and Prevalence-Variability Models. Empirically, both the optimized Grosser and Prevalence-Variability models performed reasonably well with respect to all three basic measures of need for ADM services, particularly everyday dysfunction. Their nearly identical performances also suggest that the added complexity of the Grosser model (specifically, the age-group adjustment factor) adds little in the way of predictive power. Hence when optimized, the Prevalence-


Variability model would be preferred for its greater simplicity. Yet the performance of these two models was not sufficiently impressive to warrant overlooking their serious a priori shortcomings described in Article IV (primarily the standardization of the social indicators comprising the models). The Grosser model, while very promising in its conceptual approach to the problem of need estimation, would require structural modification before it could be recommended for widespread use. Its current drawback is that it fixes the variance of its predictions according to the values of its parametric assumptions rather than according to the variance of the social indicators used to drive the model. The same conclusion can be drawn concerning the Prevalence-Variability model, since it shares this drawback.

The Synthetic Estimation Model. This model was one of the less problematic in the analytic review of models presented in Article IV and is particularly promising because of its potential for using the individually based diagnostic data provided by the five-site NIMH Epidemiologic Catchment Area (ECA) surveys (Regier, Myers, Robins, Hough, & Locke, 1984). This model, linked with ECA-based diagnosis prevalence parameters, has been used to model need rates for all Texas counties (Holzer, Swanson, Ganju, Goldsmith, & Jackson, 1989), and other states are experimenting with the technique. Importantly, the CSHS data set will provide an independent data base for cross-validating the ECA-based models in the near future. In the current study, however, only the vertical synthetic estimation model using CES-D parameters derived from the national Health and Nutrition Examination Survey (HANES-II) was tested against the three ADM need measures. This model's estimates matched the surveyed means of the three need variables more closely than those of most other original-parameter models, and after optimization vis-a-vis CSHS state-level parameters its correlations with surveyed rates were second only to those of the Slem model. On the negative side, this model is fairly complex and cumbersome to calculate (it contains 72 variable-plus-parameter pairs), and it is currently restricted to using only the most basic demographic predictors because of the requirement for census-based cross-tabulations of all variables in the subareas where need is to be estimated. Accordingly, there seems to be little justification for incurring this extra implementation effort when the optimized version does not match surveyed need any better than the simpler Slem model.
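The vertical synthetic-estimation logic behind this model can be sketched in a few lines: a subarea's predicted prevalence is the population-weighted average of caseness rates for demographic cells. The cells, rates, and counts below are invented for illustration (the actual model crosses age, sex, ethnic group, and marital status into its 72 variable-plus-parameter pairs).

```python
# Sketch of vertical synthetic estimation: a subarea's predicted prevalence
# is the population-weighted average of caseness rates for demographic cells.
# This sketch uses an age-by-sex table only; all rates and counts are
# invented for illustration, not HANES-II or CSHS values.

cell_rates = {               # caseness proportion per demographic cell
    ("18-44", "F"): 0.14,
    ("18-44", "M"): 0.10,
    ("45+", "F"): 0.09,
    ("45+", "M"): 0.07,
}
subarea_counts = {           # one subarea's census counts per cell
    ("18-44", "F"): 3200,
    ("18-44", "M"): 3000,
    ("45+", "F"): 2100,
    ("45+", "M"): 1900,
}

total = sum(subarea_counts.values())
synthetic_rate = sum(cell_rates[c] * n for c, n in subarea_counts.items()) / total
print(f"synthetic prevalence estimate: {synthetic_rate:.1%}")
```

Recalibration of this model, as described earlier, amounts to swapping the `cell_rates` table (originally from HANES-II) for the corresponding rates computed from the CSHS sample, leaving the subarea census counts unchanged.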


The Slem Model. On an a priori basis, the Slem linear-regression model appeared to be very attractive in terms of simplicity and ease of implementation, assuming that its initial calibration for predicting services utilization rather than need could be corrected. It also proved to be the best-performing model in the empirical evaluation vis-a-vis surveyed need rates. As noted, its original parameters generated need estimates extremely wide of the mark because of its original calibration against ADM services-utilization rates instead of ADM need prevalence rates. Nonetheless, its potential was already apparent in the high correlations of these estimates with surveyed need; they were either the highest or close to the highest obtained for any model. Optimization against CSHS parameters further improved performance with respect to both correlations and absolute prediction errors, as described in detail in the previous section. Of the existing models tested, the Slem model appears to be the most accurate.

Development and Performance of Two Experimental Models
While the primary objective of this research was to evaluate already-proposed indirect needs-assessment models, another was to utilize the CSHS data set to explore and develop new models that would have even greater predictive potential than those now available for use. This tack was particularly appealing since the correlations obtained between the CSHS data and key social indicators (poverty, divorce/separation, minority status, and so forth) were both sizeable and in the directions anticipated from the epidemiologic literature. This provided considerable assurance that use of the survey data to develop new models would not merely capitalize upon chance associations between ADM need measures and important social indicators. Indeed, many of these key indicators had already been incorporated into one or more of the proposed models evaluated here. Accordingly, new models were sought that combined selected indicators from the better-performing original models, thus avoiding a haphazard search for whichever indicators might produce the highest correlations with surveyed need. Such new models would ideally have at a minimum the following features: (1) they would require relatively few social indicators; (2) they would produce need estimates whose mean and variance were a function of the distributions of the subarea social indicators themselves; and (3) they would have a functional form that ensured appropriate need estimates regardless of possible extreme values of the indicators used in the model. The simplest type of model with this last property is the logistic regression model, which states that the odds of ADM need caseness for any area can be estimated by the following equation:

Odds_i = Exp(B0 + B1*X1i + B2*X2i + ... + Bk*Xki),

where Odds_i are the estimated caseness odds for subarea i given the values of the social indicators chosen, Exp( ) denotes the natural exponential function, and X1i through Xki are the selected social indicators. B0 through Bk are the basic parameters of the equation.

The estimated rate of caseness for the ith area can then be determined as:

Rate_i = Odds_i / (Odds_i + 1).

The use of the logistic regression format represents a methodological advance over the linear-regression procedure in that predicted need prevalence rates for Colorado or other states, regardless of the specific social-indicator values involved, will fall within the range of 0% to 100% of the subarea population. With linear models, on the other hand, extreme social-indicator values can produce nonmeaningful need estimates outside this range if the original equations have been derived using a narrower range of values than those found in the state to which the models are being applied.

In selecting indicators for the model, it was decided to restrict the model to only two variables. Previous analyses had suggested that more than two variables were probably unnecessary, given the limit imposed by the number of subareas in the data set (48). This number was unlikely to be sufficient for evaluating the significance of relatively small regression effects when more than two variables were in the equation. Accordingly, all permissible two-variable linear regression models were examined for each of the three basic ADM need measures from the CSHS. The results indicated that two variables, the percentage of the population in poverty (HDPS MNS00029) and the percentage of divorced males (HDPS MNS00086), consistently performed well. That these indicators also reflect two fundamental social-area dimensions ("Socioeconomic Level or Rank" and "Household Composition") suggested good content validity and enhanced their value as links between social-area characteristics and epidemiologic findings. Finally, optimal values for each of these parameters were selected using the SAS NONLIN procedure.

Table 6 shows average absolute prediction errors and correlations for this model (termed the Denver University Logistic Regression model) with respect to each of the three ADM need measures. The data show that this model performs quite comparably to the Slem model; it generates very slightly larger average absolute prediction errors (3.79%, 3.12%, and 2.92% for diagnosis, dysfunction, and demoralization, respectively), and very similar correlations (r = .65, .63, and .56 with the three measures, respectively). Its advantages over the Slem model are that (1) the logistic model appears to have better face and content validity (it contains the well-known poverty variable), and (2) the mathematical form of the model ensures that its predictions will be within the permissible range when used with social-indicator values outside the ranges found in this sample of 48 Colorado subareas. Even more important, however, was the fact that the new model not only performed about as well as the best of the optimized models evaluated, but outperformed all others vis-a-vis combinations of the three basic variables as well as important subcategories such as "severe" diagnoses and "chronic mental illness." These additional ADM need categories are perhaps even more important criteria for model validation than the three basic need measures, as will be discussed in Article VI of this series.

Table 6 also shows that the simpler linear-regression version of the D.U. model performs equally well with everyday dysfunction, or even a bit better, than the logistic version. Good performance, of course, would be expected given the use of linear regression to sift the better models from the various possibilities. This alternative to the logistic model is also presented in Article VI. Like other models developed from only a single data set, these new models ultimately must be cross-validated on other data sets similar to the CSHS. As described above, however, the fact that the social indicators employed were drawn from already-proposed models and are supported by existing epidemiologic research makes it unlikely that their performance here reflects solely capitalization upon chance associations that would not be found in replication studies.

Additional Considerations Regarding Model Utility
Analytic strengths and empirical validity are two major characteristics of a model that would make it potentially

TABLE 6
STATISTICAL PERFORMANCE OF TWO EXPERIMENTAL INDIRECT NEEDS-ASSESSMENT MODELS VERSUS THREE BASIC CSHS NEED MEASURES (N = 48)

                                                       Need Measure
Statistical Performance Index / Model      Diagnosable    Everyday       Demoralization
                                           Disorder       Dysfunction

Average absolute deviations from surveyed rates (%)
  D.U. logistic regression                   3.79           3.12           2.92
  D.U. linear regression                     3.73           2.95           2.91

Product-moment correlation with need measure
  D.U. logistic regression                    .65            .63            .56
  D.U. linear regression                      .66            .66            .56

very useful to a state considering implementation of one or more indirect needs-assessment models. These two issues have been addressed in this and the preceding article (IV). A number of other issues would also impact on a needs-assessment model implementation decision, such as a state's ADM service priorities, the predictability of designated "target groups" of needy persons, the confidence legislators may have in scientific and technical data of this type, and the commitment state leaders have with respect to allocating services as fairly and equitably as possible. These issues are addressed at some length in the following article (VI), including quantifying and predicting to high-priority state ADM "target" groups with the CSHS data, and presentation of final recommendations for model selection and implementation for services planning purposes.

SUMMARY AND CONCLUSIONS

In this study, six statistical models for indirectly estimating the need for ADM services have been empirically evaluated and compared both to each other and to a "default" flat-rate assumption of no differences in need prevalence across a state's subareas. The results have enormous implications for the indirect needs-estimation enterprise as a whole, both in terms of its testable, model-independent assumptions and in terms of the performance of specific models. First, the findings clearly suggest that the flat- or uniform-rate model is an inappropriate assumption for estimating the need for ADM services in smaller substate localities. The basic assumptions of indirect needs estimation (the existence of substantial geographic variations in need and their linkage to epidemiologic or "social-indicator" variables) have been corroborated via direct survey of the ADM service needs of a large sample of the population of an entire Western state.

Second, two previously proposed models have been found to be valid, useful, and recommendable statistical tools for conducting indirect ADM needs assessment at the state or regional levels. After optimization vis-a-vis the CSHS survey data, the simplest and most accurate of the original models selected for study was a two-variable linear regression model developed by Slem. Another very worthwhile model is the Synthetic Estimation procedure, originally developed outside the field of mental health but now refined and articulated by a number of ADM services researchers with support from the National Institute of Mental Health. Either of these two needs-assessment procedures appears to merit implementation by state ADM service authorities wishing to provide increasingly scarce service-system resources to areas within their jurisdictions where they are most needed.

Third, two new models developed by the Denver University research team (logistic regression and linear regression versions of a two-indicator model) also appear to be analytically strong and empirically valid. Additional performance data on these models described in Article VI of this series will show that they merit first consideration by state ADM services leaders and personnel charged with the responsibility of allocating public resources and funds to meet the ADM needs of their populations.

REFERENCES

CIARLO, J.A. (1981). Statistical needs assessment models for Colorado: A working paper (Unpublished technical report). Denver, CO: University of Denver, Mental Health Systems Evaluation Project.

GOLDSMITH, H.F., JACKSON, D.J., DOENHOEFER, S., JOHNSON, W., TWEED, D.L., STILES, D., BARBANO, J.P., & WARHEIT, G. (1984). The Health Demographic Profile System's inventory of small area social indicators (National Institute of Mental Health Series BN No. 4, DHHS Publication No. ADM 84-1354). Washington, DC: U.S. Government Printing Office.

GROSSER, R.C. (1981, March). A model to estimate population in need of mental health services by catchment area. Presentation at the Third National Conference on Needs Assessment in Health and Human Service Systems, Louisville, KY.

HOLZER, C.E., JACKSON, D.J., & TWEED, D.L. (1981). Horizontal synthetic estimation: A social-area demographic estimation procedure for use in mental health needs assessment. Evaluation and Program Planning, 4(1), 29-34.

HOLZER, C.E., SWANSON, J.W., GANJU, V.K., GOLDSMITH, H.F., & JACKSON, D.J. (1989). Estimates of need for mental health services in Texas counties. In C.M. Bonjean, M.T. Coleman, & I. Iscoe (Eds.), Community care of the chronically mentally ill. Austin, TX: University of Texas and Hogg Foundation for Mental Health.

REGIER, D.A., MYERS, J.K., ROBINS, L.N., HOUGH, R.L., & LOCKE, B.Z. (1984). The NIMH Epidemiologic Catchment Area program. Archives of General Psychiatry, 41(10), 934-941.

SCHWAB, J.J., BELL, R.A., WARHEIT, G.J., & SCHWAB, R. (1979). Social order and mental illness. New York: Brunner/Mazel.

SLEM, C.M. (1975). Mental health needs assessment: Prediction of census tract utilization patterns using the Mental Health Demographic Profile System. Unpublished Ph.D. dissertation, Department of Psychology, Wayne State University, Detroit.

SOBEL, S.B., ROSEN, B.M., & GOLDSMITH, H.F. (1978, July). Analysis of the needs assessments in the 1976 state plans (Mental Health Study Center Laboratory Paper No. 45). Rockville, MD: National Institute of Mental Health.

YARVIS, R.M., & EDWARDS, D.W. (1980). Planning: The design of mental health programs. Sacramento, CA: Pyramid Systems.

R.M., EDWARDS,

D.W., & YARVIS, M. (1984, October).

Measuring impairment: Synthetic estimate vs. survey (Unpublished manuscript).

Davis, CA: Pyramid

systems.