Relationship of Domain-Specific Measures of Health to Perceived Overall Health among Older Subjects


J Clin Epidemiol Vol. 51, No. 1, pp. 11–18, 1998. Copyright © 1998 Elsevier Science Inc. All rights reserved.

0895-4356/98/$19.00 PII S0895-4356(97)00234-5

Gertrudis I. J. M. Kempen,1,* Ida Miedema,1 Geertrudis A. M. van den Bos,2 and Johan Ormel3

1 Northern Centre for Healthcare Research, School of Medicine, University of Groningen, The Netherlands; 2 Institute of Social Medicine, Academic Medical Center/University of Amsterdam, The Netherlands; and 3 Department of Psychiatry, University of Groningen, The Netherlands

ABSTRACT. The associations between nine domain-specific measures of health (i.e., depressive symptoms, psychological distress, mental health, physical functioning, role functioning, social functioning, bodily pain, somatic symptoms, and chronic medical morbidity) and a single-item measure for perceived overall health were studied in an extensive community-based sample of elderly persons (n = 5279). The results showed that: (1) the discriminative power of perceived overall health compared to domain-specific measures of health was moderate to large only at the fair/poor end of the perceived overall health spectrum; (2) a single-item measure of perceived overall health did not cover domain-specific measurements of health, since only 41.8% of the variance in perceived overall health was explained by all domain-specific measures; and (3) the affective domains of functioning (psychological distress, mental health) were weakly related to perceived overall health. Bodily pain, chronic medical morbidity and, to a lesser extent, physical functioning were more strongly related to perceived overall health. These results were fairly consistent for men and women and for three age groups. We conclude that a global, single-item measure of perceived health and domain-specific health measures are not exchangeable in evaluation, survey, or epidemiological research. J Clin Epidemiol 51;1:11–18, 1998. © 1998 Elsevier Science Inc.

KEY WORDS. Perceived overall health, self-rated health, domain-specific measures of health, aged, health status

INTRODUCTION

Health-related quality of life measures are increasingly used to assess the burden of disease, particularly among the elderly. These measures may consist of a single-item question or multi-item (sub)scales measuring several domains, such as physical, social, and affective functioning. Single-item questions of self-rated health are often included in national population surveys [1–3]. In these surveys, researchers want to include the measurement of health in relation to many social and health care characteristics (such as housing, income, leisure, transport, expenditure, care utilization, and health insurance), which must all be covered by a relatively brief interview [4]. Single-item measures are also a valuable tool for clinical caseworkers, health care professionals, or researchers who need to measure health status, but lack the time, skills, or funds to use multiple-item scales [5]. Furthermore, calculations of years of healthy living are often based on ratings of perceived health [6,7].

* Address for correspondence: Dr. G. I. J. M. Kempen, Northern Centre for Healthcare Research, School of Medicine, University of Groningen, A. Deusinglaan 1, 9713 AV Groningen, The Netherlands. Accepted for publication on 30 September 1997.

Global, single-item measures of perceived overall health have been shown to be reliable and valid [5,8] and to predict mortality [9–14] and changes in functional status [15]. In addition, research in the 1980s clearly showed independent associations between perceived overall health and health care utilization among elderly persons [16–18]. However, other research showed that the correlation between single-item health measures and health profiles or multi-domain indices was far from perfect [19–21] and that single-item measures were less precise, less reliable, and less valid than multi-item scales [4]. A study by Krause and Jay [22] revealed that self-ratings of health were particularly affected by different frames of reference. Some study participants considered specific health problems when asked to rate their health, whereas others thought in terms of either general physical functioning or health behaviors. This study further showed that the specific referents used varied with age.

In the present article, we study the associations between global health ratings and several domain-specific measures of health. More specifically, we (1) test the power of the domain-specific health measures to explain differences in global ratings of health status, (2) analyze the strength of the associations between such global health ratings and domain-specific health measures, and (3) study the relative, unique contributions of different domains of health to global health ratings. The overall health rating is assessed with one question which is frequently used in survey and epidemiological research: “In general would you say your health is excellent, very good, good, fair or poor?” The domain-specific measures of health assess levels of affective functioning, levels of behavioral dysfunctioning due to health problems, somatic symptoms, and chronic medical morbidity.

METHODS

The Groningen Longitudinal Aging Study is a population-based prospective follow-up study of the determinants of the health-related quality of life of elderly people, particularly physical and social disability and well-being [23,24]. The main objective is to identify the psychosocial factors that influence the trajectory of quality of life, either independently or in interaction with disease-related factors.

Sample

The source population consisted of late-middle-aged and elderly non-institutionalized persons living in the northern part of the Netherlands. Subjects surveyed were participating in the Groningen Longitudinal Aging Study on functional status and need for care [23,24]. For the baseline sample of this study, we approached all persons aged 57 and older who were registered with 27 general practitioners participating in the Morbidity Registration Network Groningen (n = 8723). In the Netherlands, approximately 99% of non-institutionalized elderly people are on a general practitioner’s panel. The general practitioners asked eligible subjects in writing for permission to provide their name and address to the Groningen Longitudinal Aging Study research team. A total of 1937 refused (22.2%). Of the remaining 6786, 1277 declined cooperation when contacted by the research team, and 152 had died or left the practice by the time contact was initiated.
Another 78 subjects were excluded because of severe cognitive impairment at baseline (a Mini-Mental State Examination score of less than 17) [25]. Useful baseline data were available for 5279 subjects (62% response; 5279/(8723 − 152)); 2312 males (mean age 68.7; SD 7.8) and 2967 females (mean age 70.3; SD 8.3) were included. The baseline assessment was carried out in 1993 and consisted of an interview and a mail questionnaire. The subjects were interviewed by well-trained middle-aged women, either in person at home (n = 4792) or by telephone (n = 487). The interviewers were not acquainted with the interviewees from either a clinical or an administrative situation.

The representativeness of the research sample relative to the source population depended on two potential biases: a general practitioner selection bias and a response bias. In terms of selection bias, the general practitioners involved in the Morbidity Registration Network did not constitute a random sample from the population of general practitioners in the northern part of the Netherlands. However, it is unlikely that the selection of general practitioners caused any bias at patient level. The major difference between network and non-network general practitioners was the former’s involvement in medical training and research at the Department of Family Medicine of the University of Groningen. Owing to characteristics of the Dutch primary health care system, there is no reason to assume that the older patients of the network general practitioners differ from those of non-network general practitioners.

The representativeness of the sample was studied in several ways. First, non-response bias was assessed on the basis of registered sociodemographic data. Non-response proved to be associated with age (34% non-response for 57- to 69-year-olds, 42% for 70- to 84-year-olds, and 67% for those 85 years of age or older) and to some extent with gender (37% for men, 41% for women). Second, using computerized health care utilization records, which were available for 55% of the source population, baseline participants and non-responders were compared on four clusters of general practitioner registered morbidity: malignant neoplasms, ischemic heart disease and congestive heart failure, chronic respiratory disease, and chronic diseases of the locomotor apparatus. Multiple logistic regression analyses including age and gender showed that only malignant neoplasms had a significant effect (p = 0.018) on response: the proportion of patients with malignant neoplasms was higher among non-responders. However, such patients were not of particular interest in the present study. Third, we compared our data with data from the Dutch General Health Surveys.
We found only marginal differences in the prevalences of disability and chronic disease between elderly subjects in the population and the participants in our baseline study. Although the prevalence of the health measures might be affected by the response rate, it is unlikely that the relationship between measures within the study group could be substantially biased by incomplete response. The results of the non-response analyses therefore gave no evidence of response bias relevant to the issues addressed by our study.

Measures

We selected domain-specific measures of health which have been widely used in survey and epidemiological research. The selected measures refer to the following domains: behavioral dysfunctioning, affective functioning, somatic symptoms, and chronic medical morbidity. Behavioral dysfunctioning caused by health problems was assessed with three measures (physical, social, and role functioning scales), which are all part of the MOS Short-Form General Health Survey [26]. Three measures reflected affective functioning (depressive symptoms, mental health, and psychological distress) and two measures reflected somatic symptoms (bodily pain intensity and somatic symptoms). The selected measures for affective functioning were the depressive symptoms subscale of the Hospital Anxiety and Depression Scale (HADS) [27,28], the SF-20/SF-36 subscale for mental health [26,29], and the 12-item version of the General Health Questionnaire (GHQ-12) [30]. Somatic symptoms were measured with the Symptom Check List subscale for somatic symptoms (SCL-SOM) [31] and the SF-20/SF-36 subscale for bodily pain.

A checklist of 19 chronic medical conditions was used to construct an index reflecting the number of chronic medical conditions. Subjects were asked whether they had had a specific chronic medical condition in the 12 months prior to the interview. The same procedure is used by the Dutch Central Bureau for Statistics in its periodic Health Survey Interviews [3]. Selected conditions were mostly those conditions for which an acceptable agreement was observed between self-reports and medical registrations [32]: asthma/chronic bronchitis, other lung diseases (e.g., pulmonary emphysema), heart condition, hypertension, (consequences of) stroke, diabetes mellitus, back problems during at least 3 months or slipped disc, rheumatoid arthritis or other joint complaints, migraine/chronic headache, serious dermatological disorders such as psoriasis and eczema, kidney disease, cancer, thyroid gland disorder, stomach ulcer, multiple sclerosis, other diseases of the nervous system such as Parkinson’s disease or epilepsy, liver disease or gallstones, prostate disease, and leg ulcer. In theory, scores on the index can range from zero (no conditions) to 19 (all conditions).
To reduce potential reporting bias by patients, only “active” conditions were included, that is, conditions for which a general practitioner or medical specialist was consulted, or for which medicines were used, during the 12 months prior to the interview.

The SF subscales for mental health and physical functioning, and the questions on chronic medical conditions, were part of the baseline personal interview. The other scales were all part of the baseline mail questionnaire.

To assess perceived overall health, one self-rated health question was used: “In general would you say your health is excellent, very good, good, fair or poor?” This is the first item of both the SF-20 and SF-36 [26,29].

Data Analysis

To examine the associations between the domain-specific measures of health and perceived overall health, the following steps were taken. First, mean scores and standard deviations were calculated for all domain-specific measures at each of the five levels of perceived overall health. Differences in mean scores and linearity of association were statistically tested with analysis of variance. Second, the discriminative power of perceived overall health compared to the domain-specific measures was further tested with effect size indices [33]. Although such indices are often used to assess responsiveness to clinically meaningful change over time, they are also very useful for analyzing the magnitude of differences between independent observations. Effect sizes were computed for every two adjacent levels of perceived overall health: the difference in mean domain-specific health scores between the two levels of perceived overall health divided by the standard deviation of the two groups [34,35] (see footnote 1). The interpretation of the magnitude of the effect size is often based on Cohen’s rule-of-thumb: an effect size of 0.20 represents a small effect, 0.50 represents a medium effect, and 0.80 or greater represents a large effect [33]. Third, univariate associations between perceived overall health and the selected domain-specific measures were analyzed by means of Pearson correlation coefficients. Finally, multiple regression analyses were conducted with perceived overall health as the outcome variable and the domain-specific measures of health as predictors. The relative contributions of each health domain to the overall measure can be derived from the beta coefficients. The extent to which all domain-specific measures are covered by the overall measure can be derived from the total amount of explained variance.

Previous research on the effects of gender and age on self-rated health showed inconsistent results (for a review, see [36]). To study variations according to age and gender within our sample, correlation and regression analyses were conducted separately for men and women and for three age groups (i.e., 57–64 years, 65–74 years, and 75 years and older) as well as for the total sample.

RESULTS

Table 1 presents the mean scores and standard deviations of all domain-specific health measures at each level of perceived overall health and for the total sample.
Analyses of variance revealed significant differences between the domain-specific mean scores at each level of perceived overall health, except for the differences between “excellent” and “very good” for HADS depressive symptoms, GHQ-12 psychological distress, SF physical functioning, SF role functioning, SF social functioning, and SCL somatic symptoms. In addition, the associations between perceived overall health and all selected domain-specific health measures were found to be linear.

Table 2 shows the effect sizes for every two adjacent levels of perceived overall health in relation to each domain-specific health measure. Effect sizes reflect the magnitude of differences in domain-specific mean scores for each two adjacent levels. The effect sizes of both “excellent” versus “very good” and “very good” versus “good” were relatively low (range: 0.10–0.44). This means that differences between “excellent,” “very good,” and “good” health ratings were weakly reflected by the domain-specific health measures. In contrast, the effect sizes of “good” versus “fair” and “fair” versus “poor” were medium to very large (range: 0.50–1.12). This indicates that differences between “good,” “fair,” and “poor” health ratings are strongly reflected by all domain-specific health measures.

Footnote 1. There is some controversy on how to compute the standard deviation for effect size indices [34]. As suggested by Garssen and Hornsveld [35], we used the pooled standard deviation for independent groups of unequal size:

$$ sd_{pooled} = \sqrt{\frac{(N_1 - 1)\, sd_1^2 + (N_2 - 1)\, sd_2^2}{N_1 + N_2 - 2}} $$

TABLE 1. Mean scores (M) and standard deviations (SD) of nine domain-specific measures of health for each level of perceived overall health. Cells show M (SD).

Domain-specific health measures     Excellent     Very good     Good          Fair          Poor          Total
                                    (n = 666)     (n = 705)     (n = 2055)    (n = 1512)    (n = 117)     (n = 5279)
Affective functioning
  HADS depressive symptoms a        2.4 (2.4)     2.8 (2.6)     3.9 (3.0)     5.8 (3.7)     9.3 (4.5)     4.3 (3.5)
  GHQ-12 psychological distress a   7.4 (2.7)     8.3 (2.9)     9.5 (3.9)     12.4 (5.7)    17.7 (7.2)    10.1 (4.9)
  SF mental health b                87.3 (13.2)   82.5 (14.3)   77.5 (16.6)   68.4 (20.3)   52.4 (23.5)   76.0 (18.9)
Behavioral dysfunctioning
  SF physical functioning b         86.6 (18.8)   81.9 (21.4)   73.1 (25.7)   49.3 (28.4)   22.5 (25.0)   67.8 (29.6)
  SF role functioning b             95.4 (19.2)   92.0 (24.9)   82.3 (35.3)   48.9 (46.1)   10.3 (26.0)   73.7 (41.3)
  SF social functioning b           95.0 (16.3)   93.4 (15.6)   86.0 (21.5)   65.5 (26.5)   35.6 (28.9)   80.9 (25.7)
Somatic symptoms
  SF bodily pain a                  8.3 (19.2)    15.0 (23.0)   26.9 (28.1)   49.1 (29.7)   68.4 (30.2)   30.4 (31.1)
  SCL somatic symptoms a            13.7 (2.4)    14.6 (3.0)    16.0 (3.9)    20.1 (6.0)    26.5 (8.0)    17.0 (5.3)
Chronic medical morbidity
  Number of conditions a            0.4 (0.7)     0.7 (0.8)     1.0 (1.0)     1.9 (1.3)     2.6 (1.6)     1.2 (1.2)

Note: Due to missing values, the numbers of participants per level do not add up to 5279. Analyses of variance revealed (1) significant differences between all domain-specific mean scores for each level of perceived overall health, except for differences between “excellent” and “very good” for HADS depressive symptoms, GHQ-12 psychological distress, SF physical functioning, SF role functioning, SF social functioning, and SCL somatic symptoms (Scheffé test, p < 0.001), and (2) linear associations between perceived overall health and all domain-specific health measures (ANOVA, p < 0.001).
a Higher scores indicate poorer health.
b Higher scores indicate better health.
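The effect sizes in Table 2 can be approximated from the means and standard deviations in Table 1 using the pooled standard deviation described in footnote 1. The following is a minimal Python sketch, not the authors' own code: the function names are illustrative, and it assumes the group sizes printed in the Table 1 column headers, although the actual n per measure was somewhat lower because of missing values.

```python
from math import sqrt


def pooled_sd(n1, sd1, n2, sd2):
    """Pooled standard deviation for two independent groups of unequal size."""
    return sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def effect_size(m1, sd1, n1, m2, sd2, n2):
    """Absolute difference in group means divided by the pooled SD."""
    return abs(m2 - m1) / pooled_sd(n1, sd1, n2, sd2)


# HADS depressive symptoms, "excellent" vs "very good" (values from Table 1).
# Group sizes are the column totals, which is an assumption (see lead-in).
hads = effect_size(2.4, 2.4, 666, 2.8, 2.6, 705)

# SF social functioning, "fair" vs "poor" (values from Table 1).
social = effect_size(65.5, 26.5, 1512, 35.6, 28.9, 117)

print(round(hads, 2), round(social, 2))
```

Under these assumptions the two values round to 0.16 and 1.12, which agrees with the corresponding Table 2 entries. Small discrepancies for other cells would be expected, since Table 1 reports rounded means and standard deviations.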

Table 3 shows the Pearson correlation coefficients between perceived overall health and the nine domain-specific measures of health. All correlation coefficients were statistically significant (p < 0.001). The correlations between the selected measures of affective functioning (mental health, depressive symptoms, and psychological distress) and perceived overall health were lower in absolute value (the highest was 0.40 in the total sample) than the correlations for physical functioning and bodily pain (both 0.48 in absolute value in the total sample).

TABLE 2. Effect sizes of domain-specific measures of health on perceived overall health (n = 5279)

                                    Excellent versus  Very good         Good              Fair
                                    very good         versus good       versus fair       versus poor
Affective functioning
  HADS depressive symptoms          0.16              0.38              0.57              0.93
  GHQ-12 psychological distress     0.32              0.33              0.61              0.91
  SF mental health                  0.35              0.31              0.50              0.78
Behavioral dysfunctioning
  SF physical functioning           0.23              0.36              0.89              0.95
  SF role functioning               0.15              0.29              0.83              0.86
  SF social functioning             0.10              0.37              0.86              1.12
Somatic symptoms
  SF bodily pain                    0.32              0.44              0.77              0.65
  SCL somatic symptoms              0.33              0.38              0.84              1.04
Chronic medical morbidity
  Number of conditions              0.40              0.31              0.79              0.53

Note: Effect size of 0.20 is small, effect size of 0.50 is medium, effect size of 0.80 or above is large.
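The ranges quoted in the text (0.10–0.44 for the two upper contrasts, 0.50–1.12 for the two lower contrasts) and Cohen's rule-of-thumb bands can be checked directly against the Table 2 values. The Python sketch below is only an illustration: the dictionary layout and the function name are assumptions, and the label "below small" for values under 0.20 is a hypothetical addition, since the rule-of-thumb does not name that band.

```python
from collections import Counter

# Effect sizes copied from Table 2, in row order: HADS, GHQ-12, SF mental health,
# SF physical, SF role, SF social, SF bodily pain, SCL somatic, number of conditions.
TABLE2 = {
    "excellent vs very good": [0.16, 0.32, 0.35, 0.23, 0.15, 0.10, 0.32, 0.33, 0.40],
    "very good vs good": [0.38, 0.33, 0.31, 0.36, 0.29, 0.37, 0.44, 0.38, 0.31],
    "good vs fair": [0.57, 0.61, 0.50, 0.89, 0.83, 0.86, 0.77, 0.84, 0.79],
    "fair vs poor": [0.93, 0.91, 0.78, 0.95, 0.86, 1.12, 0.65, 1.04, 0.53],
}


def cohen_label(es):
    """Map an effect size to Cohen's rule-of-thumb bands (0.20 / 0.50 / 0.80)."""
    if es >= 0.80:
        return "large"
    if es >= 0.50:
        return "medium"
    if es >= 0.20:
        return "small"
    return "below small"  # hypothetical label for values under 0.20


# Smallest and largest effect size per contrast.
ranges = {pair: (min(v), max(v)) for pair, v in TABLE2.items()}
# Number of measures falling in each band, per contrast.
labels = {pair: Counter(cohen_label(es) for es in v) for pair, v in TABLE2.items()}

for pair in TABLE2:
    print(pair, ranges[pair], dict(labels[pair]))
```

Taken together, the first two contrasts span 0.10 to 0.44 and the last two span 0.50 to 1.12, matching the ranges reported in the text; none of the measures reaches even a medium effect in the two upper contrasts.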

TABLE 3. Pearson correlation coefficients between perceived overall health a and nine domain-specific measures of health for men and women, three age groups, and total sample

Domain-specific health measures     Men           Women         57–64 years   65–74 years   75+ years     Total sample
                                    (n = 2312)    (n = 2967)    (n = 1834)    (n = 2083)    (n = 1362)    (n = 5279)
Affective functioning
  HADS depressive symptoms a        0.39          0.38          0.38          0.42          0.34          0.39
  GHQ-12 psychological distress a   0.40          0.39          0.38          0.41          0.39          0.40
  SF mental health b                −0.38         −0.37         −0.37         −0.40         −0.33         −0.37
Behavioral dysfunctioning
  SF physical functioning b         −0.47         −0.49         −0.49         −0.50         −0.41         −0.48
  SF role functioning b             −0.45         −0.43         −0.41         −0.46         −0.41         −0.44
  SF social functioning b           −0.46         −0.45         −0.45         −0.48         −0.42         −0.46
Somatic symptoms
  SF bodily pain a                  0.45          0.50          0.50          0.50          0.43          0.48
  SCL somatic symptoms a            0.47          0.47          0.46          0.49          0.46          0.47
Chronic medical morbidity
  Number of conditions a            0.43          0.44          0.43          0.45          0.41          0.44

Note: All correlations significant at p < 0.001.
a Higher scores indicate poorer health.
b Higher scores indicate better health.

Generally, the correlations were somewhat higher for subjects aged between 65 and 74 compared to subjects aged 75 and older. The differences between men and women were minimal.

The results of the multiple regression analyses of the unique contribution of the domain-specific measures to perceived overall health are presented in Table 4. For the total sample, 41.8% of the variance in perceived overall health was explained by the domain-specific measures of health, and all beta coefficients were statistically significant except for somatic symptoms. The highest beta coefficients were observed for bodily pain (0.19), chronic medical morbidity (0.18), and physical functioning (−0.12). This was fairly consistent for all subgroups except for physical functioning in the oldest group. The amount of variance in perceived overall health explained by the three measures for affective functioning (depressive symptoms, psychological distress, and mental health) in the total sample was 21.4% (p < 0.001) (not in the table). For behavioral dysfunctioning caused by health problems (physical functioning, role functioning, and social functioning), somatic symptoms (bodily pain and somatic symptoms), and chronic medical morbidity these percentages were 29.8, 28.8, and 19.4, respectively (all p < 0.001). Although the amount of variance explained by chronic medical morbidity alone was relatively low, the unique contribution of this variable to perceived overall health was relatively high (beta coefficient of 0.18, see Table 4). The difference in explained variance between men and women was very small. The amount of explained variance for the oldest subjects was lower than for the other two age groups. However, an additional analysis including all subjects (not in the table) showed that only 0.6% variance (F-change: 24.9, p < 0.001) in perceived overall health was added by age and gender when the domain-specific health measures were already included in the regression equation.

TABLE 4. Multiple regression analysis of perceived overall health a on nine domain-specific measures of health for men and women, three age groups, and total sample. Cells show β (SE β).

Domain-specific health measures     Men             Women           57–64 years     65–74 years     75+ years       Total sample
                                    (n = 2312)      (n = 2967)      (n = 1834)      (n = 2083)      (n = 1362)      (n = 5279)
Affective functioning
  HADS depressive symptoms a        0.11* (0.020)   0.08* (0.019)   0.11* (0.023)   0.13* (0.022)   0.07 (0.028)    0.10* (0.014)
  GHQ-12 psychological distress a   0.05 (0.022)    0.07* (0.020)   0.06 (0.025)    0.04 (0.024)    0.10 (0.030)    0.06* (0.015)
  SF mental health b                −0.09* (0.021)  −0.10* (0.020)  −0.09* (0.026)  −0.07 (0.024)   −0.05 (0.030)   −0.07* (0.015)
Behavioral dysfunctioning
  SF physical functioning b         −0.11* (0.022)  −0.15* (0.021)  −0.16* (0.026)  −0.14* (0.024)  −0.07 (0.030)   −0.12* (0.015)
  SF role functioning b             −0.10* (0.022)  −0.05 (0.020)   −0.05 (0.024)   −0.07 (0.023)   −0.11* (0.030)  −0.07* (0.015)
  SF social functioning b           −0.10* (0.022)  −0.10* (0.019)  −0.10* (0.024)  −0.11* (0.023)  −0.12* (0.028)  −0.10* (0.014)
Somatic symptoms
  SF bodily pain a                  0.16* (0.020)   0.21* (0.018)   0.22* (0.023)   0.19* (0.022)   0.14* (0.027)   0.19* (0.014)
  SCL somatic symptoms a            0.08* (0.022)   0.03 (0.020)    −0.02 (0.027)   0.02 (0.025)    0.08 (0.031)    0.03 (0.015)
Chronic medical morbidity
  Number of conditions a            0.18* (0.019)   0.17* (0.017)   0.17* (0.021)   0.17* (0.020)   0.19* (0.026)   0.18* (0.013)
Overall R²                          42.2%           42.6%           42.2%           43.5%           37.6%           41.8%
Overall F                           177.8*          232.1*          142.1*          170.0*          84.1*           399.8*

*p < 0.001.
a Higher scores indicate poorer health.
b Higher scores indicate better health.

CONCLUSION AND DISCUSSION

The main objective of this study was to analyze the cross-sectional relationship of domain-specific measures to perceived overall health. What can be concluded from the results of our study?

First, the discriminative power of the perceived overall health rating compared to the domain-specific health measures depends on the level of perceived overall health. Differences between “good,” “very good,” and “excellent” levels of perceived overall health were weakly reflected by domain-specific measures of health. In contrast, differences between “good,” “fair,” and “poor” levels of perceived overall health were moderately to strongly reflected by all domain-specific measures of health, particularly by physical and social functioning and by somatic symptoms.

Second, a single-item measure of perceived overall health did not cover a set of domain-specific measurements of health. Although all selected domains of health had significant unique contributions to perceived overall health in the total sample, except for SCL somatic symptoms, the amount of variance in the single-item perceived health question explained by all domain-specific health measures was not extremely high (41.8%). This may be because single-item measures tend to be less reliable than multi-item measures. If the single item on perceived overall health is fairly unreliable, then, from a theoretical point of view, 41.8% may be all of the reliable variance in this measure (the other part could be measurement error). We studied this topic from two perspectives. First, we repeated the regression analyses of Table 4 with the total SF-20 health perceptions scale (five items; the internal reliability estimate, Cronbach’s alpha, for this scale was 0.89 and the mean of the inter-item correlations was 0.63) as outcome measure instead of the single-item perceived health question. The amount of variance explained in the total sample now increased from 41.8% to 55.1%. For the subgroups according to age and gender, the increase in variance explained varied from 12.3% to 15.7%. The beta coefficients for SF mental health and SF physical functioning in the total sample decreased to −0.02 (nonsignificant) and −0.04 (p < 0.01), respectively. The beta coefficient for SCL somatic symptoms increased to 0.12 and became statistically significant (p < 0.001). However, the beta coefficients for SF bodily pain and chronic medical morbidity remained by far the highest (0.18 and 0.20, respectively). Although the amount of variance explained increased substantially and the relative contribution of each health domain to the overall measure changed with the 5-item scale as outcome measure, a large amount of variance is still not explained. Second, in a previous study the test-retest reliability of the six SF-20 subscales in 354 elderly Dutch subjects was analyzed [21]. An additional analysis showed that the 8-week test-retest reliability of the SF single-item measure on perceived overall health was only 0.03 lower than the test-retest reliability of the whole 5-item SF-20 subscale on health perceptions (correlation of 0.82 versus 0.85); the difference in the single-item mean score within this 8-week period was far from significant (Student’s t-test, p = 0.623).
This latter result indicates that the error of measurement for the single-item measure is not too high. Therefore, we may conclude that the amount of variance in perceived overall health explained by the domain-specific measures (41.8%) in the present study was moderate and somewhat underestimated due to measurement error. Previous researchers stated that the significant associations between single-item and domain-specific measures of health indicate that a simple, one-item measure is an acceptable method of assessing health [5]. However, our results showed that large amounts of variance in perceived overall health were still not explained. The results were fairly consistent for men and women and for three age groups, although the strength of association between the domain-specific measures and perceived overall health was generally lowest for the oldest participants. We may conclude that a single-item measure of perceived overall health is not equivalent to domain-specific measurements of health. The relatively low amount of explained variance suggests that other variables probably influence levels of perceived overall health independently of domain-specific measurements of health. In addition, some of the predictors were measured quite roughly. For example, the number of medical conditions was simply a summation of a checklist without any fine tuning for severity or nature of conditions.

Third, we found relatively high unique contributions of chronic medical morbidity, bodily pain, and physical functioning to perceived overall health. The contributions of affective functioning (particularly GHQ-12 psychological distress and SF mental health) were lower. This is fairly consistent with previous research [37]. We may conclude that levels of affective functioning are rather weakly expressed in overall perceived health.

The selected domain-specific measures of health were interdependent. The problem with collinear variables in multiple regression analysis is that they provide similar information. As a consequence, it is difficult to separate the effects of the individual variables. However, only two (out of 36) correlations between the selected predictors exceeded 0.60 in absolute value: physical functioning and role functioning (0.62), and mental health and psychological distress (−0.62). Additional regression analyses without SF-20 role functioning and GHQ-12 psychological distress showed somewhat different beta coefficients. However, these additional results still supported our conclusions. This implies that bias due to multicollinearity is probably not very strong.

In our article we modelled the domain-specific measures of health as independent variables and the global, single-item question as the outcome measure. With a cross-sectional design, however, it is not possible to study causal relationships. One may hypothesize that perceived overall health affects levels of functioning and vice versa. Future longitudinal research can be focused on at least two topics. First, more research is needed to test the sensitivity to change or responsiveness of both overall health measures and domain-specific measures after, for example, major health events or interventions.
Second, research should be focused on comparing the predictive power of perceived overall health measures with that of domain-specific measures for health care utilization, institutionalization, and mortality. Results from such studies may yield additional information on whether or not perceived overall health measures and domain-specific measures are comparable.

We conclude that global, single-item measures of health and domain-specific measures of health are not exchangeable in evaluation, survey, or epidemiological research. Their association is not very strong, and the discriminative power of perceived overall health levels compared to domain-specific measures of health is moderate to large only at the fair/poor end of the perceived overall health spectrum.

We thank Dr. Michael VonKorff, Center for Health Studies, Group Health Cooperative of Puget Sound in Seattle, Washington, for his valuable comments on a draft version of this article. The research reported is part of the Groningen Longitudinal Aging Study (GLAS). GLAS is conducted by the Northern Centre for Healthcare Research (NCH) and various Departments of the University of Groningen (RUG). The primary departments involved are Health Sciences (Dr. G. I. J. M. Kempen and Prof. Dr. W. J. A. van den Heuvel), Family Medicine (Prof. Dr. B. Meyboom-de Jong), Psychiatry (Prof. Dr. J. Ormel), Sociology (ICS) (Prof. Dr. S. M. Lindenberg), and Human Movement Sciences (Prof. Dr. P. Rispens). Directors of GLAS are Dr. G. I. J. M. Kempen and Prof. Dr. J. Ormel. GLAS and its substudies are financially supported by the Dutch government (through NESTOR), the University of Groningen, the School of Medicine, the Dutch Cancer Foundation (NKB/KWF), and the Netherlands Organization for Scientific Research (NWO). The central office of GLAS is located at the NCH, Antonius Deusinglaan 1, 9713 AV Groningen, The Netherlands.

References

1. U.S. Department of Health, Education and Welfare. Current Estimates from the Health Interview Survey. Washington, DC: U.S. Government Printing Office; 1973.
2. British Office of Population Censuses and Surveys. The General Household Survey. London: HMSO; 1973.
3. Dutch Central Bureau for Statistics. The Health Interview Survey (in Dutch). Voorburg/Heerlen, The Netherlands: CBS; 1989.
4. Wilkin D, Hallam L, Doggett MA. Measures of Need and Outcome for Primary Health Care. Oxford: Medical Publications; 1992.
5. Cunny KA, Perri M. Single-item vs multiple-item measures of health-related quality of life. Psychol Rep 1991; 69: 127–130.
6. Van Ginneken JKS, Dissevelt AG, Van de Water HPA, Van Sonsbeek JLA. Results of two methods to determine health expectancy in the Netherlands 1981–1985. Soc Sci Med 1991; 32: 1129–1136.
7. Van de Water HPA, Boshuizen HC, Perenboom RJM. Health expectancy in the Netherlands 1983–1990. Eur J Public Health 1996; 6: 21–28.
8. Ware JE, Davies-Avery A, Donald CA. Conceptualization and Measurement of Health for Adults in the Health Insurance Study: Vol. V. General Health Perceptions. Santa Monica: The RAND Corporation; 1978.
9. Mossey JM, Shapiro E. Self-rated health: A predictor of mortality among the elderly. Am J Public Health 1982; 72: 800–808.
10. Idler EL, Kasl SV, Lemke JH. Self-evaluated health and mortality among the elderly in New Haven, Connecticut, and Iowa and Washington counties, Iowa, 1982–1986. Am J Epidemiol 1990; 131: 91–103.
11. Idler EL, Kasl SV. Health perceptions and survival: Do global evaluations of health status really predict mortality? J Gerontol Soc Sci 1991; 46: s55–s65.
12. Wolinsky FD, Johnson RJ. Perceived health status and mortality among older men and women. J Gerontol Soc Sci 1992; 47: s304–s312.
13. Grant MD, Piotrowski ZH, Chappell R. Self-reported health and survival in the longitudinal study of aging, 1984–1986. J Clin Epidemiol 1995; 48: 375–387.
14. Hays JC, Schoenfeld D, Blazer DG, Gold DT. Global self-ratings of health and mortality: Hazard in the North Carolina Piedmont. J Clin Epidemiol 1996; 49: 969–979.
15. Idler EL, Kasl SV. Self-ratings of health: Do they also predict change in functional ability? J Gerontol Soc Sci 1995; 50: s344–s353.
16. Branch L, Jette DA, Evashwick C, Polansky M, Rowe G, Diehr P. Toward understanding elders’ health service utilization. J Community Health 1981; 7: 80–92.
17. Wan TTH, Arling G. Differential use of health services among disabled elderly. Research on Aging 1983; 3: 411–431.
18. Evashwick C, Rowe G, Diehr P, Branch L. Factors explaining the use of health care services by the elderly. Health Services Research 1984; 19: 357–382.
19. Read JL, Quinn RJ, Hoefer MA. Measuring overall health: An evaluation of three important approaches. J Chron Dis 1987; 40: 7S–19S.
20. Leavey R, Wilkin D. A comparison of two health survey measures of health status. Soc Sci Med 1988; 27: 269–275.
21. Kempen GIJM. The MOS Short-form General Health Survey: Single item vs multiple measures of health-related quality of life; some nuances. Psychol Rep 1992; 70: 608–610.
22. Krause NM, Jay GM. What do global self-rated health items measure? Med Care 1994; 32: 930–942.
23. Kempen GIJM, Jelicic M, Ormel J. Personality, chronic medical morbidity and health-related quality of life among older persons. Health Psychology. (In press)
24. Kempen GIJM, Ormel J, Brilman EI, Relyveld J. Adaptive responses among Dutch elderly: The impact of eight chronic medical conditions on health-related quality of life. Am J Public Health 1997; 87: 38–44.
25. Folstein MF, Folstein SE, McHugh PR. Mini-Mental State; A practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res 1975; 12: 189–198.
26. Stewart AL, Hays RD, Ware JE. The MOS Short-form General Health Survey—reliability and validity in a patient population. Med Care 1988; 26: 724–735.
27. Zigmond AS, Snaith RP. The hospital anxiety and depression scale. Acta Psychiatrica Scandinavica 1983; 67: 361–370.
28. Spinhoven Ph, Ormel J, Sloeckers PPA, Kempen GIJM, Speckens AEM, Van Hemert AM. A validation study of the Hospital Anxiety and Depression Scale (HADS) in different groups of Dutch subjects. Psychological Medicine 1997; 27: 363–370.
29. Ware JE, Sherbourne CD. The MOS 36-item short-form health survey (SF-36); I. Conceptual framework and item selection. Med Care 1992; 30: 473–483.
30. Goldberg D, Williams P. A User’s guide to the General Health Questionnaire. Windsor: NFER-Nelson; 1988.
31. Derogatis LR, Rickels K, Rock AF. The SCL-90 and the MMPI: A step in the validation of a new self-report scale. Br J Psychiat 1976; 128: 280–289.
32. Van den Bos GAM. The burden of chronic diseases in terms of disability, use of health care and healthy life expectancies. Eur J Public Health 1995; 5: 29–34.
33. Cohen J. A power primer. Psychological Bulletin 1992; 112: 155–159.
34. Van Bennekom CAM, Jelles F, Lankhorst GJ, Bouter LM. Responsiveness of the Rehabilitation Activities Profile and the Barthel Index. J Clin Epidemiol 1996; 49: 39–44.
35. Garssen B, Hornsveld H. Power analysis (in Dutch). Gedragstherapie 1992; 25: 107–121.
36. Moum T. Self-assessed health among Norwegian adults. Soc Sci Med 1992; 35: 935–947.
37. Davies AR, Ware JE. Measuring Health Perceptions in the Health Insurance Experiment. Santa Monica: The RAND Corporation; 1981.