Journal of Clinical Epidemiology 54 (2001) 1079–1080
COMMENTARY
Scaling the heights of quality of life
Robert L. Kane*
University of Minnesota School of Public Health, Minneapolis, MN, USA
Received 21 March 2001; accepted 21 March 2001
Abstract
Measuring always involves abstracting reality. Measuring an abstraction like quality of life is a daunting task. Numerous conceptual and methodological issues must be addressed, but the challenges should not deter the journey. © 2001 Elsevier Science Inc. All rights reserved.
* Corresponding author. University of Minnesota School of Public Health, Mayo Mail Stop 197 (Room D351), 420 Delaware Street Southeast, Minneapolis, MN 55455-0374, USA. Tel.: 612-624-1185; fax: 612-624-8448.

While focusing attention on the centrality of quality of life (QoL) in assessing the effectiveness of health care is a laudable goal, it may also be viewed as a Promethean act of hubris. Encouraging clinicians and clinical researchers to expand their definition of clinical success to address the larger issues that affect a person’s life, even restricting them to what can reasonably be influenced by health care (usually labeled as health-related QoL), serves a useful purpose. Nonetheless, measuring always involves abstracting reality. Measuring a vague concept with an abstraction is even more daunting. There is no unanimously agreed upon definition of quality of life. Indeed, the very idea evokes very personal individual decisions [1,2]. Efforts to generate consensus about just what constitutes quality of life have achieved some levels of agreement, but history teaches the difference between consensus and wisdom.

The challenge is made all the more difficult by the end users’ insistence that the measures be brief, lest the burden of assessment interfere with collecting other more valued data. The search for the SF-1 continues! The very idea of trying to capture the elements of a life in a few questions seems ludicrous at first blush.

Despite its importance, not many have ventured into this arena. Work in this area has generally been concentrated in a few centers. One of these has even appointed itself a clearinghouse for certifying the quality of work in this area.

The challenges to measuring QoL are several. The first lies in capturing the essential elements in a form and format that means the same thing to different people. The second lies in developing measures that can reflect each domain approached. The third involves determining whether the same
domains (and questions) hold the same importance for different respondents.

Progress is best reported as mixed. There is some degree of agreement about what domains constitute QoL, but there is also a fair amount of difference in coverage, and certainly in emphasis [3]. Moreover, it is not at all clear that the same domains apply to everyone, and there are certainly large questions about whether the emphasis should be the same for all populations. For example, would one use the same constructs to address QoL among nursing home residents that one used to address the results of a surgical procedure in otherwise healthy young adults? If so, would some aspects that are taken for granted among the well need to be explicitly explored among the frail?

An unresolved question is whose values should be used to establish QoL. Should efforts be made to tap the values of a representative sample of: the population affected, everyone, policymakers? Can any mean value accurately portray an individual’s values? Should it? At one level, QoL is an immensely personal concept. No stranger can determine what represents QoL for me or what aspects of that construct are most important to me. On the other hand, a major rationale for emphasizing QoL is to provide policy tools that employ a common metric to facilitate comparisons. Allowing every respondent to set his/her own value weights would undermine comparability.

The work on QoL has taken two general routes. In an effort to carefully examine the relative preference assigned to various states, investigators have used a variety of comparative techniques designed to increase the reliability of assumptions about the ordinal, and even ratio, properties of the elements. Unfortunately these have usually relied upon rather artificial scenarios that encourage respondents to engage in abstract thinking [4–6]. Even when people with serious disease were queried, their alternatives were artificial [7].
This approach results in mathematical sophistication being applied to unrealistic material. The other strategy applies cruder methods to more real-world problems. For example, respondents are asked to rate or rank experiences or states that they have experienced.

0895-4356/01/$ – see front matter © 2001 Elsevier Science Inc. All rights reserved. PII: S0895-4356(01)00394-8

QoL can be approached as both a generic measure and a disease-specific one. The former is designed to be used across all conditions, whereas the latter has a more constrained application. In exchange, the disease-specific measure should capture more of the elements within a constant set of domains (or constructs) in more detail, thereby permitting sharper discrimination.

The work presented in the article by Wong et al. reflects many of these tensions. It tests the proposition that restricting QoL items to those problems actually experienced by an individual will change the resultant QoL score, or its ability to discriminate. The underlying methods reflect the state of the field. To validate the effects, they use comparisons with elements of three different extant QoL measures. Basing the validation comparison on three different versions of what are purportedly reflections of the true QoL level suggests that our ability to detect QoL is flawed. We are relying on three different shadows of a reality we cannot see.

Moreover, the investigators have chosen to include in their summary QoL measure the sum of responses to questions about symptoms, either overall or censored by actual experience. In addition to the symptoms, they also examined activity, leisure, and emotional function for inflammatory bowel syndrome patients, or emotion, leisure, sexuality, vocational issues, and interactions with family and friends for patients with polycystic ovary syndrome. In a sense, using only symptoms that patients experienced is a form of weighting, whereby nonexperienced symptoms are assigned a value of zero; but it is an incomplete approach. Even among the symptoms patients suffer, some may be more important than others or have a greater impact on QoL.
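The point about weighting can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the symptom names, the 0 to 4 rating scale, and the importance weights are all hypothetical and are not taken from Wong et al. or from any published instrument. It shows that scoring only experienced symptoms gives the same result as a weighted sum in which nonexperienced symptoms carry a weight of zero, and that unequal weights among the experienced symptoms would change the score again.

```python
# Illustrative sketch only: symptom names, ratings, and weights are hypothetical,
# not data or methods from Wong et al.
from typing import Dict, Set


def weighted_score(ratings: Dict[str, int], weights: Dict[str, float]) -> float:
    """Sum of weight * rating; items missing from `weights` count as zero."""
    return sum(weights.get(item, 0.0) * rating for item, rating in ratings.items())


def indicator_weights(ratings: Dict[str, int], experienced: Set[str]) -> Dict[str, float]:
    """Weight 1 for symptoms the patient experienced, 0 for all others."""
    return {item: 1.0 if item in experienced else 0.0 for item in ratings}


# Hypothetical patient: how troubling each symptom is (or would be),
# on an assumed scale from 0 (not at all) to 4 (extremely).
ratings = {"abdominal_pain": 4, "fatigue": 2, "bloating": 3, "nausea": 1}
experienced = {"abdominal_pain", "fatigue"}

# Score over all items, regardless of experience.
all_items = sum(ratings.values())

# Score restricted to symptoms the patient actually experienced.
experienced_only = sum(r for s, r in ratings.items() if s in experienced)

# The same restriction expressed as 0/1 weights.
as_zero_weights = weighted_score(ratings, indicator_weights(ratings, experienced))

# Hypothetical importance weights among experienced symptoms:
# this patient counts pain as far more important than fatigue.
importance = {"abdominal_pain": 2.0, "fatigue": 0.5}
importance_weighted = weighted_score(ratings, importance)

print(all_items, experienced_only, as_zero_weights, importance_weighted)
# prints: 10 6 6.0 9.0
```

The 0/1 weighting reproduces the experienced-only score exactly, which is why it can be read as a special and incomplete case of weighting; any richer set of weights is a further assumption about whose values count.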
Research suggests that weighting does matter. Not only do different groups apply different weights, but applying those weights can also change the results. In a study of post-acute care, the results differed when patient and professional weights were applied to the Activities of Daily Living (ADL) scale [8]. In a study of QoL for patients with fecal incontinence, a similar difference in outcomes was observed when the weights generated by patients and providers were used [9].

Allowing each patient to rate only those symptoms s/he experienced is a first step toward developing a more individualized QoL score. Whereas it is difficult to compare scales weighted by each individual, it seems reasonable to include only those symptoms actually experienced, because the occurrence of the symptoms can be considered part of the
outcome itself. It is, therefore, disappointing to note that the results Wong et al. report using all items fared better than those using only the problem items. The problem may lie with the validation criteria. While imperfect, the physical subscales of the SF-36 and the Sickness Impact Profile (SIP) seem reasonable criteria. However, these two scales were at least as (and often more) highly correlated with measures of other aspects such as activity, leisure, and vocational issues. The SIP has been subjected to modest value preference weights [10]. The SF-36 has not. Should these results, which fail to show better concordance with reference generic measures that do not reflect individual values, spur a reassessment of a philosophic position that emphasizes the importance of examining individuals’ values? Should the concept of working toward more individualized responses be abandoned? We have already acknowledged that, for policy analysis purposes, it is awkward to use individual value weights because it impedes comparisons, but if the underlying philosophic directive emphasized the goal of meeting individually weighted, as opposed to socially determined, outcomes, the statistical difficulty could be deemed worth the effort.
References
[1] Gill TM, Feinstein AR. A critical appraisal of the quality of quality-of-life measurements. JAMA 1994;272:619–26.
[2] Lara-Munoz C, Feinstein AR. How should quality of life be measured? J Investig Med 1999;47(1):17–24.
[3] Frytak JR. Assessment of quality of life in older adults. In: Kane RL, Kane RA, editors. Assessing older persons: measures, meaning, and practical applications. New York: Oxford University Press, 2000. p. 200–36.
[4] Kaplan RM, Feeny D, Revicki D. Methods for assessing relative importance in preference based outcome measures. Quality Life Res 1993;2(6):467–75.
[5] Feeny D, Furlong W, Boyle M, Torrance GW. Multi-attribute health status classification systems: Health Utilities Index. Pharmacoeconomics 1995;6:490–502.
[6] Feeny D. A utility approach to assessing health-related quality of life. Med Care 2000;38(9, Suppl II):151–4.
[7] Torrance GW, Furlong W, Feeny DH, Boyle M. Multiattribute preferences functions: Health Utilities Index. Pharmacoeconomics 1995;6:503–20.
[8] Chen Q, Kane RL. Effects of using consumer and expert ratings of an activities of daily living scale on predicting functional outcomes of postacute care. J Clin Epidemiol 2001;54(4):334–42.
[9] Rockwood TH, Church JM, Fleshman JW, Kane RL, Mavrantonis C, Thorsen AG, Wexner SD, Lowry AC. FIQL: a quality of life instrument for patients with fecal incontinence. Diseases Colon Rectum 2000;43(1):9–17.
[10] Bergner M, Bobbitt RA, Carter WB, Gilson BS. The Sickness Impact Profile: development and final revision of a health status measure. Medical Care 1981;19(8):787–805.