Performance-integrated self-report measurement of physical ability


The Spine Journal 10 (2010) 433–440

Technical Report

Vert Mooney, MD (a, b); Leonard N. Matheson, PhD (a, c, *); Joe Verna, DC (a); Scott Leggett, MS (a); Thomas E. Dreisinger, PhD (a); John M. Mayer, DC, PhD (a, d)

(a) U.S. Spine & Sport Foundation, San Diego, CA 92123, USA
(b) Department of Orthopaedic Surgery, School of Medicine, University of California, San Diego, CA 92103, USA
(c) EpicRehab, Saint Charles, MO 63303, USA
(d) School of Physical Therapy & Rehabilitation Sciences, College of Medicine, University of South Florida, Tampa, FL 33612, USA

Received 22 December 2009; accepted 5 February 2010

Abstract

BACKGROUND CONTEXT: The technology of self-report measures has advanced rapidly over the past few years. Recently, this technology was used to develop a performance-integrated self-report measure for use with patients with musculoskeletal impairments that may lead to work disability. Psychometric studies of the new measure in patient populations have been successful. A validation study of the measure with adults in good general health is necessary.

PURPOSE: The purpose of this study was to assess the concurrent validity of a new performance-integrated self-report measure, the multidimensional task ability profile (MTAP).

STUDY DESIGN/SETTING: A prospective validation study was conducted in which a self-report measure was administered online, and a physical performance test was administered at various clinics in North America.

PATIENT SAMPLE: One hundred ninety-six (34% male) adult volunteers in good general health participated in this study.

OUTCOME MEASURES: Self-report measure: MTAP. Physiologic measure: EPIC Lift Capacity test.

METHODS: The MTAP was administered online within 1 week of formal testing of lift capacity using a standardized lift capacity test, the EPIC Lift Capacity test. MTAP scores were compared with performance on the EPIC Lift Capacity test. Stepwise regression analysis was used to identify the strength of the relationship between the two measures and the relative explanation of lift capacity variance by the MTAP score, along with gender and age.

RESULTS: The combination of MTAP score, gender, and age demonstrated a multiple correlation coefficient of R = 0.82, which accounts for 67.3% of the variance in lift capacity.

CONCLUSIONS: The MTAP displayed good concurrent validity compared with actual physical performance as assessed by the EPIC Lift Capacity test. Modern performance-integrated self-report measures, such as the MTAP, have the potential to provide information about functional capacity that is sufficiently useful to confirm status and help guide treatment algorithms.

© 2010 Elsevier Inc. All rights reserved.

Keywords:

Self-report measures; Functional capacity; Physical function; Disability

FDA device/drug status: not applicable.

Author disclosures: LNM (stock ownership, including options and warrants, Multidimensional Task Ability Profile questionnaire, EPIC Lift Capacity test; board of directors, US Spine and Sport Foundation); JMM (consulting, Palladian Muscular Skeletal Health; scientific advisory board, Palladian Muscular Skeletal Health; other office, US Spine & Sport Foundation; research support: investigator salary, Johnson and Johnson; research support: staff and/or materials, Johnson and Johnson; grants, Johnson and Johnson).

* Corresponding author. EpicRehab, 188 Woodlands Place Court, Saint Charles, MO 63303, USA. Tel.: (636) 724-4556; fax: (636) 898-0954. E-mail address: [email protected] (L.N. Matheson)

1529-9430/$ – see front matter © 2010 Elsevier Inc. All rights reserved. doi:10.1016/j.spinee.2010.02.010

Introduction

Over the past 20 years, there has been a strong trend in health care for self-report measures to become more integrated in diagnostic and treatment processes. Self-report measures are well established in certain disciplines, especially psychiatry and clinical psychology, where they are used routinely to facilitate diagnosis. Beginning with funding for the RAND Corporation in 1990 to develop a self-report measure to facilitate decision making (and funding) for hip replacement surgery (the Short Form-36 [1]), there


has been sustained US government interest in self-report measures to determine the cost-effectiveness of health treatments. Over the past 7 years, the National Institutes of Health has provided support for the Patient-Reported Outcomes Measurement Information System (PROMIS) project [2], which has coincided with other projects sponsored by the Centers for Medicare & Medicaid Services (CMS) [3–6] that use self-report measures as key components of algorithm-driven medical treatment. In these programs, evidence-based medicine is implemented with algorithms that use self-report measures, and the measures also are used to assess progress and outcomes. The American Medical Association’s (AMA) Guides to the Evaluation of Permanent Impairment [7] have recently incorporated self-report measures but have not fully embraced the use of these measures. Self-report measures are included as options in the current edition of the Guides, with two disability measures, the short form of the Disabilities of the Arm, Shoulder, and Hand (DASH) [8–10] questionnaire, and the Pain Disability Questionnaire (PDQ) [11,12], included in several examples in the Upper Extremity chapter and in the Spine chapter, respectively. Although this may be a ‘‘step in the right direction,’’ the use of disability measures such as these is only a partial and inadequate solution to the validity problems in the Guides that self-report measures can address. To understand the potential value of self-report measures, it is necessary to distinguish between measures of disability, such as the DASH and PDQ, and those that measure ability, such as the multidimensional task ability profile (MTAP) [13,14], which is the focus of the current research. Disability is more difficult to measure than ability because the former construct is more complex than the latter, requiring measures of disability to be more complex. 
The AMA has recognized this complexity by adopting the International Classification of Functioning, Disability, and Health as its conceptual model for the Guides [7]. ‘‘The ICF model appears to be the best model for the Guides. It acknowledges the complex and dynamic interactions between an individual with a given health condition, the environment, and personal factors.’’ (p3) Thus,

measures such as the DASH and PDQ must include items that query these ‘‘complex and dynamic interactions.’’ In addition to the issue of complexity, the construct of disability also considers the effect of symptoms on function, which causes these measures to be inherently more subjective than are measures of ability. The inclusion of factors that are necessarily subjective is an important concern when financial gain can be affected by the misuse of self-report measures. The DASH and PDQ include physical ability items along with items that measure symptoms to generate a total score that reflects more than just physical ability. Because these measures include subjective items, they are considered less trustworthy, whereas measures of ability based on performance are accepted more readily because they are considered to be objective [15–19]. However, the juxtaposition of subjective measures against objective measures is artificial; these exist along a continuum that is bidimensional, combining both objective and subjective components such as those described in Fig. 1. In this model, measures of ability that are based on performance tend to be more objective than subjective, whereas measures of ability that are based on self-report tend to be more subjective than objective, but each type of measure has subjective and objective components. For example, performance ability measures are volitional, which contributes a significant subjective component, whereas self-report ability measures are based on experience, which contributes a significant objective component. Recently, a new type of self-report measure has been developed that takes advantage of the bidimensional nature of ability, the performance-integrated self-report measure. A performance-integrated self-report measure includes only items that are linked to demonstrable physical ability, with items that are quantified on an interval scale. 
Performance-integrated self-report measures use item response theory (IRT) to empirically calibrate items and provide interval data, much like performance measures, allowing mathematical comparisons. The first performance-integrated self-report measure is the MTAP, which is composed of items and a rating scale, both of which are linked to demonstrable performance.

Fig. 1. Bidimensional model of ability.


The present study examined the validity of the MTAP as a performance-integrated self-report measure for predicting lift capacity. In the evaluation and treatment of persons with spinal impairment, lift capacity is often an important issue, especially when return to work is being considered. The ability to perform lifting tasks depends on three human systems, the cardiovascular, musculoskeletal, and psychophysical systems, all operating simultaneously. The field of ergonomics has studied each of these systems and generally accepts that the primacy of one system is based on the operational design of the lifting test [20–28]. For example, a brief isometric test of lift capacity emphasizes the musculoskeletal system and minimizes the cardiovascular and psychophysical systems. Conversely, a progressive lift capacity test that continuously samples heart rate and cognitive appraisal of the load emphasizes the psychophysical system. The developers of the EPIC Lift Capacity test studied in the current research designed the test to be sensitive to psychophysical limits to optimize safety, which was considered to be especially important because the test is normally used with people who have medical impairments as a consequence of injury [29,30].

Given the design of the EPIC Lift Capacity test to emphasize the psychophysical system, it seems reasonable to expect that a performance-integrated self-report measure, which by definition taps the psychophysical domain, would produce results that would be significantly related to physical test performance. The purpose of this study was to test the hypothesis that self-report assessment of physical function is significantly related to physical test performance.

Methods

Subjects

Adult volunteers (n = 196 subjects; 66 male; 130 female) in good general health were recruited to participate in this study. Volunteers were recruited as part of the training and certification program for the EPIC Lift Capacity test [31] between May 1, 2008 and May 25, 2009. Consenting and testing of the subjects took place at various clinics around North America and Australia, where the EPIC Lift Capacity test equipment was available, with MTAP testing online. Inclusion criteria were English-reading males or females age 18 years and older, self-reported to be in good health following review of a health questionnaire (below), and available to be retested approximately 7 days after the initial test. Exclusion criteria were age younger than 18 years, inability to read English, unresolved health issues after review of the health questionnaire, such as heart disease or hypertension, and inability to be retested within 10 days. Each subject provided informed consent before participation. The study was reviewed and approved by the second author's institutional review board.


Procedures

Physical tests and self-report measures were administered by candidates for certification as an evaluator. To become an EPIC Lift Capacity test certified evaluator, the certificant must participate in a formal training program that requires 6 hours to 8 hours, pass an "open book" written examination, and perform testing, followed by retesting 7 days later, of five healthy subjects. The certificant must be a fully qualified health-care professional such as a physical therapist, kinesiotherapist, or occupational therapist.

In addition to the collection of basic demographic information and information about physical job demands, each volunteer test subject is screened by the certificant through the use of a 42-item health screening questionnaire based on the Cornell Medical Index Health Questionnaire [32], with items that focus on physical restrictions and cardiovascular signs and symptoms. After completion of the health screening questionnaire, all "Yes" responses are reviewed with the test subject by the certificant to assure that maximum physical performance testing can be undertaken with acceptable safety. Immediately before the retest approximately 7 days after the initial test, another health screening questionnaire is administered that inquires about adverse events from the initial test. This is reviewed by the certificant to assure that the retest can be undertaken with acceptable safety. The training and certification program has been in use for 15 years, with test-retest data collected on approximately 8,000 healthy volunteers in North America, Australia, and Asia, with no reports of injury, although the initial test often produces delayed onset muscle soreness.
In 2008, after modification of the informed consent procedures and institutional review board approval, the EPIC Lift Capacity certification program was amended to include the MTAP, which was administered to the volunteer test subject after completion of the retest through the use of the World Wide Web. MTAP test results were automatically transmitted on a secure basis and subsequently married to the EPIC Lift Capacity test results by clerical staff.

Measures

Lift capacity: The EPIC Lift Capacity test is a progressive-demand test of the evaluee's "maximum acceptable weight" over three vertical ranges at both one repetition per cycle and four repetitions per cycle, using standardized equipment and procedures. Beginning with a lift from knuckle height to shoulder height at one repetition per cycle with a 4.54-kg masked weight (the evaluee is naïve about the starting weight and each weight increment), the load is increased in 4.54-kg increments. Heart rate, high-risk workstyle, and psychophysical appraisal are monitored at the end of each lift-lower sequence. Decision rules


guide the evaluator to terminate weight progression in a standardized manner and move on to the next of six subtests or conclude the test battery. Several indicators of less than full effort are monitored. Nine values are obtained that describe performance on the EPIC Lift Capacity test: a maximum acceptable weight for each of the six subtests, total weight for the occasional subtests, total weight for the frequent subtests, and total weight for the test battery. Published studies have demonstrated good reliability and validity [29,30,33,34].

Self-report physical ability: The MTAP [13] is a computer-administered and scored performance-integrated self-report questionnaire composed of 50 items focused on physical work capacity. Each item is a combination of a short text task description and a simple pictorial representation of an adult performing the task, such as in Fig. 2. All tasks are physically demonstrable, with several that include lifting progressive loads over vertical ranges of motion that correspond to the EPIC Lift Capacity test. Items are presented one at a time, beginning with very low physical demand and increasing to a physical demand level that is consistent with the US Department of Labor's "very heavy" physical demand characteristics level. The evaluee responds to each item through the use of a five-point rating scale, from "Able" to "Unable," with a "don't know" option. The MTAP was developed through the use of the Andrich rating scale version of the Rasch IRT model [35,36] in research previously conducted by the authors, so that the response characteristics of each item have been empirically derived, as are the values of the rating scale. This information allowed rational assignment of items to cover the full range of expected ability for the medically impaired population that the instrument was designed to assess. It also allowed development of two statistical methods of screening for consistent patterns of performance: the "infit" and "outfit" statistics. Published studies have demonstrated good reliability and validity [14,37,38].

Data analysis

During the testing process, demographic and lift capacity data were recorded by the certificant using forms designed for the study. The forms were copied by the certificant and sent by mail to the second author's research office, where they were reviewed for completeness by an appropriately trained office worker and married to the MTAP data. MTAP data were collected online after the retest from the certificant's office through the use of a secure server and transmitted on an encrypted basis to the research office. Data were recorded in an Excel spreadsheet. Data analyses, conducted with SPSS 17.0 (SPSS Inc., Chicago, IL, USA), began with separate one-way analyses of variance to examine gender effects on MTAP scores and EPIC Lift Capacity test performance, followed by Pearson product-moment correlations to examine age effects on these variables. Subsequently, forward stepwise multiple regression was used to examine the predictive relationship among MTAP score, age, and gender on lift test performance.
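The Measures section names the Andrich rating scale version of the Rasch model without stating it. As a minimal sketch, the model gives the probability of each rating category from the person's ability, the item's difficulty, and a set of thresholds shared by all items. The threshold values, the 0-to-4 category scoring (an assumption that is consistent with 50 items and a 0–200 total), and the function names below are illustrative only; they are not the MTAP's calibration.

```python
import math

def rating_scale_probs(theta, delta, taus):
    """Category probabilities under the Andrich rating scale model.

    theta: person ability (logits); delta: item difficulty (logits);
    taus: thresholds tau_1..tau_m shared by every item.
    Returns a list of probabilities for categories 0..m.
    """
    # Running sums of (theta - delta - tau_j); category 0 has an empty sum.
    logits = [0.0]
    for tau in taus:
        logits.append(logits[-1] + (theta - delta - tau))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def expected_score(theta, delta, taus):
    """Expected rating (0..m) for one person on one item."""
    probs = rating_scale_probs(theta, delta, taus)
    return sum(k * p for k, p in enumerate(probs))

# Hypothetical thresholds: four thresholds give five ordered categories.
TAUS = [-1.5, -0.5, 0.5, 1.5]

# Same hypothetical person (theta = 1.0) on an easy and a hard item.
easy = expected_score(theta=1.0, delta=-2.0, taus=TAUS)
hard = expected_score(theta=1.0, delta=3.0, taus=TAUS)
```

Under these assumed thresholds, a person whose ability equals the item difficulty has an expected rating at the middle category, and the same person is expected to rate an easier task higher than a harder one.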

Results

Fig. 2. Sample multidimensional task ability profile item: "Change a light-bulb overhead."

Over a 12-month period, 196 subjects (66 male; 34%) completed the EPIC Lift Capacity test and retest, and the MTAP test. Demographic and basic test data are presented in Table 1. Separate one-way analyses of variance demonstrated that there was no significant difference between males and females in terms of age (F(1, 195) = 1.23, p = .27), but significant differences were found based on gender in terms of lift capacity (F(1, 194) = 299.81, p < .001) and MTAP score (F(1, 195) = 108.05, p < .001). These results are depicted in Table 2. The gender differences are consistent with previous research with these instruments; in both self-report and performance tests that involve physical strength, males typically score higher than females. The Pearson product-moment correlations between each EPIC Lift Capacity subtest and the EPIC lift capacity totals were calculated, with results in Table 3. All correlations are


Table 1
Subjects' demographic and basic test data

Variable                                Group    N     Mean    SD     Minimum   Maximum
Age (y)                                 Male     66    30.3    8.2    18        55
                                        Female   131   31.8    9.3    17        60
                                        Total    197   31.3    9.0    17        60
EPIC lift capacity total (0–272.7 kg)   Male     66    201.9   36.6   109.1     272.7
                                        Female   130   116.6   30.3   47.7      190.9
                                        Total    196   145.3   51.8   47.7      272.7
MTAP total (0–200)                      Male     66    197.2   4.4    175       200
                                        Female   131   179.8   13.3   122       200
                                        Total    197   185.6   13.8   122       200

SD, standard deviation; MTAP, multidimensional task ability profile.

significant at p < .001. Given the stability of the MTAP total scores, and in an attempt to minimize the time lag between the administration of the EPIC Lift Capacity test and the MTAP, the retest total was selected as the dependent variable for subsequent regression analyses, with gender, age, and MTAP total score as the predictor variables. The Pearson product-moment correlation between the EPIC Lift Capacity retest total and the MTAP total was r = 0.64, which is significant at p < .001.

The relative contributions of each variable were studied through the use of stepwise forward multiple regression procedures, with gender coded as a dummy variable. Results are presented in Table 4. In the stepwise forward regression procedure, the computer selects the order of entry of each predictor variable on an iterative basis in terms of the magnitude of each variable's nonredundant contribution to the dependent variable, until there is no longer a significant change in the F ratio. In the present case, all three predictor variables made significant contributions to the dependent variable. Gender was entered first, followed by the MTAP total score and age, in that order. The combination of the three predictor variables demonstrated a multiple correlation coefficient of R = 0.82 (R² = 0.673), which accounts for 67.3% of the variance in lift capacity.
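The forward stepwise procedure described above can be sketched in a few lines. This is an illustration under stated assumptions, not the SPSS routine the authors ran: the data are synthetic, the variable names are invented, and entry is decided by a fixed F-to-enter threshold (`f_to_enter`, a hypothetical parameter) rather than by the probability of F.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    return 1.0 - ss_res / ss_tot

def forward_stepwise(X, y, names, f_to_enter=4.0):
    """Add one predictor per step while the F change stays above f_to_enter."""
    n = len(y)
    selected, steps = [], []
    remaining = list(range(X.shape[1]))
    r2_current = 0.0
    while remaining:
        # Fit each candidate model and keep the one with the largest R^2.
        trials = [(r_squared(X[:, selected + [j]], y), j) for j in remaining]
        r2_new, best = max(trials)
        df2 = n - (len(selected) + 1) - 1  # residual df of the candidate model
        f_change = (r2_new - r2_current) / ((1.0 - r2_new) / df2)
        if f_change < f_to_enter:
            break
        selected.append(best)
        remaining.remove(best)
        steps.append((names[best], r2_new, f_change))
        r2_current = r2_new
    return steps

# Synthetic data only; the coefficients are invented for illustration.
rng = np.random.default_rng(0)
n = 196
male = rng.integers(0, 2, n).astype(float)   # gender as a 0/1 dummy variable
age = rng.uniform(18, 60, n)
mtap = 180 + 15 * male + rng.normal(0, 8, n)
lift = 115 + 80 * male + (mtap - 180) - 0.5 * (age - 30) + rng.normal(0, 20, n)

X = np.column_stack([male, mtap, age])
steps = forward_stepwise(X, lift, ["gender", "mtap_total", "age"])
for name, r2, f_change in steps:
    print(f"{name}: R^2 = {r2:.3f}, F change = {f_change:.2f}")
```

With one predictor added per step, the residual degrees of freedom fall by one each time (194, 193, 192 for n = 196), which is the pattern of df2 values in Table 4.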

Discussion

The present study demonstrated the validity of a performance-integrated self-report measure of ability in terms of a performance measure of lift capacity. The ability of a performance-integrated self-report to account for 67.3% of the variance in a physical ability test blurs the distinction between subjective and objective ability measures. As such, this result supports a bidimensional model of ability somewhat different from the model proposed earlier. At the center of the continuum depicted in Fig. 3, the self-report measure of ability shares a substantial proportion of the variance of the lift capacity test. It is likely that performance-integrated self-report measures of ability and performance measures of ability share a common source.

This result has many important potential consequences, one of which is to improve the reliability and validity of the AMA Guides process. Real-time self-report of ability, along with parallel rating by the physician that is designed to fit the impairment evaluation process for each diagnostic category, will improve the reliability of the process. Because the performance-integrated self-report items describe demonstrable function, immediate confirmation or disconfirmation of ability limitations can occur in the physician's office, improving reliability. Because reliability puts a technical ceiling on validity [39], improvement in

Table 2
Subjects' ANOVA gender comparisons

Variable                   Source    Sum of squares   df    Mean square   F        Significance
Age                        Between   98.52            1     98.52         1.23     0.27
                           Within    15640.22         195   80.21
                           Total     15738.74         196
EPIC lift capacity total   Between   1539843.87       1     1539843.87    299.81   0.00
                           Within    996387.76        194   5136.02
                           Total     2536231.63       195
MTAP total                 Between   13385.43         1     13385.43      108.05   0.00
                           Within    24157.26         195   123.88
                           Total     37542.68         196

ANOVA, analysis of variance; df, degrees of freedom; MTAP, multidimensional task ability profile.
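The F ratios in Table 2 follow from the sums of squares and degrees of freedom as F = (SS between / df between) / (SS within / df within). The short check below recomputes them from the tabled values; the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# (SS between, df between, SS within, df within, reported F), from Table 2.
ANOVA_ROWS = {
    "Age": (98.52, 1, 15640.22, 195, 1.23),
    "EPIC lift capacity total": (1539843.87, 1, 996387.76, 194, 299.81),
    "MTAP total": (13385.43, 1, 24157.26, 195, 108.05),
}

def f_ratio(ss_between, df_between, ss_within, df_within):
    """One-way ANOVA F: between-groups mean square over within-groups mean square."""
    return (ss_between / df_between) / (ss_within / df_within)

recomputed = {
    name: f_ratio(ss_b, df_b, ss_w, df_w)
    for name, (ss_b, df_b, ss_w, df_w, _) in ANOVA_ROWS.items()
}

# Share of lift capacity variance between genders: SS between / SS total.
eta_squared_lift = 1539843.87 / 2536231.63

for name, value in recomputed.items():
    print(f"{name}: F = {value:.2f} (reported {ANOVA_ROWS[name][4]})")
```

The between-groups share of the lift capacity sum of squares is the same quantity as the R square for gender alone in Table 4, since a one-way ANOVA on two groups is equivalent to a regression on a single dummy variable.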


validity will occur, based on the items' sampling of abilities in the patient's usual and customary tasks. An important consequence of these changes in the Guides process will be improved understanding and acceptance by the patient, with diminished litigation as a potential result. If litigation ensues, because the process included the patient's self-report that had been corroborated during the process by the physician, there are likely to be fewer opportunities for attorneys to successfully dispute the physician's opinion.

Another potential consequence of the shared variance of performance-integrated self-report ability measures and performance measures is the use of performance-integrated self-report data in intervention algorithms for evidence-based medicine so that the information from the self-report is formally integrated in the process. For example, the real-time use of performance-integrated self-report ability measures by the patient in parallel with the physician's performance-integrated self-report rating will identify mismatches that reflect misunderstanding by the patient or physician. The physician's resolution of the mismatch during the examination should sharpen the diagnosis and improve intervention and patient compliance. This is only a technical adjustment in current physician practice that relies on spoken self-report. The inclusion of a computerized adaptive testing process using a performance-integrated self-report in parallel by both patient and physician is the basis of a "smart system" using a form of artificial intelligence that will lead directly to intervention based on best-practice algorithms.

The potential to use a performance-integrated self-report to predict physical performance is attractive for several technical and procedural reasons. The self-report measurement process neither exposes the evaluee to risk of injury nor requires highly trained personnel and is therefore likely to be less expensive than performance measurement.

Table 3
Correlation of EPIC lift capacity test-retest maximum acceptable weight (kilograms)

          Test                      Retest
Subtest   N     Mean    SD         N     Mean    SD      r
1         197   25.4    9.5        197   25.8    9.5     0.94
2         197   29.0    9.7        197   29.1    9.9     0.93
3         197   24.9    9.9        196   25.5    9.5     0.94
4         197   20.8    9.0        197   21.1    8.4     0.93
5         197   23.4    9.0        196   23.8    8.7     0.91
6         197   19.4    8.9        196   19.7    8.5     0.92
Total     197   143.0   52.5       196   145.3   51.8    0.97

SD, standard deviation.

Although many performance-integrated self-report items are linked to the performance measure, other items that are cocalibrated but linked to tasks outside the testing environment will facilitate generalization beyond the boundaries of the performance evaluation equipment and procedures. Self-report measures are more sensitive to inconsistent patterns and lend themselves more readily to computerized model fitting than do performance measures, thereby improving reliability. Self-report data usually are collected more quickly than physical performance data and can be more easily verified, including consideration of patterns of responding that require multiple individual measures. Combined with screening for inconsistency through the use of the "infit" and "outfit" statistics that the IRT approach makes available, sensitivity to less than full effort responding is greatly improved.

These results also argue for the development of other performance-integrated self-report measures that are tied to highly reliable performance measures in domains of function beyond musculoskeletal, such as cardiac, pulmonary, and neurological function. These yoked measures would allow cross-validation because of the substantial variance that they are likely to share. For example, in the present study, the reliability of the lift capacity measure (r = 0.97) results in a very small standard error of measurement. Coupled with the large proportion of the lift capacity test's variance accounted for by the performance-integrated self-report score, the latter can be used to calculate a dependable range of expected lift capacity test performance. If a patient performs the lift capacity test outside the predicted range, the inherent relationship between the two tests allows the clinician to confidently challenge the discrepancy.
It can reasonably be expected that performance-integrated self-report measures linked to other domains of function would be similarly correlated, offering cross-validation throughout the Guides processes.
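The standard error of measurement and the expected range mentioned above are not computed in the text. One conventional way to obtain them, sketched here under classical test theory assumptions, uses SEM = SD × sqrt(1 − r) for measurement error and SEE ≈ SD × sqrt(1 − R²) for prediction error. The inputs come from Tables 1, 3, and 4; the 150 kg predicted value, the 1.96 multiplier, and the omission of a degrees-of-freedom correction are assumptions made for illustration.

```python
import math

SD_LIFT_TOTAL = 51.8   # kg, SD of the EPIC lift capacity total (Table 1)
RETEST_R = 0.97        # test-retest correlation of the total (Table 3)
R_SQUARED = 0.673      # variance explained by gender, MTAP total, and age (Table 4)

def standard_error_of_measurement(sd, reliability):
    """Classical test theory SEM: the spread expected from measurement error alone."""
    return sd * math.sqrt(1.0 - reliability)

def standard_error_of_estimate(sd, r_squared):
    """Approximate spread of observed scores around the regression prediction.

    Ignores the small degrees-of-freedom correction for simplicity.
    """
    return sd * math.sqrt(1.0 - r_squared)

def expected_range(predicted, see, z=1.96):
    """Approximate 95% band around a predicted score, assuming normal errors."""
    return predicted - z * see, predicted + z * see

sem = standard_error_of_measurement(SD_LIFT_TOTAL, RETEST_R)
see = standard_error_of_estimate(SD_LIFT_TOTAL, R_SQUARED)

# 150.0 kg is a hypothetical regression prediction, not a value from the study.
low, high = expected_range(150.0, see)
print(f"SEM = {sem:.1f} kg, SEE = {see:.1f} kg, range = {low:.1f} to {high:.1f} kg")
```

The two quantities answer different questions: the SEM describes how much a retest of the same person is expected to vary, whereas the SEE describes how far an observed lift total may fall from the value predicted by the regression.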

Table 4
Prediction of lift capacity: EPIC lift capacity retest total

                                                                Change statistics
Predictor variable(s)     R      R square   Adjusted R square   R square change   F change   df1   df2   Significant F change
Gender                    .779   .607       .605                .607              299.813    1     194   .000
Gender, MTAP total        .810   .656       .653                .049              27.590     1     193   .000
Gender, MTAP total, age   .820   .673       .668                .017              9.830      1     192   .002

MTAP, multidimensional task ability profile; df, degrees of freedom.
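The columns of Table 4 are linked by two standard formulas: adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), and F change = (ΔR² / df1) / ((1 − R²) / df2). As a rough consistency check, the sketch below recomputes both from the rounded R square values, taking n = 196 as implied by df2 = 194 at the first step. Because the inputs are rounded to three decimals, the recomputed values match the table only approximately.

```python
N_SUBJECTS = 196  # implied by df2 = 194 for the one-predictor model

# (number of predictors k, R square, reported adjusted R square, reported F change)
TABLE_4 = [
    (1, 0.607, 0.605, 299.813),
    (2, 0.656, 0.653, 27.590),
    (3, 0.673, 0.668, 9.830),
]

def adjusted_r_squared(r2, n, k):
    """Shrink R^2 for the number of predictors k in a sample of size n."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

def f_change(r2_new, r2_old, n, k_new, df1=1):
    """F statistic for the R^2 gained by adding df1 predictors."""
    df2 = n - k_new - 1
    return ((r2_new - r2_old) / df1) / ((1.0 - r2_new) / df2)

checks = []
r2_previous = 0.0
for k, r2, reported_adj, reported_f in TABLE_4:
    checks.append((
        k,
        adjusted_r_squared(r2, N_SUBJECTS, k),
        reported_adj,
        f_change(r2, r2_previous, N_SUBJECTS, k),
        reported_f,
    ))
    r2_previous = r2

for k, adj, reported_adj, f, reported_f in checks:
    print(f"k={k}: adjusted R^2 {adj:.3f} (reported {reported_adj}), "
          f"F change {f:.2f} (reported {reported_f})")
```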


Fig. 3. Bidimensional model of ability demonstrating shared variance.

Finally, because performance-integrated self-report measures can sample a broad range of human abilities, they have the potential to be more valid than performance measures as indicators of impairment that is relevant to participation in the real world, because each patient's ability profile is unique. Given the time pressure of current practice that militates against personalized care, performance-integrated self-report measures make such care more feasible because the library of items that can be selected by a computerized adaptive testing system is theoretically unlimited, including tasks that require abilities that are beyond the health-care professional's understanding. Rather than try to make every professional an expert in the demands of a patient's job or family, a computerized adaptive testing performance-integrated self-report measure can include relevant items that have been empirically calibrated. The performance-integrated self-report measure used in the present study is not only linked to the lift capacity performance measure; it is also linked to several scales that reflect a wide range of activities of daily living and job demands described by the US Department of Labor physical demand characteristics system [40]. In focus-group research with the MTAP, patients typically report that they appreciate the opportunity it provides to have their experience outside of the clinic accurately reflected.

Limitations of the present study

Both of the measures in the present study were developed for use with people with medical impairments. Thus, in the present study with healthy subjects, both measures demonstrated a "ceiling effect," depicted in Fig. 4. The Fig. 4 scatterplot presents each subject in the study based on EPIC Lift Capacity retest total and the MTAP total score. Inspection of the plot indicates that several subjects achieved maximum scores on both tests, 272.73 kg on the EPIC Lift Capacity test and a score of 200 on the MTAP. In IRT terms, perfect scores provide no useful information because the test is thought to be insufficiently matched to the evaluee. The ceiling effect is most apparent with the MTAP, suggesting that a new version of the measure with items that are more demanding will be more appropriate for studies with healthy subjects. Because the library of items from which this version of the MTAP was selected includes higher-demand items that have been calibrated with the existing items, their substitution will be straightforward. Replication of the present study with a revised instrument that includes items that are more demanding is anticipated.
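The remark that perfect scores provide no useful information in IRT terms can be illustrated with a simplified case. The sketch below uses the dichotomous Rasch model (a simplification of the rating scale model behind the MTAP) with hypothetical item difficulties, and searches a grid of ability values for the one that maximizes the likelihood of a response pattern. For a mixed pattern the maximum lies inside the grid; for a perfect pattern the likelihood keeps rising, so the best value is always the upper edge of whatever grid is chosen.

```python
import math

def rasch_log_likelihood(theta, difficulties, responses):
    """Log-likelihood of pass (1) / fail (0) responses under the dichotomous Rasch model."""
    total = 0.0
    for b, x in zip(difficulties, responses):
        p_pass = 1.0 / (1.0 + math.exp(-(theta - b)))
        total += math.log(p_pass if x == 1 else 1.0 - p_pass)
    return total

# Hypothetical item difficulties in logits; not MTAP calibrations.
DIFFICULTIES = [-2.0, -1.0, 0.0, 1.0, 2.0]
# Candidate abilities from -4.0 to 8.0 logits in steps of 0.5.
GRID = [g / 2.0 for g in range(-8, 17)]

def best_grid_theta(responses):
    """Grid value of theta with the highest log-likelihood for this pattern."""
    return max(GRID, key=lambda t: rasch_log_likelihood(t, DIFFICULTIES, responses))

mixed_peak = best_grid_theta([1, 1, 1, 0, 0])    # passes the three easiest items
perfect_peak = best_grid_theta([1, 1, 1, 1, 1])  # passes every item
print(f"mixed pattern peak: {mixed_peak}, perfect pattern peak: {perfect_peak}")
```

Adding harder items, as the authors propose, gives high-ability respondents items they may fail, which is what allows a finite ability estimate.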

Conclusions

A performance-integrated self-report measure, the MTAP, displayed good concurrent validity compared with actual physical performance as assessed by the EPIC Lift Capacity test. Modern performance-integrated self-report measures, such as the MTAP, have the potential to provide information about functional capacity and physical function that is sufficiently useful to confirm status and help guide treatment algorithms.

References

Fig. 4. Scatterplot of multidimensional task ability profile (MTAP) total scores and EPIC Lift Capacity test totals.

[1] Ware JJ, Sherbourne C. The MOS 36-item short-form health survey (SF-36). Med Care 1992;30:473–81.
[2] Cella D, Yount S, Rothrock N, et al. On behalf of the PCG. The Patient-Reported Outcomes Measurement Information System (PROMIS): progress of an NIH Roadmap cooperative group during its first two years. Med Care 2007;45:S3–S11.


[3] Resnik L, Liu D, Mor V, Hart DL. Predictors of physical therapy clinic performance in the treatment of patients with low back pain syndromes. Phys Ther 2008;88:989–1004.
[4] Cooper J, Kohlmann T, Michael J, et al. Health outcomes. New quality measure for Medicare. Int J Qual Health Care 2001;13:9–16.
[5] Resnik L, Hart D. Using clinical outcomes to identify expert physical therapists. Phys Ther 2003;83:990–1002.
[6] Resnik L, Feng Z, Hart D. State regulation and the delivery of physical therapy services. Health Serv Res 2006;41:1296–316.
[7] American Medical Association. Guides to the evaluation of permanent impairment. 6th ed. Chicago, IL: American Medical Association, 2008.
[8] Hudak PL, Amadio PC, Bombardier C. Development of an upper extremity outcome measure: the DASH. The Upper Extremity Collaborative Group (UECG). Am J Ind Med 1996;29:602–8.
[9] Beaton DE, Wright JG, Katz JN. Development of the QuickDASH: comparison of three item-reduction approaches. J Bone Joint Surg Am 2005;87:1038–46.
[10] Matheson LN, Melhorn JM, Mayer TG, et al. Reliability of a visual analog version of the QuickDASH. J Bone Joint Surg Am 2006;88:1782–7.
[11] Gatchel RJ, Mayer TG, Theodore BR. The pain disability questionnaire: relationship to one-year functional and psychosocial rehabilitation outcomes. J Occup Rehabil 2006;16:75–94.
[12] Anagnostis C, Gatchel RJ, Mayer TG. The pain disability questionnaire: a new psychometrically sound measure for chronic musculoskeletal disorders. Spine 2004;29:2290–302.
[13] Matheson L, Mooney V, Leggett S, et al. Multidimensional task ability profile. San Diego, CA: MindTrust, 2003.
[14] Mayer J, Mooney V, Matheson L, et al. The reliability and validity of a new computerized pictorial activity and task sort. J Occup Rehabil 2005;15:185–95.
[15] Eisler H. Subjective scale of force for a large muscle group. J Exp Psychol 1962;64:253–7.
[16] Legg S, Myles W. Maximum acceptable repetitive lifting workloads for an 8-hour work-day using psychophysical and subjective rating methods. Ergonomics 1981;24:907–16.
[17] Deyo RA. Measuring the functional status of patients with low back pain. Arch Phys Med Rehabil 1988;69:1044–53.
[18] Michel A, Kohlmann T, Raspe H. The association between clinical findings on physical examination and self-reported severity in back pain. Results of a population-based study. Spine 1997;22:296–303.
[19] Marras WS, Lewis KE, Ferguson SA, Parnianpour M. Impairment magnification during dynamic trunk motions. Spine 2000;25:587–95.
[20] Garg A, Saxena U. Effects of lifting frequency and technique on physical fatigue with special reference to psychophysical methodology and metabolic rate. Am Ind Hyg Assoc J 1979;49:894–903.
[21] Borg G. Psychophysical bases of perceived exertion. Med Sci Sports Exerc 1982;14:377–81.
[22] Snook S. Psychophysical considerations in permissible loads. Ergonomics 1985;28:327–30.

[23] Jiang B, Smith J, Ayoub M. Psychophysical modeling of manual materials-handling capacities using isoinertial strength variables. Hum Factors 1986;28:671–702. [24] Khalil T, Goldberg M, Asfour S, et al. Acceptable maximum effort. A psychophysical measure of strength in back pain patients. Spine 1987;12:372–6. [25] Mayer TG, Barnes D, Nichols G, et al. Progressive isoinertial lifting evaluation. II. A comparison with isokinetic lifting in a disabled chronic low-back pain industrial population. Spine 1988;3:998–1002. [26] Borg G. Psychophysical scaling with applications in physical work and the perception of exertion. Scand J Work Environ Health 1990;16:55–8. [27] Karwowski W, Yates J, Pongpatana N. Discriminability of load heaviness: Implications for the psychophysical approach to manual lifting. Ergonomics 1992;35:729–44. [28] Snook S. Psychophysical assessments of material handling efforts. In: Proceedings of the 12th Triennial Congress of the International Ergonomics Association. Hopkinton, MA: Human Factors Association of Canada, 1994:274–6. [29] Matheson L, Mooney V, Grant J, et al. A test to measure lift capacity of physically impaired adults. Part 1 Development and reliability testing. Spine 1995;20:2119–29. [30] Matheson L, Mooney V, Holmes D, et al. A test to measure lift capacity of physically impaired adults. Part 2 Reactivity in a patient sample. Spine 1995;20:2130–4. [31] Matheson L. EPIC Lift Capacity Test examiner’s manual. Fort Bragg, CA: Work Evaluation Systems Technology, 1994. [32] Brodman K, Erdmann A, Wolff H. Cornell Medical Index Health Questionnaire manual. New York, NY: Cornell University Medical College, 1949. [33] Matheson L. Relationships among age, body weight, resting heart rate, and performance in a new test of lift capacity. J Occup Rehabil 1996;6:225–37. [34] Jay M, Lamb J, Watson R, et al. Sensitivity and specificity of the indicators of sincere effort of the EPIC Lift Capacity test on a previously injured population. 
Spine 2000;25:1405–12. [35] Rasch G. Probabilistic models for some intelligence and the attainment tests. Copenhagen, Denmark: Danmarks Paedagogiske Institut, 1960. [36] Andrich D. Rasch models of measurement. Newbury Park, CA: Sage, 1988. [37] Matheson L, Mayer J, Mooney V, et al. A method to provide a more efficient and reliable measure of self-report physical work capacity for patients with spinal pain. J Occup Rehabil 2008;18:46–57. [38] Mayer J, Mooney V, Matheson L, et al. Continuous low-level heat wrap therapy for the prevention and treatment of delayed onset muscle soreness of the low back muscles. Arch Phys Med Rehabil 2006;87:1310–7. [39] Cronbach L. Dependability of behavioral measurements: theory of generalizability for scores and profiles. New York, NY: John Wiley & Sons, 1972. [40] U.S. Department of Labor. The revised handbook for analyzing jobs. Washington, DC: U.S. Department of Labor, 1991.