Challenging the norm: further psychometric investigation of the neck disability index


Accepted Manuscript

Challenging the Norm: Further Psychometric Investigation of the Neck Disability Index

Man Hung, PhD; Christine Cheng; Shirley D. Hon; Jeremy D. Franklin, MA; Brandon D. Lawrence, MD; Ashley Neese, BS; Chase B. Grover; Darrel S. Brodke, MD

PII: S1529-9430(14)00301-5

DOI: 10.1016/j.spinee.2014.03.027

Reference: SPINEE 55822

To appear in: The Spine Journal

Received Date: 11 July 2013

Revised Date: 12 February 2014

Accepted Date: 16 March 2014

Please cite this article as: Hung M, Cheng C, Hon SD, Franklin JD, Lawrence BD, Neese A, Grover CB, Brodke DS, Challenging the Norm: Further Psychometric Investigation of the Neck Disability Index, The Spine Journal (2014), doi: 10.1016/j.spinee.2014.03.027. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Challenging the Norm: Further Psychometric Investigation of the Neck Disability Index

Man Hung, PhD, Assistant Professor, University of Utah School of Medicine, Huntsman Cancer Institute

Christine Cheng, University of Utah School of Medicine

Shirley D. Hon, University of Utah College of Engineering

Jeremy D. Franklin, MA, University of Utah College of Education

Brandon D. Lawrence, MD, Assistant Professor, University of Utah School of Medicine

Ashley Neese, BS, University of Utah School of Medicine

Chase B. Grover, University of Utah School of Medicine

Darrel S. Brodke, MD, Professor, University of Utah School of Medicine

Corresponding Author:

Man Hung, PhD, Assistant Professor, Department of Orthopaedic Surgery Operations, University of Utah School of Medicine, 590 Wakara Way, Salt Lake City, UT 84108, USA. Email: [email protected] Phone: 801-587-5372 Fax: 801-587-5411

ABSTRACT

BACKGROUND CONTEXT The Neck Disability Index (NDI) was the first patient-reported outcome (PRO) instrument specific to patients with neck pain, and it remains one of the most widely used PROs for the neck population. The NDI is an appealing measure because it is short and well known. Currently, there are conflicting data on the performance and applicability of the NDI in patients undergoing either operative or non-operative treatment for neck-related conditions.

PURPOSE This study investigates the psychometric properties, performance, and applicability of the NDI in the spine patient population.

STUDY DESIGN A total of 865 patients visiting a university-based spine clinic with neck complaints, with or without radiating upper extremity pain, numbness, or weakness, were enrolled in the study. Visit types included new and follow-up visits for both operative and non-operative treatments. Questionnaires were administered electronically on a tablet computer, and all patients answered all 10 questions of the NDI.

METHODS Standard descriptive statistics were used to describe the demographic characteristics of the patients. Rasch modeling was applied to examine the psychometric properties of the NDI.

RESULTS The NDI demonstrated insufficient unidimensionality (i.e., unexplained variance after accounting for the first dimension = 9.4%). Person reliability was 0.85 and item reliability was 1.00 for the NDI. The overall item fit for the NDI was good, with an outfit mean square of 1.03. The NDI had a floor effect of 35.5% and a ceiling effect of 4.6%. The raw score to measure correlation of the NDI was 0.019.

CONCLUSIONS Although the NDI had good person reliability and item reliability, it did not demonstrate strong evidence of unidimensionality. The NDI exhibited a very large floor effect. Due to the poor raw score to measure correlation, the sum score should not be used in the interpretation of findings. Despite great investment by physicians and other stakeholders in the NDI, this evaluation and previous research have demonstrated that the NDI needs further investigation and refinement.

Keywords: NDI; spine; patient-reported outcomes; measurement; Rasch; orthopaedics.

INTRODUCTION

In 2006, nearly 22 million people in the United States sought treatment for spinal disorders (1). Physicians have stressed the importance of more robust measures to assess the condition of patients with spine disorders. Traditionally, clinicians have relied on technology and clinical measures to assess patients. Recently, the perspective of patients, or how they feel, has become a major interest of health care systems, organizations, and physicians themselves. As a result, physicians are taking patient-reported outcome (PRO) measures into account when assessing treatment success (2, 3). As in other fields, instrument use varies from physician to physician. Furthermore, it is not clear which measures most accurately evaluate treatment effects for patients with spinal disorders. To understand the comprehensive condition of their patients, physicians need to incorporate valid and reliable measures. In conjunction with clinical measures, the perceptions and perspectives of the patient need to be clearly understood in order to identify appropriate treatments.

The Neck Disability Index (NDI), also known as the Vernon Mior Disability Index, was first published in 1991 and comprises 10 items (see Appendix 1) (4, 5). As one of the first PRO instruments specific to patients with neck pain, it remains one of the most widely used PROs for patients with neck disorders and has previously been tested for validity and reliability (6, 7). It has been shown to have construct validity and reasonable test-retest reliability (6). It is also assumed to be a unidimensional scale (8, 9), although past research has disputed this (10). The NDI is appealing to physicians and patients because it is relatively easy and quick to administer. The NDI is used throughout the world and has been translated into 19 languages (11). Although it has been demonstrated to be valid, some research has raised questions about the performance and applicability of the NDI in patients undergoing either operative or non-operative treatment for their neck-related condition. Furthermore, studies investigating the psychometric properties of the NDI have found problems with ceiling effects, floor effects, and dimensionality (10-14). A systematic review of published studies on the NDI found evidence of contrasting measurement properties (11). While the majority of research examining the NDI has used classical test theory (5, 8, 11), this study uses contemporary testing techniques, such as Rasch analysis, to further investigate the psychometric properties, performance, and applicability of the NDI in the cervical spine patient population.

METHODS

Data Collection

A total of 865 patients visiting a university-based spine clinic from June 2011 to May 2013 were asked to complete the NDI, a demographic questionnaire, and some outcome questions. Patient visits were due to primary neck complaints, with or without radiating upper extremity pain, numbness, or weakness. Both new and follow-up patients with operative and non-operative treatments were included in the final analysis. Questionnaires were administered electronically on a tablet computer (iPad™, Apple Inc., Cupertino, CA) prior to seeing the physician. The response rate was 100%, as completing the NDI is a standard-of-care measurement in this clinic. Patients under 18 years old and non-English speakers were excluded.

This was a self-funded study and none of the participants received any compensation. Informed consent was not necessary since responding to the NDI was part of standard care in the clinic. However, Institutional Review Board approval was obtained prior to data analysis.

Analytic approach

Descriptive statistics were computed to examine the characteristics of the participants. Next, a Rasch item response theory (IRT) model was used to assess the psychometric properties of the NDI, including fit, dimensionality, reliability, coverage, and raw score to measure correlation. The Rasch Partial Credit Model (PCM) for polytomous data was fitted using Winsteps software version 3.80.0 (15). Winsteps implements the PCM with the Joint Maximum Likelihood Estimation method, also known as Unconditional Maximum Likelihood Estimation, which estimates the item difficulties and person abilities simultaneously and does not assume any person distribution (16-18). The PCM in Winsteps models each item with its own structure using the following formulation: $\log\left(P_{nij} / P_{ni(j-1)}\right) = B_n - D_i - F_{ij}$, where $P_{nij}$ is the probability that person $n$ responding to item $i$ is observed in response category $j$, $B_n$ is the ability measure of person $n$, $D_i$ is the difficulty measure of item $i$, and $F_{ij}$ is the calibration measure of category $j$ relative to response category $j-1$ for item $i$ (15). The Rasch model assumes that item difficulty is the main characteristic affecting person responses. Additionally, it produces person and item estimates along a logit scale, which represents an equal-interval scale.

The Rasch model represents an attractive approach for constructing measurement instruments. Specifically, it enables both the items and the persons to be measured on the same metric, allowing for meaningful comparison of scores. It also enables “transformation on the item and person data to convert the ordinal data to yield interval data” (19, p. 29). Prior to employing the Rasch model to evaluate measurement properties, fit of the data to the model must be established (20, 21). For details on how a Rasch model works and the technical terms defined by Rasch methodology, readers are encouraged to review the references (10, 20-24).
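The partial credit formulation above can be turned into category probabilities by accumulating the adjacent-category log-odds. The following is a minimal sketch of that arithmetic only, not a reproduction of the Winsteps estimation procedure; the function name and the threshold values are hypothetical and chosen purely for illustration.

```python
import math


def pcm_category_probabilities(ability, difficulty, thresholds):
    """Return P(category j), j = 0..m, under the partial credit model.

    ability    -- person measure B_n in logits
    difficulty -- item measure D_i in logits
    thresholds -- [F_i1, ..., F_im], category calibrations for item i
    """
    # Category j has log-numerator sum_{k<=j} (B_n - D_i - F_ik);
    # category 0 corresponds to the empty sum, which is 0.
    log_numerators = [0.0]
    for step in thresholds:
        log_numerators.append(log_numerators[-1] + ability - difficulty - step)
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_numerators)
    weights = [math.exp(value - peak) for value in log_numerators]
    total = sum(weights)
    return [weight / total for weight in weights]


# Hypothetical six-category item (responses 0-5), as in the NDI items.
example_thresholds = [-2.0, -1.0, 0.0, 1.0, 2.0]
probs = pcm_category_probabilities(0.5, 0.0, example_thresholds)
print([round(p, 3) for p in probs])
```

By construction, the log of the ratio of any two adjacent category probabilities returned here equals the person measure minus the item difficulty minus the corresponding category calibration.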

Fit

To evaluate the psychometric properties of the NDI, we investigated the fit of the data to a Rasch model. To indicate whether the data fit the Rasch model, we reported the outfit Mean Square (MNSQ) statistic. We did not report infit MNSQ since infit MNSQ, as opposed to outfit MNSQ, does not take the entire range of persons and items into account and has been shown to be sample size dependent (25). Fit is important because, without adequate fit, the Rasch model we derive may not be applicable to the data. Data are considered to fit the model if the outfit MNSQ is less than 2 (20, 21, 26, 27). Although an MNSQ between 1.5 and 2.0 generally implies that the item is unproductive for measurement construction, it does not distort measurement.
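As an illustration of the outfit statistic described above, the sketch below computes an item's outfit mean square as the unweighted average of squared standardized residuals under a partial credit model. It assumes person measures and item parameters are already known; all names and example values are hypothetical, and the calculation is a simplified stand-in rather than the exact Winsteps routine.

```python
import math


def category_probabilities(ability, difficulty, thresholds):
    """Partial credit model probabilities for categories 0..m."""
    logits = [0.0]
    for step in thresholds:
        logits.append(logits[-1] + ability - difficulty - step)
    peak = max(logits)
    weights = [math.exp(value - peak) for value in logits]
    total = sum(weights)
    return [weight / total for weight in weights]


def outfit_mnsq(responses, abilities, difficulty, thresholds):
    """Mean of squared standardized residuals for one item."""
    z_squared = []
    for observed, ability in zip(responses, abilities):
        probs = category_probabilities(ability, difficulty, thresholds)
        expected = sum(j * p for j, p in enumerate(probs))
        variance = sum((j - expected) ** 2 * p for j, p in enumerate(probs))
        z_squared.append((observed - expected) ** 2 / variance)
    return sum(z_squared) / len(z_squared)


def fits_model(mnsq, cutoff=2.0):
    """Apply the 'outfit MNSQ less than 2' criterion stated in the text."""
    return mnsq < cutoff


# Hypothetical responses (0-5) and person measures in logits.
responses = [0, 1, 3, 2, 5]
abilities = [-1.5, -0.5, 0.5, 0.0, 2.5]
mnsq = outfit_mnsq(responses, abilities, 0.0, [-2.0, -1.0, 0.0, 1.0, 2.0])
print(round(mnsq, 2), fits_model(mnsq))
```

Values near 1 indicate that the observed residual variation matches what the model predicts, which is why a mean outfit close to 1 is read as good fit.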

Dimensionality

A scale is considered to be unidimensional when it measures one underlying construct or type of phenomenon, such as neck disability. If an instrument measures more than one underlying construct (e.g., physical function, depression, attitude toward physician), then it is multidimensional. As a result, a single summary score can be difficult to interpret because it is unclear how to attribute proportions of the score to each of the multiple constructs. To evaluate the unidimensionality of the NDI, a Rasch IRT model was used (23). We evaluated the interrelationship of items within the NDI. If unexplained (residual) variance or noise is 5% or less after taking account of the first dimension, the NDI is considered to represent a single underlying concept (21). Unlike variance decomposition in classical test theory, the variance components in the Rasch model consist of those explained by the items as well as those explained by the persons. The total unexplained (residual) variance includes Rasch-predicted randomness and unexplained variance from the first dimension, second dimension, third dimension, etc.

Reliability

Additionally, we evaluated the internal consistency reliability of the NDI. In doing so, we investigated person and item reliabilities. Person reliability is a measure of how reproducible the ordering of patients is across items of the instrument. Item reliability is a measure of how reproducible item ordering is across persons. Internal consistency is expressed as a coefficient r between 0 and 1, where 0 indicates the least consistency and 1 indicates the most consistency (28).
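The manuscript does not spell out how person and item reliability are computed. One common Rasch definition, assumed here for illustration, is the ratio of "true" variance (observed variance of the measures minus their mean squared standard error) to observed variance. The sketch below follows that assumption; the function name, the use of population variance, and the example values are hypothetical.

```python
from statistics import pvariance


def separation_reliability(measures, standard_errors):
    """Reliability = (observed variance - error variance) / observed variance.

    measures        -- person (or item) measures in logits
    standard_errors -- standard error of each measure, same order
    """
    observed_variance = pvariance(measures)
    if observed_variance == 0:
        raise ValueError("measures have no variance; reliability is undefined")
    # Error variance is the mean of the squared standard errors.
    error_variance = sum(se ** 2 for se in standard_errors) / len(standard_errors)
    # True variance cannot be negative, so clamp at zero.
    true_variance = max(observed_variance - error_variance, 0.0)
    return true_variance / observed_variance


# Hypothetical person measures with a common standard error of 0.5 logits.
reliability = separation_reliability([-2.0, -1.0, 0.0, 1.0, 2.0], [0.5] * 5)
print(reliability)
```

Under this definition, reliability approaches 1 when the measures are widely spread relative to their measurement error, which is consistent with the 0 to 1 range described above.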

Coverage

The coverage of an instrument is important when assessing its usefulness because an instrument should be applicable to all patients in the population of interest, who have varying degrees of the target condition. If it does not have good coverage, it may not be able to distinguish among a portion of the patient population (29). If an instrument can measure the entire patient population of interest, then it is said to have very good coverage (26). Ceiling effects and floor effects can be used to assess instrument coverage. If an instrument is intended to measure disability but is not able to differentiate between patients with severe disability, then it is said to have a ceiling effect; that is, many persons will score at the highest possible value. If patients are high functioning but the instrument is not sensitive enough to capture differences between their low levels of disability, then the instrument is said to have a floor effect. If the entire patient population is well targeted by the set of questions, with minimal ceiling and floor effects, then the instrument is considered to have excellent coverage.
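A simple way to quantify floor and ceiling effects is the share of respondents at the lowest and highest possible scores. The sketch below uses that raw-score definition as an assumption; the manuscript may define these effects differently (for example, in terms of person measures relative to the item distribution). The function name and example scores are hypothetical.

```python
def floor_ceiling_effects(raw_scores, min_score=0, max_score=50):
    """Return (floor %, ceiling %) for a list of summed raw scores.

    The defaults reflect 10 items scored 0-5, i.e. a 0-50 sum score.
    """
    if not raw_scores:
        raise ValueError("raw_scores must not be empty")
    n = len(raw_scores)
    # Count respondents sitting exactly at each extreme of the scale.
    floor_pct = 100.0 * sum(score == min_score for score in raw_scores) / n
    ceiling_pct = 100.0 * sum(score == max_score for score in raw_scores) / n
    return floor_pct, ceiling_pct


# Hypothetical summed scores for eight respondents.
scores = [0, 0, 10, 25, 50, 0, 30, 12]
floor_pct, ceiling_pct = floor_ceiling_effects(scores)
print(floor_pct, ceiling_pct)
```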

Raw Score to Measure Correlation

Deriving a summary score by summing the values of all the items may not yield an interval scale score, that is, one that can be used meaningfully in arithmetic operations. Without this property, parametric statistical analyses (mean, standard deviation, t-test, ANOVA) are inappropriate. One major advantage of Rasch modeling is that the Rasch model produces a summary of the items that is an interval scale score. If the correlation between the raw score and the Rasch-derived measure is very high, it would mean that the instrument’s raw score behaves similarly to an interval scale score and it would be appropriate to perform standard parametric analyses with the raw score. On the other hand, it would be inappropriate to use the raw score if the correlation is very low, as it would signify that the raw score has little or no resemblance to an interval scale score.

The raw score for the NDI is the total score (i.e., the sum of the ordinal response scores, with a maximum of 50, multiplied by 2 to get a percentage). The correlation between the raw score and the measure should be high (e.g., r = 0.7 or higher) for the raw scores to be meaningful and useful in parametric statistical analyses.
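The raw-score calculation and the correlation check described above can be sketched as follows. This is an illustrative example only: the function names, the cutoff helper, and the paired raw scores and measures are hypothetical, and Pearson's r is assumed as the correlation coefficient.

```python
import math


def ndi_percent_score(item_responses):
    """Sum ten item responses (each 0-5) and multiply by 2 to get a percentage."""
    if len(item_responses) != 10 or any(not 0 <= r <= 5 for r in item_responses):
        raise ValueError("expected 10 responses, each between 0 and 5")
    return 2 * sum(item_responses)


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


def raw_score_is_usable(r, cutoff=0.7):
    """Apply the 'r = 0.7 or higher' guideline stated in the text."""
    return r >= cutoff


# Hypothetical raw percentage scores paired with Rasch measures (logits).
raw_scores = [0, 10, 24, 40, 62, 90]
measures = [-3.1, -1.8, -0.6, 0.2, 1.1, 2.9]
r = pearson_r(raw_scores, measures)
print(round(r, 3), raw_score_is_usable(r))
```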

RESULTS

Demographics

A total of 865 patients made up the final sample, of whom 57.5% were male (Table 1). The majority of participants were white (94%) and the mean age was 55 years (range, 15 to 92 years). Among return-visit (i.e., follow-up) patients, about 49% stated that their pain/symptoms were somewhat better or markedly better since their last visit. Close to half of the respondents (46%) stated that they had two or more problems (e.g., balance, numbness, weakness). Almost 35% of the respondents had tried multiple treatment methods to treat their pain. About 71% of the participants stated that the primary reason for their visit that day was neck problems. When asked about the percentage of their pain located in their back and/or neck and the percentage located in their leg and/or arm, 5% of patients reported only arm pain and 16% reported only neck pain. About 63% reported a combination of back and/or neck pain and leg and/or arm pain. About 13% of the participants stated that their symptoms had bothered them for 1 to 3 months, and about 23% reported that their symptoms had bothered them for more than 24 months.

Item level statistics

The NDI consists of 10 items, with response categories ranging from 0 (least disabled) to 5 (most disabled). Category 0 was selected most often (27.6% on average across all items, which represents a strong floor effect) and category 5 was selected least often (3.8% on average across all items). The point biserial correlations of the items were all positive (ranging from 0.53 to 0.75), indicating that a high score on an individual item was related to a high score for the entire instrument. The outfit MNSQ for the items ranged from 0.68 to 1.70, with a mean of 1.03 (std. dev. = 0.28).

Fit

The overall item fit for the NDI was very good, with a mean outfit MNSQ of 1.03 indicating that the data fit the Rasch model very well. Due to the close fit of the data to the model, it was appropriate to move on to evaluating the properties of the NDI.

Dimensionality

The NDI demonstrated poor unidimensionality, with unexplained variance in the first dimension of 9.4%. Items 2 and 5 showed substantial misfit to the model, with fit residual values of 4.55 and 4.67, respectively. Considering that the NDI is supposed to measure one condition, we expected a lower percentage of unexplained variance. The unexplained variance is high, which suggests that the NDI may not be unidimensional in measuring neck disability. Nonetheless, this departure from unidimensionality was not large enough (i.e., less than 10% unexplained variance) to prevent further assessment and interpretation of the person and item estimates using Rasch analysis (30).

Reliability

The person reliability for the NDI was 0.85. This reliability suggests that a similar ordering of patients’ functioning levels would occur in repeated studies. The Rasch analysis indicated that item reliability for the NDI was 1.00. An item reliability this high suggests that the order of item difficulty would be similar regardless of patient population or neck disability ailment.

Coverage

Figure 1 shows a person item histogram. The scores shown in Figure 1 are on an interval scale, not an ordinal scale. The top panel is the distribution of the patients’ measures and the bottom panel is the distribution of the items’ measures (i.e., item difficulties). The horizontal axis represents the measures (in z score scale), where the left side indicates higher disability levels and the right side indicates lower disability levels. The vertical axis on the top panel represents the number of patients and on the bottom panel represents the number of items. In the figure, there is a large region to the left of the person distribution that is not aligned with (or targeted by) the item distribution.

The NDI had a floor effect of 35.5% and a ceiling effect of 4.6%. The floor effect of the NDI is extremely high and the ceiling effect is acceptable. This implies that the items of the NDI did not target patients with lower levels of neck disability.

Raw Score to Measure Correlation

Figure 2 displays a scatter plot of the NDI’s raw scores on the x-axis and the Rasch-derived measures on the y-axis. As demonstrated by the flattened fitted line across all the measurement points, there is little to no correlation between the raw scores and the measures of the NDI. The raw score to measure correlation of the NDI was 0.019. This correlation is extremely low, implying that the sum score (raw score) is not interval scaled.

DISCUSSION

Prior to conducting this study, we were aware that the NDI was one of the most widely used instruments to assess neck pain. A majority of the research we reviewed on the NDI found that it was a reliable and valid instrument, but there are reported problems with ceiling effects, floor effects, and lack of support for unidimensionality (10). Given the conflicting nature of the current literature and the ubiquitous use of the NDI, we felt the NDI required expanded investigation. Our findings confirm previous critiques of the NDI, in that there are serious floor effects and the instrument lacks strong unidimensionality.

Although the NDI exhibited good person and item reliabilities, it demonstrated marginal ceiling effects and very large floor effects. In this study, ceiling effects were 4.6% and floor effects were 35.5%, rates that are generally considered unacceptable. While this was not totally unexpected given previous findings, it was still surprising considering that the NDI is so commonly used to assess patients with varying levels of disability. As a result, some patients may not be properly assessed with the NDI. The high floor effect indicates that the NDI is a very poor instrument for measuring patients who are moderate to high functioning and may not be sensitive to change in patients’ functioning over time.

The raw score to measure correlation was poor, indicating that summing the raw scores is not acceptable or meaningful. The commonly used NDI raw score is not linear and should likely not be used in the interpretation of findings. Proper transformation of the NDI raw score into a linear measure is needed prior to statistical analysis and interpretation, since most statistical procedures (e.g., t-test, ANOVA, Pearson correlation) assume that the scores being analyzed are linear measures.

This analysis confirms previous research and adds to the concerns about NDI usage as a PRO measure. Unfortunately, the NDI does not carry with it a clear interpretation of what a score means. The NDI does not exhibit strong unidimensionality, has serious limitations with regard to its floor effects, and has an unacceptable raw score to measure correlation.

Limitations and Future Research

This study exhibits some threats to external validity in that its findings are not necessarily generalizable. The sample consisted of participants who actively chose to visit a university-based spine clinic. Therefore, this study has a very targeted sample that has the means and time to visit a clinic. Furthermore, the study took place in only one clinic. The majority of the sample is White and male. As a result, the findings may not be generalizable to other racial/ethnic groups. Future research needs to incorporate a more diverse sample. Finally, future research might look into modifications of the NDI or identify other neck disability instruments that perform better and can be incorporated into standard-of-care measurement. This would allow for more accurate and precise measurement of neck disability and thus provide better care for those suffering from neck disorders.

REFERENCES

1. Martin BI, Turner JA, Mirza SK, et al. Trends in health care expenditures, utilization, and health status among US adults with spine problems, 1997-2006. Spine. 2009;34(19):2077-84.

2. Godil SS, Parker SL, Zuckerman SL, et al. Determining the quality and effectiveness of surgical spine care: patient satisfaction is not a valid proxy. The spine journal : official journal of the North American Spine Society. 2013.

3. McCormick JD, Werner BC, Shimer AL. Patient-reported outcome measures in spine surgery. The Journal of the American Academy of Orthopaedic Surgeons. 2013;21(2):99-107.

4. Vernon H. The Neck Disability Index: state-of-the-art, 1991-2008. Journal of manipulative and physiological therapeutics. 2008;31(7):491-502.

5. Vernon H, Mior S. The Neck Disability Index: a study of reliability and validity. Journal of manipulative and physiological therapeutics. 1991;14(7):409-15.

6. Young IA, Cleland JA, Michener LA, Brown C. Reliability, construct validity, and responsiveness of the neck disability index, patient-specific functional scale, and numeric pain rating scale in patients with cervical radiculopathy. American journal of physical medicine & rehabilitation / Association of Academic Physiatrists. 2010;89(10):831-9.

7. Gay RE, Madson TJ, Cieslak KR. Comparison of the Neck Disability Index and the Neck Bournemouth Questionnaire in a sample of patients with chronic uncomplicated neck pain. Journal of manipulative and physiological therapeutics. 2007;30(4):259-62.

8. Cleland JA, Fritz JM, Whitman JM, Palmer JA. The reliability and construct validity of the Neck Disability Index and patient specific functional scale in patients with cervical radiculopathy. Spine. 2006;31(5):598-602.

9. Pool JJ, Ostelo RW, Hoving JL, et al. Minimal clinically important change of the Neck Disability Index and the Numerical Rating Scale for patients with neck pain. Spine. 2007;32(26):3047-51.

10. van der Velde G, Beaton D, Hogg-Johnston S, et al. Rasch analysis provides new insights into the measurement properties of the neck disability index. Arthritis and rheumatism. 2009;61(4):544-51.

11. MacDermid JC, Walton DM, Avery S, et al. Measurement properties of the neck disability index: a systematic review. The Journal of orthopaedic and sports physical therapy. 2009;39(5):400-17.

12. Hains F, Waalen J, Mior S. Psychometric properties of the neck disability index. Journal of manipulative and physiological therapeutics. 1998;21(2):75-80.

13. Riddle DL, Stratford PW. Use of generic versus region-specific functional status measures on patients with cervical spine disorders. Physical therapy. 1998;78(9):951-63.

14. Ailliet L, Knol DL, Rubinstein SM, et al. Definition of the construct to be measured is a prerequisite for the assessment of validity. The Neck Disability Index as an example. Journal of Clinical Epidemiology. 2013;66(7):775-82.e2.

15. Linacre J. A User’s Guide to Winsteps Rasch-Model Computer Programs. Chicago, IL: MESA Press; 2013.

16. Wilson MR, Masters GN. The partial credit model and null categories. Psychometrika. 1993;58(1):87-99.

17. Wright BD, Masters GN. Rating Scale Analysis: Rasch Measurement. Chicago: MESA Press; 1982.

18. Wright BD, Panchapakesan N. A procedure for sample-free item analysis. Educational and Psychological Measurement. 1969;29:23–48.

19. Bond TG, Fox CM. Applying the Rasch model: Fundamental measurement in the human sciences. Mahwah, NJ, USA: Lawrence Erlbaum; 2001.

20. Bond TG, Fox CM. Applying the Rasch model: Fundamental measurement in the human sciences: Psychology Press; 2013.

21. Wright BD, Masters GN. Rating Scale Analysis. Rasch Measurement: ERIC; 1982.

22. De Vet HCW, Terwee CB, Mokkink LB, Knol DL. Measurement in Medicine. New York: Cambridge University Press; 2011.

23. Linacre JM. Understanding Rasch Measurement: Optimizing Rating Scale Category Effectiveness. Journal of Applied Measurement. 2002;3(1):85-106.

24. Streiner DL, Norman GR. Health Measurement Scales: a practical guide to their development and use. Fourth ed. New York: Oxford University Press; 2008.

25. Smith RM, Schumacker RE, Bush MJ. Using item mean squares to evaluate fit to the Rasch model. Journal of Outcome Measurement. 1998;2(1):66-78.

26. Hung M, Nickisch F, Beals TC, et al. New paradigm for patient-reported outcomes assessment in foot & ankle research: computerized adaptive testing. Foot & ankle international / American Orthopaedic Foot and Ankle Society [and] Swiss Foot and Ankle Society. 2012;33(8):621-6.

27. Hung M, Carter M, Hayden C, et al. Psychometric assessment of the patient activation measure short form (PAM-13) in rural settings. Quality of Life Research. 2013;22(3):521-9.

28. Davidshofer K, Murphy C. Psychological Testing: Principles and Applications. 6th ed. Upper Saddle River, NJ: Pearson/Prentice Hall; 2005.

29. Hung M, Baumhauer JF, Latt LD, et al. Validation of PROMIS Physical Function Computerized Adaptive Tests for Orthopaedic Foot and Ankle Outcome Research. Clinical orthopaedics and related research. 2013;471(11):3466-74.

30. Smith EV Jr. Detecting and evaluating the impact of multidimensionality using item fit statistics and principal component analysis of residuals. Journal of Applied Measurement. 2002;3:205–231.

FIGURE LEGEND

Table 1. Demographic characteristics (N=865).

Figure 1. Person item histogram.

Figure 2. NDI raw score to measure correlation.

Appendix 1. Neck Disability Index.

Table 1. Demographic characteristics (N=865).

Variables                                        Min     Max     Mean (SD)
Age                                              15.4    91.6    55.2 (15.9)

Variables                                        n       Percent
Gender
  Male                                           497     57.5
  Female                                         368     42.5
Race
  White or Caucasian                             784     94.0
  Black or African American                      4       0.5
  American Indian and Alaska Native              10      1.2
  Native Hawaiian and Other Pacific Islander     9       1.1
  Asian                                          7       0.8
  Other                                          20      2.4
  Missing                                        31
Symptoms duration
  Less than 1 month                              103     14.7
  1 – 3 months                                   143     20.4
  3 – 6 months                                   104     14.8
  6 – 24 months                                  150     21.4
  More than 24 months                            202     28.8
  Missing                                        163
Treatment since last visit
  Nothing                                        160     23.1
  Surgery                                        39      5.6
  Physical therapy                               66      9.5
  Medications                                    166     23.9
  Injections                                     20      2.9
  Two or more treatment methods                  243     35.0
  Missing                                        171


Appendix 1. Neck Disability Index.

Each section lists an everyday activity followed by its possible responses.

Section 1. Pain Intensity
0. I have no pain at the moment.
1. The pain is very mild at the moment.
2. The pain is moderate at the moment.
3. The pain is fairly severe at the moment.
4. The pain is very severe at the moment.
5. The pain is the worst imaginable at the moment.

Section 2. Personal Care (Washing, dressing, etc.)
0. I can look after myself normally without causing extra pain.
1. I can look after myself normally but it causes me extra pain.
2. It is painful to look after myself and I am slow and careful.
3. I need help but manage most of my personal care.
4. I need help every day in most aspects of self-care.
5. I do not get dressed, wash with difficulty and stay in bed.

Section 3. Lifting
0. I can lift heavy weights without extra pain.
1. I can lift heavy weights but it gives extra pain.
2. Pain prevents me from lifting heavy weights off the floor, but I can manage if they are conveniently positioned, e.g. on a table.
3. Pain prevents me from lifting heavy weights off the floor, but I can manage light to medium weights if they are conveniently positioned.
4. I can lift only very light weights.
5. I cannot lift or carry anything at all.

Section 4. Reading
0. I can read as much as I want to with no pain in my neck.
1. I can read as much as I want to with slight pain in my neck.
2. I can read as much as I want with moderate pain in my neck.
3. I can’t read as much as I want because of moderate pain in my neck.
4. I can hardly read at all because of severe pain in my neck.
5. I cannot read at all.

Section 5. Headache
0. I have no headache at all.
1. I have slight headaches, which come infrequently.
2. I have moderate headaches, which come infrequently.
3. I have moderate headaches, which come frequently.
4. I have severe headaches, which come frequently.
5. I have headaches almost all the time.

Section 6. Concentration
0. I can concentrate fully when I want to with no difficulty.
1. I can concentrate fully when I want to with slight difficulty.
2. I have a fair degree of difficulty in concentrating when I want to.
3. I have a lot of difficulty in concentrating when I want to.
4. I have a great deal of difficulty in concentrating when I want to.
5. I cannot concentrate at all.

Section 7. Work
0. I can do as much as I want.
1. I can only do my usual work but no more.
2. I can do most of my usual work, but no more.
3. I cannot do my usual work.
4. I can hardly do any work at all.
5. I can’t do any work at all.

Section 8. Driving
0. I can drive my car without any neck pain.
1. I can drive my car as long as I want with slight pain in my neck.
2. I can drive my car as long as I want with moderate pain in my neck.
3. I can’t drive my car as long as I want because of moderate pain in my neck.
4. I can hardly drive at all because of severe pain in my neck.
5. I can’t drive my car at all.

Section 9. Sleeping
0. I have no trouble sleeping.
1. My sleep is slightly disturbed (less than 1 hour sleep loss).
2. My sleep is mildly disturbed (1-2 hours sleep loss).
3. My sleep is moderately disturbed (2-3 hours sleep loss).
4. My sleep is greatly disturbed (3-5 hours sleep loss).
5. My sleep is completely disturbed (5-7 hours sleep loss).

Section 10. Recreation
0. I am able to engage in all my recreational activities with no neck pain at all.
1. I am able to engage in all my recreational activities with some pain in my neck.
2. I am able to engage in most but not all of my usual recreational activities because of pain in my neck.
3. I am able to engage in a few of my usual recreational activities because of pain in my neck.
4. I can hardly do any recreational activities because of pain in my neck.
5. I can’t do any recreational activities at all.