Accepted Manuscript
Challenging the Norm: Further Psychometric Investigation of the Neck Disability Index
Man Hung, PhD; Christine Cheng; Shirley D. Hon; Jeremy D. Franklin, MA; Brandon D. Lawrence, MD; Ashley Neese, BS; Chase B. Grover; Darrel S. Brodke, MD
PII: S1529-9430(14)00301-5
DOI: 10.1016/j.spinee.2014.03.027
Reference: SPINEE 55822
To appear in: The Spine Journal
Received Date: 11 July 2013
Revised Date: 12 February 2014
Accepted Date: 16 March 2014
Please cite this article as: Hung M, Cheng C, Hon SD, Franklin JD, Lawrence BD, Neese A, Grover CB, Brodke DS, Challenging the Norm: Further Psychometric Investigation of the Neck Disability Index, The Spine Journal (2014), doi: 10.1016/j.spinee.2014.03.027. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Man Hung, PhD, Assistant Professor, University of Utah School of Medicine, Huntsman Cancer Institute
Christine Cheng, University of Utah School of Medicine
Shirley D. Hon, University of Utah College of Engineering
Jeremy D. Franklin, MA, University of Utah College of Education
Brandon D. Lawrence, MD, Assistant Professor, University of Utah School of Medicine
Ashley Neese, BS, University of Utah School of Medicine
Chase B. Grover, University of Utah School of Medicine
Darrel S. Brodke, MD, Professor, University of Utah School of Medicine
Corresponding Author:
Man Hung, PhD, Assistant Professor, Department of Orthopaedic Surgery Operations, University of Utah School of Medicine, 590 Wakara Way, Salt Lake City, UT 84108, USA
Email: [email protected]
Phone: 801-587-5372
Fax: 801-587-5411
Challenging the Norm: Further Psychometric Investigation of the Neck Disability Index
ABSTRACT
BACKGROUND CONTEXT The Neck Disability Index (NDI) was the first patient-reported outcome (PRO) instrument specific to patients with neck pain, and it remains one of the most widely used PROs for the neck population. The NDI is an appealing measure because it is short and well known. Currently, there are conflicting data on the performance and applicability of the NDI in patients undergoing either operative or non-operative treatment for neck-related conditions.
PURPOSE This study investigates the psychometric properties, performance, and applicability of the NDI in the spine patient population.
STUDY DESIGN A total of 865 patients visiting a university-based spine clinic with neck complaints, with or without radiating upper extremity pain, numbness, or weakness, were enrolled in the study. Visit types included new and follow-up visits for both operative and non-operative treatments. Questionnaires were administered electronically on a tablet computer, and all patients answered all 10 questions of the NDI.
METHODS Standard descriptive statistics were performed to describe the demographic characteristics of the patients. Rasch modeling was applied to examine the psychometric properties of the NDI.
RESULTS The NDI demonstrated insufficient unidimensionality (i.e., unexplained variance after accounting for the first dimension = 9.4%). Person reliability was 0.85 and item reliability was 1.00 for the NDI. The overall item fit for the NDI was good, with an outfit mean square of 1.03. The NDI had a floor effect of 35.5% and a ceiling effect of 4.6%. The raw score to measure correlation of the NDI was 0.019.
CONCLUSIONS Although the NDI had good person reliability and item reliability, it did not demonstrate strong evidence of unidimensionality. The NDI exhibited a very large floor effect. Because of the poor raw score to measure correlation, the sum score should not be used in the interpretation of findings. Despite great investment by physicians and other stakeholders in the NDI, this evaluation and previous research have demonstrated that the NDI needs further investigation and refinement.
Keywords: NDI; spine; patient-reported outcomes; measurement; Rasch; orthopaedics.
INTRODUCTION
In 2006, nearly 22 million people in the United States sought treatment for spinal disorders (1). Physicians have stressed the importance of more robust measures to assess the condition of patients with spine disorders. Traditionally, clinicians have relied on technology and clinical measures to assess patients. Recently, the perspective of patients, or how they feel, has become a major interest of health care systems, organizations, and physicians themselves. As a result, physicians are taking patient-reported outcome (PRO) measures into account when assessing treatment success (2, 3). As in other fields, instrument use varies from physician to physician. Furthermore, it is not clear which measures most accurately evaluate treatment effects for patients with spinal disorders. To understand the comprehensive condition of their patients, physicians need to incorporate valid and reliable measures. In conjunction with clinical measures, the perceptions and perspectives of the patient need to be clearly understood in order to identify appropriate treatments.
The Neck Disability Index (NDI), also known as the Vernon Mior Disability Index, was first published in 1991 and comprises 10 items (see Appendix 1) (4, 5). As one of the first PRO instruments specific to patients with neck pain, it remains one of the most widely used PROs for patients with neck disorders and has previously been tested for validity and reliability (6, 7). It has been shown to have construct validity and reasonable test-retest reliability (6). It is also assumed to be a unidimensional scale (8, 9), although past research has disputed this (10). The NDI is appealing to physicians and patients because it is relatively easy and quick to administer. The NDI is used throughout the world and has been translated into 19 languages (11). Although it has been demonstrated to be valid, some research has raised questions about the performance and applicability of the NDI in patients undergoing either operative or non-operative treatment for their neck-related condition. Furthermore, studies that have investigated the psychometric properties of the NDI have found problems with ceiling effects, floor effects, and dimensionality (10-14). A systematic review of published studies on the NDI found evidence of contrasting measurement properties (11). While the majority of research examining the NDI has used classical test theory (5, 8, 11), this study uses contemporary testing techniques, such as Rasch analysis, to further investigate the psychometric properties, performance, and applicability of the NDI in the cervical spine patient population.
METHODS
Data Collection
A total of 865 patients visiting a university-based spine clinic from June 2011 to May 2013 were asked to complete the NDI, a demographic questionnaire, and some outcome questions. Patient visits were due to primary neck complaints, with or without radiating upper extremity pain, numbness, or weakness. Both new and follow-up patients with operative and non-operative treatments were included in the final analysis. Questionnaires were administered electronically on a tablet computer (iPad™, Apple Inc., Cupertino, CA) prior to seeing the physician. The response rate was 100% because completing the NDI is a standard-of-care measurement in this clinic. Patients under 18 years old and non-English speakers were excluded.
This was a self-funded study and none of the participants received any compensation. Informed consent was not necessary since responding to the NDI was part of standard care in the clinic. However, Institutional Review Board approval was obtained prior to data analysis.
Analytic approach
Descriptive statistics were computed to examine the characteristics of the participants. Next, a Rasch item response theory (IRT) model was used to assess the psychometric properties of the NDI, including fit, dimensionality, reliability, coverage, and raw score to measure correlation. The Rasch Partial Credit Model (PCM) for polytomous data was fitted using the Winsteps software, version 3.80.0 (15). Winsteps implements the PCM with the Joint Maximum Likelihood Estimation method, also known as Unconditional Maximum Likelihood Estimation, which estimates the item difficulties and person abilities simultaneously and does not assume any person distribution (16-18). The PCM in Winsteps models each item with its own structure using the following formulation:
$\log\left(P_{nij} / P_{ni(j-1)}\right) = B_n - D_i - F_{ij}$,
where $P_{nij}$ is the probability that person $n$ responding to item $i$ is observed in response category $j$, $B_n$ is the ability measure of person $n$, $D_i$ is the difficulty measure of item $i$, and $F_{ij}$ is the calibration measure of category $j$ relative to category $j-1$ for item $i$ (15). The Rasch model assumes that item difficulty is the main characteristic affecting person responses. Additionally, it produces person and item estimates along a logit scale, which represents an interval scale with equal units.
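The formulation above can be turned into category probabilities by accumulating the adjacent-category log-odds and normalizing. The following is a minimal sketch under stated assumptions, not the Winsteps implementation: the function name `pcm_category_probabilities` and the example values for $B_n$, $D_i$, and $F_{ij}$ are hypothetical and are not estimates from this study.

```python
import math

def pcm_category_probabilities(ability, difficulty, thresholds):
    """Return P(category j), j = 0..m, under the partial credit model.

    ability    -- B_n, person measure in logits
    difficulty -- D_i, item measure in logits
    thresholds -- [F_i1, ..., F_im], step calibrations for this item
    """
    # Category 0 has an empty sum (0.0); category j accumulates
    # (B_n - D_i - F_ik) for k = 1..j.
    log_numerators = [0.0]
    for step in thresholds:
        log_numerators.append(log_numerators[-1] + ability - difficulty - step)
    # Shift by the maximum before exponentiating for numerical stability.
    peak = max(log_numerators)
    weights = [math.exp(value - peak) for value in log_numerators]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical person, item, and step values (six categories, as in the NDI).
probs = pcm_category_probabilities(0.5, 0.0, [-1.0, -0.5, 0.0, 0.5, 1.0])
```

By construction, the log of the ratio of any two adjacent category probabilities returned here equals $B_n - D_i - F_{ij}$, which is the relationship stated in the text.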
The Rasch model represents an attractive approach for constructing instruments for measurement. Specifically, it enables both the items and the persons to be measured on the same metric, allowing for meaningful comparison of scores. It also enables "transformation on the item and person data to convert the ordinal data to yield interval data" (19, p. 29). Prior to employing the Rasch model to evaluate measurement properties, fit of the data to the model must be established (20, 21). For details on how a Rasch model works and on technical terms as defined by Rasch methodology, readers are encouraged to review the references (10, 20-24).
Fit
To evaluate the psychometric properties of the NDI, we investigated the fit of the data to a Rasch model. To indicate whether the data fit the Rasch model, we reported the outfit Mean Square (MNSQ) statistic. We did not report infit MNSQ because infit MNSQ, as opposed to outfit MNSQ, does not take the entire range of persons and items into account and has been shown to be sample size dependent (25). Fit is important because, without adequate fit, the Rasch model we derive may not be applicable to the data. Data are considered to fit the model if the outfit MNSQ is less than 2 (20, 21, 26, 27). Although an MNSQ between 1.5 and 2.0 generally indicates a fit that is unproductive for measurement construction, it does not distort measurement.
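The outfit MNSQ is conventionally the mean of the squared standardized residuals. The sketch below illustrates that conventional definition for a single item; it is an assumption that this matches the software's internal computation in every detail, and the function names and the toy data are hypothetical.

```python
def expected_and_variance(category_probs):
    """Model-expected score and variance for one person-item encounter."""
    expected = sum(k * p for k, p in enumerate(category_probs))
    variance = sum((k - expected) ** 2 * p for k, p in enumerate(category_probs))
    return expected, variance

def outfit_mnsq(observed, probs_per_person):
    """Mean of squared standardized residuals for one item across persons."""
    z_squared = []
    for score, category_probs in zip(observed, probs_per_person):
        expected, variance = expected_and_variance(category_probs)
        z_squared.append((score - expected) ** 2 / variance)
    return sum(z_squared) / len(z_squared)

def fits_model(mnsq, cutoff=2.0):
    """Apply the cutoff stated in the text: outfit MNSQ below 2."""
    return mnsq < cutoff

# Toy dichotomous example: every person has a 50/50 chance of scoring 0 or 1.
toy_mnsq = outfit_mnsq([0, 1, 1, 0], [[0.5, 0.5]] * 4)
```

In the toy example each squared residual is 0.25 and each model variance is 0.25, so the statistic equals 1.0, the value expected when data match the model.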
Dimensionality
A scale is considered to be unidimensional when it measures one underlying construct or type of phenomenon, such as neck disability. If an instrument measures more than one underlying construct (e.g., physical function, depression, attitude toward physician), then it is multidimensional. In that case, a single summary score can be difficult to interpret because it is unclear how to attribute proportions of the score to each of the multiple constructs. To evaluate the unidimensionality of the NDI, a Rasch IRT model was used (23). We evaluated the interrelationship of items within the NDI. If unexplained (residual) variance or noise is 5% or less after taking account of the first dimension, the NDI is considered to represent a single underlying concept (21). Unlike variance decomposition in classical test theory, the variance components in the Rasch model consist of those explained by the items as well as those explained by the persons. The total unexplained (residual) variance includes Rasch-predicted randomness and unexplained variance from the first dimension, second dimension, third dimension, etc.
Reliability
Additionally, we evaluated the internal consistency reliability of the NDI. In doing so, we investigated person and item reliabilities. Person reliability is a measure of how reproducible the ordering of patients is across items of the instrument. Item reliability is a measure of how reproducible item ordering is across persons. Internal consistency is written as an r coefficient between 0 and 1, where 0 indicates the least consistency and 1 indicates the most consistency (28).
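The text does not give a formula for these reliabilities. A common Rasch formulation, assumed here, is the separation reliability: the share of observed variance in the measures that is not attributable to measurement error. The sketch below uses the population variance and hypothetical measures and standard errors; the software's exact variance convention may differ.

```python
from statistics import pvariance

def separation_reliability(measures, standard_errors):
    """(observed variance - mean error variance) / observed variance."""
    observed_var = pvariance(measures)
    error_var = sum(se ** 2 for se in standard_errors) / len(standard_errors)
    # Error variance cannot exceed observed variance in a meaningful estimate.
    true_var = max(observed_var - error_var, 0.0)
    return true_var / observed_var

# Hypothetical logit measures with their standard errors.
example = separation_reliability([-1.0, 1.0], [0.5, 0.5])
```

The same function can be applied to person measures (person reliability) or to item measures (item reliability), which is why a large sample tends to push item reliability toward 1.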
Coverage
The coverage of an instrument is important when assessing its usefulness because instruments should be applicable to all patients in the population of interest, who have varying degrees of the target condition. If an instrument does not have good coverage, it may not be able to distinguish among a portion of the patient population (29). If an instrument can measure the entire patient population of interest, then it is said to have very good coverage (26). Ceiling effects and floor effects can be used to assess instrument coverage. If an instrument is intended to measure disability but is not able to differentiate between patients with severe disability, then it is said to have a ceiling effect; that is, many persons will score at the highest possible value. If patients are high functioning but the instrument is not sensitive enough to distinguish among their low levels of disability, then the instrument is said to have a floor effect. If the entire patient population is well targeted by the set of questions, with minimal ceiling and floor effects, then the instrument is considered to have excellent coverage.
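One common way to operationalize these definitions is the percentage of respondents at the lowest and highest possible scores. The sketch below assumes that operationalization and the 0 to 50 raw NDI range; the text does not state how its floor and ceiling percentages were derived, so the two may not coincide. The function name and the toy scores are hypothetical.

```python
def floor_ceiling_percent(total_scores, min_score=0, max_score=50):
    """Percent of respondents at the minimum and maximum possible score."""
    n = len(total_scores)
    floor = 100.0 * sum(score == min_score for score in total_scores) / n
    ceiling = 100.0 * sum(score == max_score for score in total_scores) / n
    return floor, ceiling

# Toy data: two respondents report no disability, one reports the maximum.
floor_pct, ceiling_pct = floor_ceiling_percent([0, 0, 10, 50])
```

Because higher NDI scores indicate more disability, the floor here corresponds to the least disabled (highest functioning) respondents.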
Raw Score to Measure Correlation
Deriving a summary score by summing the values of all the items may not yield an interval scale score, that is, one that can be used meaningfully in arithmetic operations. Without this property, parametric statistical analyses (mean, standard deviation, t-test, ANOVA) are inappropriate. One major advantage of Rasch modeling is that the Rasch model produces a summary of the items that is an interval scale score. If the correlation between the raw score and the Rasch-derived measure is very high, the instrument's raw score behaves similarly to an interval scale score, and it would be appropriate to perform standard parametric analyses with the raw score. On the other hand, it would be inappropriate to use the raw score if the correlation is very low, as that would signify that the raw score has little or no resemblance to an interval scale score.
The raw score for the NDI is the total score (i.e., the sum of the ordinal item responses, with a maximum of 50, multiplied by 2 to yield a percentage). The correlation between the raw score and the measure should be high (e.g., r = 0.7 or higher) for the raw scores to be meaningful and useful in parametric statistical analyses.
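The scoring rule and the correlation check described above can be sketched as follows. This is an illustration only: the function names are hypothetical, the Pearson coefficient is computed by hand to keep the example self-contained, and the inputs are toy values rather than study data.

```python
import math

def ndi_percent(item_responses):
    """Sum ten item responses (each 0-5) and multiply by 2 to get a percentage."""
    if len(item_responses) != 10 or any(r not in range(6) for r in item_responses):
        raise ValueError("expected ten responses, each an integer from 0 to 5")
    return sum(item_responses) * 2

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

def raw_score_usable(r, threshold=0.7):
    """Apply the guideline in the text: r of 0.7 or higher."""
    return r >= threshold

score = ndi_percent([0, 1, 2, 3, 4, 5, 0, 1, 2, 3])
```

In practice, `pearson_r` would be given each patient's raw percentage and the corresponding Rasch-derived measure in logits.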
RESULTS
Demographics
A total of 865 patients made up the final sample, of whom 57.5% were male (Table 1). The majority of participants were white (94%) and the mean age was 55 years (ranging from 15 to 92 years). For return visit (i.e., follow-up visit) patients, about 49% stated that their pain/symptoms were somewhat better or markedly better since their last visit. Close to half of the respondents (46%) stated that they had two or more problems (e.g., balance, numbness, weakness). Almost 35% of the respondents had tried multiple treatment methods to treat their pain. About 71% of the participants stated that the primary reason for their visit that day was neck problems. When asked what percentage of their pain was in the back and/or neck and what percentage was in the leg and/or arm, 5% of patients reported only arm pain and 16% reported only neck pain. About 63% reported a combination of back and/or neck pain and leg and/or arm pain. About 13% of the participants stated that their symptoms had bothered them for 1 to 3 months and about 23% reported that their symptoms had bothered them for more than 24 months.
Item level statistics
The NDI consists of a total of 10 items with response categories ranging from 0 (least disabled) to 5 (most disabled). Category 0 was selected most often (27.6% on average across all items, which represents a strong floor effect) and category 5 was selected least often (3.8% on average across all items). The average point biserial correlations of the items were all positive (ranging from 0.53 to 0.75), indicating that a high score on an individual item was related to a high score for the entire instrument. The outfit MNSQ for the items ranged from 0.68 to 1.70 with a mean of 1.03 (std. dev. = 0.28).
Fit
The overall item fit for the NDI was very good, with a mean outfit MNSQ of 1.03, indicating that the data fit the Rasch model very well. Due to the close fit of the data to the model, it was appropriate to move on to evaluating the properties of the NDI.
Dimensionality
The NDI demonstrated poor unidimensionality, with unexplained variance after accounting for the first dimension of 9.4%. Items 2 and 5 misfit the model considerably, with fit residual values of 4.55 and 4.67, respectively. Considering that the NDI is supposed to measure one condition, we expected a lower percentage of unexplained variance. The unexplained variance is high, which suggests that the NDI may not be unidimensional in measuring neck disability. Nonetheless, because the unexplained variance was less than 10%, this departure from unidimensionality was not large enough to prevent further assessment and interpretation of the person and item estimates using Rasch analysis (30).
Reliability
The person reliability for the NDI was 0.85. This reliability suggests that similar ordering of patients' functioning levels would occur in repeated studies. The Rasch analysis indicated that item reliability for the NDI was 1.00. An item reliability this high suggests that the order of item difficulty would be similar regardless of patient population or neck disability ailment.
Coverage
Figure 1 shows a person item histogram. The scores shown in Figure 1 are on an interval scale, not an ordinal scale. The top panel is the distribution of the patients' measures and the bottom panel is the distribution of the item measures (i.e., item difficulties). The horizontal axis represents the measures (on a z-score scale), where the left side indicates higher disability levels and the right side indicates lower disability levels. The vertical axis of the top panel represents the number of patients and that of the bottom panel represents the number of items. In the figure, there is a large region to the left of the person distribution that is not aligned with (or targeted by) the item distribution.
The NDI had a floor effect of 35.5% and a ceiling effect of 4.6%. The floor effect of the NDI is extremely high and the ceiling effect is acceptable. This implies that the items of the NDI did not target patients with lower levels of neck disability.
Raw Score to Measure Correlation
Figure 2 displays a scatter plot of the NDI's raw scores on the x-axis and the Rasch-derived measures on the y-axis. As demonstrated by the flattened fitted line across all the measurement points, there is little to no correlation between the raw scores and the measures of the NDI. The raw score to measure correlation of the NDI was 0.019. This correlation is extremely low, implying that the sum score (raw score) is not interval scaled.
DISCUSSION
Prior to conducting this study, we were aware that the NDI was one of the most widely used instruments to assess neck pain. A majority of the research we reviewed on the NDI found that it was a reliable and valid instrument, but there are reported problems with ceiling effects, floor effects, and lack of support for unidimensionality (10). Given the conflicting nature of the current literature and the ubiquitous use of the NDI, we felt the NDI required expanded investigation. Our findings confirm previous critiques of the NDI, in that there are serious floor effects and the instrument lacks strong unidimensionality.
Although the NDI exhibited good person and item reliabilities, it demonstrated marginal ceiling effects and very poor floor effects. In this study, the ceiling effect was 4.6% and the floor effect was 35.5%; a floor effect of this size is generally considered unacceptable. While this was not totally unexpected given previous findings, it was still surprising considering that the NDI is so commonly used to assess patients with varying levels of disability. As a result, some patients may not be properly assessed with the NDI. The high floor effect indicates that the NDI is a very poor instrument for measuring patients who are moderate to high functioning and that it may not be sensitive to change in patients' functioning over time.
The raw score to measure correlation was poor, indicating that summing the item responses into a raw score is not acceptable or meaningful. The commonly used NDI raw score is not linear and should likely not be used in the interpretation of findings. Proper transformation of the NDI raw score into a linear measure is needed prior to statistical analysis and interpretation, since most statistical procedures (e.g., t-test, ANOVA, Pearson correlation) assume that the scores being analyzed are linear measures.
This analysis confirms previous research and adds to the concerns about use of the NDI as a PRO measure. Unfortunately, the NDI does not carry with it a clear interpretation of what a score means. The NDI does not exhibit strong unidimensionality, has serious limitations with regard to its floor effects, and has an unacceptable raw score to measure correlation.
Limitations and Future Research
This study has some threats to external validity, in that its findings may not be generalizable. The sample consisted of participants who actively chose to visit a university-based spine clinic. Therefore, this study has a very targeted sample that has the means and time to visit a clinic. Furthermore, the study took place in only one clinic. The majority of the sample is White and male (Table 1). As a result, the findings may not be generalizable to other racial/ethnic groups. Future research needs to incorporate a more diverse sample. Finally, future research might look into modifications of the NDI or identify other neck disability instruments that perform better and can be incorporated into standard-of-care measurement. This would allow for more accurate and precise measurement of neck disability and thus provide better care for those suffering from neck disorders.
REFERENCES
1. Martin BI, Turner JA, Mirza SK, et al. Trends in health care expenditures, utilization, and health status among US adults with spine problems, 1997-2006. Spine. 2009;34(19):2077-84.
2. Godil SS, Parker SL, Zuckerman SL, et al. Determining the quality and effectiveness of surgical spine care: patient satisfaction is not a valid proxy. The spine journal : official journal of the North American Spine Society. 2013.
3. McCormick JD, Werner BC, Shimer AL. Patient-reported outcome measures in spine surgery. The Journal of the American Academy of Orthopaedic Surgeons. 2013;21(2):99-107.
4. Vernon H. The Neck Disability Index: state-of-the-art, 1991-2008. Journal of manipulative and physiological therapeutics. 2008;31(7):491-502.
5. Vernon H, Mior S. The Neck Disability Index: a study of reliability and validity. Journal of manipulative and physiological therapeutics. 1991;14(7):409-15.
6. Young IA, Cleland JA, Michener LA, Brown C. Reliability, construct validity, and responsiveness of the neck disability index, patient-specific functional scale, and numeric pain rating scale in patients with cervical radiculopathy. American journal of physical medicine & rehabilitation / Association of Academic Physiatrists. 2010;89(10):831-9.
7. Gay RE, Madson TJ, Cieslak KR. Comparison of the Neck Disability Index and the Neck Bournemouth Questionnaire in a sample of patients with chronic uncomplicated neck pain. Journal of manipulative and physiological therapeutics. 2007;30(4):259-62.
8. Cleland JA, Fritz JM, Whitman JM, Palmer JA. The reliability and construct validity of the Neck Disability Index and patient specific functional scale in patients with cervical radiculopathy. Spine. 2006;31(5):598-602.
9. Pool JJ, Ostelo RW, Hoving JL, et al. Minimal clinically important change of the Neck Disability Index and the Numerical Rating Scale for patients with neck pain. Spine. 2007;32(26):3047-51.
10. van der Velde G, Beaton D, Hogg-Johnson S, et al. Rasch analysis provides new insights into the measurement properties of the neck disability index. Arthritis and rheumatism. 2009;61(4):544-51.
11. MacDermid JC, Walton DM, Avery S, et al. Measurement properties of the neck disability index: a systematic review. The Journal of orthopaedic and sports physical therapy. 2009;39(5):400-17.
12. Hains F, Waalen J, Mior S. Psychometric properties of the neck disability index. Journal of manipulative and physiological therapeutics. 1998;21(2):75-80.
13. Riddle DL, Stratford PW. Use of generic versus region-specific functional status measures on patients with cervical spine disorders. Physical therapy. 1998;78(9):951-63.
14. Ailliet L, Knol DL, Rubinstein SM, et al. Definition of the construct to be measured is a prerequisite for the assessment of validity. The Neck Disability Index as an example. Journal of Clinical Epidemiology. 2013;66(7):775-82.e2.
15. Linacre J. A User's Guide to Winsteps Rasch-Model Computer Programs. Chicago, IL: MESA Press; 2013.
16. Wilson MR, Masters GN. The partial credit model and null categories. Psychometrika. 1993;58(1):87-99.
17. Wright BD, Masters GN. Rating Scale Analysis: Rasch Measurement. Chicago: MESA Press; 1982.
18. Wright BD, Panchapakesan N. A procedure for sample-free item analysis. Educational and Psychological Measurement. 1969;29:23-48.
19. Bond TG, Fox CM. Applying the Rasch model: Fundamental measurement in the human sciences. Mahwah, NJ, USA: Lawrence Erlbaum; 2001.
20. Bond TG, Fox CM. Applying the Rasch model: Fundamental measurement in the human sciences: Psychology Press; 2013.
21. Wright BD, Masters GN. Rating Scale Analysis. Rasch Measurement: ERIC; 1982.
22. De Vet HCW, Terwee CB, Mokkink LB, Knol DL. Measurement in Medicine. New York: Cambridge University Press; 2011.
23. Linacre JM. Understanding Rasch Measurement: Optimizing Rating Scale Category Effectiveness. Journal of Applied Measurement. 2002;3(1):85-106.
24. Streiner DL, Norman GR. Health Measurement Scales: a practical guide to their development and use. Fourth ed. New York: Oxford University Press; 2008.
25. Smith RM, Schumacker RE, Bush MJ. Using item mean squares to evaluate fit to the Rasch model. Journal of Outcome Measurement. 1998;2(1):66-78.
26. Hung M, Nickisch F, Beals TC, et al. New paradigm for patient-reported outcomes assessment in foot & ankle research: computerized adaptive testing. Foot & ankle international / American Orthopaedic Foot and Ankle Society [and] Swiss Foot and Ankle Society. 2012;33(8):621-6.
27. Hung M, Carter M, Hayden C, et al. Psychometric assessment of the patient activation measure short form (PAM-13) in rural settings. Quality of Life Research. 2013;22(3):521-9.
28. Murphy KR, Davidshofer CO. Psychological Testing: Principles and Applications. 6th ed. Upper Saddle River, NJ: Pearson/Prentice Hall; 2005.
29. Hung M, Baumhauer JF, Latt LD, et al. Validation of PROMIS Physical Function Computerized Adaptive Tests for Orthopaedic Foot and Ankle Outcome Research. Clinical orthopaedics and related research. 2013;471(11):3466-74.
30. Smith EV Jr. Detecting and evaluating the impact of multidimensionality using item fit statistics and principal component analysis of residuals. Journal of Applied Measurement. 2002;3:205-231.
FIGURE LEGEND
Table 1. Demographic characteristics (N=865).
Figure 1. Person item histogram.
Figure 2. NDI raw score to measure correlation.
Appendix 1. Neck Disability Index.
Table 1. Demographic characteristics (N=865).
Variable                                          Min     Max     Mean (SD)
Age                                               15.4    91.6    55.2 (15.9)

Variable                                          n       Percent
Gender
  Male                                            497     57.5
  Female                                          368     42.5
Race
  White or Caucasian                              784     94.0
  Black or African American                       4       0.5
  American Indian and Alaska Native               10      1.2
  Native Hawaiian and Other Pacific Islander      9       1.1
  Asian                                           7       0.8
  Other                                           20      2.4
  Missing                                         31
Symptoms duration
  Less than 1 month                               103     14.7
  1 – 3 months                                    143     20.4
  3 – 6 months                                    104     14.8
  6 – 24 months                                   150     21.4
  More than 24 months                             202     28.8
  Missing                                         163
Treatment since last visit
  Nothing                                         160     23.1
  Surgery                                         39      5.6
  Physical therapy                                66      9.5
  Medications                                     166     23.9
  Injections                                      20      2.9
  Two or more treatment methods                   243     35.0
  Missing                                         171
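The percentages in Table 1 for race, symptoms duration, and treatment since last visit are consistent with a denominator that excludes missing responses (for example, 784 of 865 - 31 = 834 respondents is 94.0%). The short sketch below reproduces that arithmetic; the function name `valid_percent` is hypothetical, and the denominator rule is inferred from the table values rather than stated in the text.

```python
def valid_percent(count, total_n, missing):
    """Percent of non-missing responses, rounded to one decimal place."""
    return round(100.0 * count / (total_n - missing), 1)

white = valid_percent(784, 865, 31)             # race: 31 missing
under_one_month = valid_percent(103, 865, 163)  # symptoms duration: 163 missing
no_treatment = valid_percent(160, 865, 171)     # treatment: 171 missing
```

Gender has no missing responses, so its percentages use the full sample of 865.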
Appendix 1. Neck Disability Index.
Each section lists the everyday activity and its possible responses.

Section 1. Pain Intensity
0. I have no pain at the moment.
1. The pain is very mild at the moment.
2. The pain is moderate at the moment.
3. The pain is fairly severe at the moment.
4. The pain is very severe at the moment.
5. The pain is the worst imaginable at the moment.

Section 2. Personal Care (Washing, dressing, etc.)
0. I can look after myself normally without causing extra pain.
1. I can look after myself normally but it causes me extra pain.
2. It is painful to look after myself and I am slow and careful.
3. I need help but manage most of my personal care.
4. I need help every day in most aspects of self-care.
5. I do not get dressed, wash with difficulty and stay in bed.

Section 3. Lifting
0. I can lift heavy weights without extra pain.
1. I can lift heavy weights but it gives extra pain.
2. Pain prevents me from lifting heavy weights off the floor, but I can manage if they are conveniently positioned, e.g. on a table.
3. Pain prevents me from lifting heavy weights off the floor, but I can manage light to medium weights if they are conveniently positioned.
4. I can lift only very light weights.
5. I cannot lift or carry anything at all.

Section 4. Reading
0. I can read as much as I want to with no pain in my neck.
1. I can read as much as I want to with slight pain in my neck.
2. I can read as much as I want with moderate pain in my neck.
3. I can't read as much as I want because of moderate pain in my neck.
4. I can hardly read at all because of severe pain in my neck.
5. I cannot read at all.

Section 5. Headache
0. I have no headache at all.
1. I have slight headaches, which come infrequently.
2. I have moderate headaches, which come infrequently.
3. I have moderate headaches, which come frequently.
4. I have severe headaches, which come frequently.
5. I have headaches almost all the time.

Section 6. Concentration
0. I can concentrate fully when I want to with no difficulty.
1. I can concentrate fully when I want to with slight difficulty.
2. I have a fair degree of difficulty in concentrating when I want to.
3. I have a lot of difficulty in concentrating when I want to.
4. I have a great deal of difficulty in concentrating when I want to.
5. I cannot concentrate at all.

Section 7. Work
0. I can do as much as I want.
1. I can only do my usual work but no more.
2. I can do most of my usual work, but no more.
3. I cannot do my usual work.
4. I can hardly do any work at all.
5. I can't do any work at all.

Section 8. Driving
0. I can drive my car without any neck pain.
1. I can drive my car as long as I want with slight pain in my neck.
2. I can drive my car as long as I want with moderate pain in my neck.
3. I can't drive my car as long as I want because of moderate pain in my neck.
4. I can hardly drive at all because of severe pain in my neck.
5. I can't drive my car at all.

Section 9. Sleeping
0. I have no trouble sleeping.
1. My sleep is slightly disturbed (less than 1 hour sleep loss).
2. My sleep is mildly disturbed (1-2 hours sleep loss).
3. My sleep is moderately disturbed (2-3 hours sleep loss).
4. My sleep is greatly disturbed (3-5 hours sleep loss).
5. My sleep is completely disturbed (5-7 hours sleep loss).

Section 10. Recreation
0. I am able to engage in all my recreational activities with no neck pain at all.
1. I am able to engage in all my recreational activities with some pain in my neck.
2. I am able to engage in most but not all of my usual recreational activities because of pain in my neck.
3. I am able to engage in a few of my usual recreational activities because of pain in my neck.
4. I can hardly do any recreational activities because of pain in my neck.
5. I can't do any recreational activities at all.