Journal of Clinical Neuroscience 22 (2015) 346–351
Journal of Clinical Neuroscience journal homepage: www.elsevier.com/locate/jocn
Clinical Study
A novel classification system of lumbar disc degeneration
Ron I. Riesenburger ⇑, Mina G. Safain, Richard Ogbuji, Jackson Hayes, Steven W. Hwang
Department of Neurosurgery, Tufts Medical Center, 800 Washington Street #178, Proger 7, Boston, MA 02111, USA
Tufts University School of Medicine, Boston, MA, USA
Article info
Article history: Received 18 March 2014; Accepted 25 May 2014
Keywords: Back pain; Classification; Degenerative disc; Disc degeneration
Abstract
The Pfirrmann and modified Pfirrmann grading systems are currently used to classify lumbar disc degeneration. These systems, however, do not incorporate variables that have been associated with lumbar disc degeneration, including Modic changes, a high intensity zone, and a significant reduction in disc height. A system that incorporates these variables and is easy to apply may be useful for research and clinical purposes. A grading system was developed that incorporates disc structure and brightness, presence or absence of Modic changes, presence or absence of a high intensity zone, and reduction in disc height (disc height less than 5 mm). MRI of 300 lumbar discs in 60 patients were analyzed twice by two neurosurgeons. Intra- and inter-observer reliabilities were assessed by calculating Cohen’s κ values. There were 156 grade zero (“normal”), 50 grade one, 57 grade two, 26 grade three, 10 grade four, and one grade five (“worst”) discs. Inter-observer reliability was substantial (κ = 0.66 to 0.77) for disc brightness/structure, Modic changes, and disc height. Inter-observer reliability was moderate (κ = 0.41) for high intensity zone. Intra-observer reliability was moderate to excellent (κ = 0.53 to 0.94) in all categories. Agreement on the total grade between reviewers occurred 71% of the time and a difference of one grade occurred in an additional 25% of cases. Lumbar disc degeneration can be graded reliably by this novel system. The advantage of this system is that it incorporates disc brightness/structure, Modic changes, high intensity zone, and a rigid definition of loss of disc height. This system might be useful in research studies evaluating disc degeneration. Further studies are required to demonstrate possible clinical utility in predicting outcomes after spinal treatments such as fusion. © 2014 Elsevier Ltd. All rights reserved.
⇑ Corresponding author. Tel.: +1 617 636 5858; fax: +1 617 636 7587. E-mail address: [email protected] (R.I. Riesenburger).
http://dx.doi.org/10.1016/j.jocn.2014.05.052 0967-5868/© 2014 Elsevier Ltd. All rights reserved.
1. Introduction
The Pfirrmann and modified Pfirrmann classification systems can be used to grade lumbar disc degeneration [1,2]. While both systems are reliable, neither is widely used in the literature. The lack of widespread application of both systems may be due to their classification of discs based on structure and signal intensity without accounting for other factors that may be relevant. While both systems attempt to incorporate disc height, neither system does so in a manner that is typically used in clinical practice. In addition, neither system accounts for the presence or absence of Modic changes or a high intensity zone (HIZ). An MRI based classification system that evaluates disc structure and brightness, presence or absence of Modic changes, presence or absence of a high intensity zone, and reduction in disc height may be useful for research and clinical purposes. Loss of disc structure and brightness are the hallmarks of radiographic lumbar disc degeneration [3,4]. Modic changes are also likely an indication of lumbar degeneration. Many studies have concluded that these endplate abnormalities are consistent with bone marrow edema and may cause pain [5–7]. In addition, the HIZ is another indicator of lumbar degeneration. Histological studies have confirmed the HIZ is a tear in the posterior annulus with subsequent infiltration by inflammatory cells [8]. Lastly, reduction in disc height is another hallmark of lumbar degeneration. A disc height of less than 5 mm has been correlated with a favorable response to lumbar fusion [3,9]. We are not aware of a grading system that incorporates these factors and is easy to apply. The lack of a standardized grading system that includes multiple radiographic findings may hinder progress in the study of lumbar disc degeneration. Multiple investigators have studied fusion for lumbar disc degeneration and describe or grade the degree of degeneration in several different ways [10–13]. This lack of standardization makes it difficult to compare data and results from these different studies. Therefore, we feel a standardized, MRI based method of grading lumbar disc degeneration that is easy to apply may be useful. In this paper,
we present and assess the reliability of a classification system that incorporates multiple radiographic indicators to grade the degree of degeneration at each level in the lumbar spine.
2. Methods
2.1. Lumbar disc degeneration classification system
A point system for lumbar disc degeneration was developed that included four different categories (Table 1, Fig. 1).
(1) Classification of disc structure and brightness: 0 points were assigned for presence of a distinct annulus fibrosus and nucleus pulposus, with nucleus pulposus T2-weighted signal intensity that was isointense to cerebrospinal fluid (CSF); 1 point for lack of a distinction of annulus fibrosus and nucleus pulposus with signal hypointense to CSF on T2-weighted images but not completely dark; and 2 points for lack of a distinction of annulus fibrosus and nucleus pulposus with a completely hypointense signal on T2-weighted images (i.e. a completely “black” or “dark” disc).
(2) Modic [5] type I or type II changes: 0 points were assigned if absent, and 1 point if present.
(3) HIZ, defined by Berg et al. [14] as an “area of high-signal intensity in the posterior annulus fibrosis that is brighter than the nucleus pulposus on T2-weighted images and is surrounded superiorly, inferiorly and anteriorly by the low-intensity (black) signal of the annulus fibrosis”: 0 points were assigned if absent, and 1 point if present.
(4) Disc height: 0 points were assigned if greater than or equal to 5 mm, and 1 point if less than 5 mm.
The points from each of the four categories were added to give each lumbar disc a score (possible range 0–5): grade zero (normal level) = 0 points, grade one = 1 point, grade two = 2 points, grade three = 3 points, grade four = 4 points, and grade five (severely degenerated level) = 5 points.

Table 1 Lumbar disc degeneration classification system.

| Radiographic indicator of disc degeneration | Description | Points |
| --- | --- | --- |
| Disc structure and brightness | Presence of a distinct annulus fibrosus and nucleus pulposus; nucleus pulposus T2-weighted signal isointense to CSF | 0 |
| Disc structure and brightness | Lack of a distinction of annulus fibrosus and nucleus pulposus; nucleus pulposus T2-weighted signal hypointense to CSF but not completely black | 1 |
| Disc structure and brightness | Lack of a distinction of annulus fibrosus and nucleus pulposus; nucleus pulposus T2-weighted signal completely hypointense (black or dark disc) | 2 |
| Modic changes | No Type I or Type II changes | 0 |
| Modic changes | Type I or Type II changes present | 1 |
| High intensity zone | Absent | 0 |
| High intensity zone | Present | 1 |
| Disc height | Greater than or equal to 5 mm | 0 |
| Disc height | Less than 5 mm | 1 |

CSF = cerebrospinal fluid.

Fig. 1. Multiple sagittal T2-weighted MRI demonstrating several discs and our grading system. Top row: disc brightness (DB) with discs graded 0, 1, and 2. Second row: Modic changes (MC) with discs graded 0 (absent) or 1 (present) (white arrows). Third row: High intensity zone (HIZ) with discs graded 0 (absent) or 1 (present) (white arrow). Bottom row: Disc height (DH) with discs graded 0 (greater than or equal to 5 mm) or 1 (less than 5 mm).

2.2. Rationale for this classification system
2.2.1. Disc height
The Pfirrmann and modified Pfirrmann classification systems are both reliable systems currently used to classify lumbar disc degeneration [1,2]. Both systems grade lumbar disc degeneration based on disc structure, signal intensity, and disc height. The advantage of the Pfirrmann system is its simplicity, with only five grades (Grade I to Grade V); its disadvantage is the subjective grading of disc height. The modified Pfirrmann system addresses this shortcoming by providing a more objective description of disc height based on percent collapse, but translates into a more cumbersome system with eight different grades. Furthermore, Griffith et al. [2] do not provide a clear rationale for using 30% and 60% reduction in disc height to demarcate different grades of disc degeneration. We think the measured height of the disc is more useful than describing the percent reduction in disc height. Two different groups have demonstrated that patients with low back pain presumed to be secondary to disc degeneration had a more favorable outcome following lumbar fusion when the disc height was less than 5 mm. Schuler et al. [9] retrospectively analyzed outcomes in 392 patients with discogenic pain undergoing a fusion and noted that patients with a pre-operative disc height of less than 5 mm had the best clinical improvement. Djurasovic et al. [3] retrospectively studied multiple radiographic factors following lumbar fusion and identified that patients undergoing lumbar fusion with a preoperative disc height less than 5 mm demonstrated the greatest improvements in clinical outcome measures. Given the consistent findings of these two studies, we assigned 1 point to any disc with a height less than 5 mm. Zero points were assigned for discs with a height equal to or greater than 5 mm.
2.2.2. Disc structure and signal intensity
To classify disc structure and signal intensity, we felt the five levels of the Pfirrmann [1] grading system could be further simplified. In our system, we assign 0 points for presence of a distinct annulus fibrosus and nucleus pulposus, with nucleus pulposus signal intensity that is isointense to CSF. This essentially groups together Pfirrmann Grade I and Grade II discs. We assigned 1 point for lack of a distinction of annulus fibrosus and nucleus pulposus with intermediate signal intensity. This roughly corresponds to Pfirrmann Grade III and Grade IV discs. We then gave 2 points for lack of a distinction of annulus fibrosus and nucleus pulposus with a completely hypointense signal. This is essentially a completely “dark” or “black” disc and corresponds to the disc intensity and structure of a Pfirrmann Grade V disc.
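The approximate correspondence described above between Pfirrmann grades and the structure and brightness points can be written as a small lookup table. The following Python sketch is illustrative only: the names `PFIRRMANN_TO_STRUCTURE_POINTS` and `structure_points_from_pfirrmann` are hypothetical, and the mapping encodes the rough grouping stated in the text (Grades I and II to 0 points, Grades III and IV to 1 point, Grade V to 2 points), not a validated conversion.

```python
# Rough grouping stated in Section 2.2.2 (an approximation, not a validated conversion).
# Keys are Pfirrmann grades I-V written as integers 1-5; values are the
# disc structure and brightness points (0-2) of the present system.
PFIRRMANN_TO_STRUCTURE_POINTS = {1: 0, 2: 0, 3: 1, 4: 1, 5: 2}


def structure_points_from_pfirrmann(grade: int) -> int:
    """Return the approximate structure/brightness points for a Pfirrmann grade."""
    if grade not in PFIRRMANN_TO_STRUCTURE_POINTS:
        raise ValueError(f"Pfirrmann grade must be an integer from 1 to 5, got {grade!r}")
    return PFIRRMANN_TO_STRUCTURE_POINTS[grade]


if __name__ == "__main__":
    for g in range(1, 6):
        print(f"Pfirrmann grade {g} -> {structure_points_from_pfirrmann(g)} point(s)")
```

Because the text describes the correspondence as approximate, such a lookup would at most give a starting point; the points in this system are defined by the MRI criteria in Table 1, not by a prior Pfirrmann grade.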
2.2.3. Modic changes
Modic changes were also included in this classification system, as several investigators believe they are an indicator of lumbar degeneration. Modic et al. [5] introduced a system and rationale for describing endplate changes in 1988. We assigned 0 points in the absence of Modic changes (or the rare presence of Modic type 3 changes that do not represent active degeneration) and 1 point for the presence of Modic type 1 or type 2 changes. We feel it is justified to include Modic changes in a classification system of lumbar disc degeneration. While the clinical significance of Modic changes is debated in the literature [15–18], we feel the histopathological changes demonstrated by Modic et al. provide evidence that these endplate changes represent degeneration. In addition, Modic et al. noted that all endplates with type 1 or type 2 Modic changes were adjacent to degenerated discs. We also feel it is justified to assign 1 point for either Modic type 1 or type 2 changes for several reasons. First, while some investigators have concluded that type 1 changes are more likely to be clinically relevant than type 2 changes, this has not been definitively demonstrated to our knowledge. In addition, Modic et al. initially concluded that type 1 and type 2 changes may “reflect a spectrum” of changes based on their longitudinal study of several patients with endplate changes. Furthermore, several investigators have noted that mixed type 1 and type 2 Modic changes are commonly found [16,19–21]. Fayad et al. [22] demonstrated that a senior radiologist reviewing discs with Modic changes reported mixed type 1-2 or mixed type 2-1 changes 36.8% of the time. This implies that in over one-third of cases, because of the presence of mixed changes, it is not possible to classify endplate changes as solely Modic type 1 or type 2. Adding mixed Modic changes to our system is, therefore, not justified due to the complexity it would add to the grading system. For this reason, we decided to simply assign 1 point for the presence of Modic type 1 or type 2 changes.
2.2.4. HIZ
Aprill and Bogduk [23] described the HIZ annular tear in 1992 and reported concordant pain in 38 of 40 patients with HIZ lesions undergoing discography. In addition, Chen et al. [24] reported concordant pain in 52 out of 60 patients with HIZ lesions undergoing discography. However, several authors question the significance of any conclusions made from discography testing [3,25–27]. The clinical significance of these HIZ lesions has also been challenged [7,28]. Peng et al. [8] histologically examined 11 HIZ specimens harvested during posterior lumbar interbody fusion operations for discogenic back pain. They noted that all specimens showed “that the normal lamellar structure was replaced by a disorganized, vascularized granulation tissue that consisted of small round cells, fibroblasts, and newly formed blood vessels.” They also observed “inflammatory cell infiltration seen extending along the margins of the tear into the middle and inner annulus.” We feel that these histological findings provide evidence that the HIZ represents one aspect of lumbar disc degeneration and justify including HIZ in our classification system.
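As an illustration of how the four categories described in Section 2.1 and Table 1 combine into a total grade, the following Python sketch sums the points for a single disc. It is a minimal sketch under stated assumptions: the class `DiscFindings`, the function `disc_grade`, and the way the findings are encoded as inputs are hypothetical and are not part of the study protocol. The scoring rules themselves (0 to 2 points for structure and brightness, 1 point for Modic type 1 or 2 changes, 1 point for an HIZ, 1 point for a disc height below 5 mm) are taken from the text.

```python
from dataclasses import dataclass

DISC_HEIGHT_THRESHOLD_MM = 5.0  # cutoff stated in Section 2.1


@dataclass(frozen=True)
class DiscFindings:
    """Hypothetical container for the MRI findings of one lumbar disc."""
    structure_brightness: int  # 0, 1, or 2 points, per Table 1
    modic_type: int            # 0 = no Modic changes, otherwise Modic type 1, 2, or 3
    hiz_present: bool          # high intensity zone present?
    disc_height_mm: float      # measured disc height in millimetres


def disc_grade(findings: DiscFindings) -> int:
    """Sum the points of the four categories to a total grade from 0 to 5."""
    if findings.structure_brightness not in (0, 1, 2):
        raise ValueError("structure_brightness must be 0, 1, or 2")
    if findings.modic_type not in (0, 1, 2, 3):
        raise ValueError("modic_type must be 0, 1, 2, or 3")
    if findings.disc_height_mm < 0:
        raise ValueError("disc_height_mm must be non-negative")

    # Modic type 1 or 2 scores 1 point; absent or type 3 scores 0 (Section 2.2.3).
    modic_points = 1 if findings.modic_type in (1, 2) else 0
    hiz_points = 1 if findings.hiz_present else 0
    # Height below 5 mm scores 1 point; 5 mm or more scores 0.
    height_points = 1 if findings.disc_height_mm < DISC_HEIGHT_THRESHOLD_MM else 0

    return findings.structure_brightness + modic_points + hiz_points + height_points


if __name__ == "__main__":
    normal = DiscFindings(structure_brightness=0, modic_type=0,
                          hiz_present=False, disc_height_mm=9.0)
    severe = DiscFindings(structure_brightness=2, modic_type=1,
                          hiz_present=True, disc_height_mm=4.2)
    print(disc_grade(normal))  # 0
    print(disc_grade(severe))  # 5
```

The example values (9.0 mm, 4.2 mm, and so on) are invented for illustration and do not come from the patient cohort.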
2.3. Patient selection and demographics
This study was approved by the Tufts Medical Center Institutional Review Board. Sixty consecutive lumbar MRI of patients seen in our outpatient spine clinic were chosen for review. We excluded patients with spondylolisthesis, a spine tumor, prior lumbar surgery, or a spine fracture. Two attending neurosurgeons (R.R. and S.H.) then graded all 300 discs in the 60 lumbar MRI. To complete this study, we used a protocol similar to that described by Pfirrmann et al. [1]. Three randomly ordered sets of 20 MRI were created. Each neurosurgeon was allowed to review only one set of 20 MRI per day to avoid rater fatigue. This is the approximate number of MRI that would be reviewed on a typical clinic day. This was repeated for the additional two sets of 20 MRI. One month later, the same protocol was repeated and all 300 discs were again graded by both neurosurgeons. Each neurosurgeon again was limited to reviewing 20 MRI per day.
2.4. Statistical analysis
Statistical analysis was completed using the JMP 10.0 statistical software package (SAS Institute, Cary, NC, USA). Reliability of the grading system was assessed by calculating Cohen’s κ values within raters (intra-observer reliability) and between raters (inter-observer reliability). Agreement was graded in the following manner according to Landis and Koch [29]: κ of 0 to 0.2 indicates slight agreement, 0.21 to 0.4 indicates fair agreement, 0.41 to 0.60 indicates moderate agreement, 0.61 to 0.80 indicates substantial agreement, 0.81 to 0.99 indicates excellent agreement, and 1 indicates perfect agreement. Percentage of observer agreement was calculated for each grade.
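For readers who wish to carry out this type of analysis outside JMP, the following Python sketch computes the percentage agreement and an unweighted Cohen's κ for two sets of ratings, and labels κ using the bands listed above. Several points are assumptions: the function names are hypothetical, the use of the unweighted form of κ is assumed (the text does not state whether a weighted variant was used), the label for negative κ is added here because the listed bands start at 0, and the example ratings are invented and are not study data.

```python
from collections import Counter
from typing import Hashable, Sequence


def percent_agreement(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Fraction of items on which the two rating sequences agree exactly."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("rating sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    p_o = percent_agreement(a, b)          # observed agreement
    n = len(a)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement: product of the two raters' marginal proportions per category.
    p_e = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_o - p_e) / (1.0 - p_e)


def agreement_label(kappa: float) -> str:
    """Label kappa with the bands listed in Section 2.4."""
    if kappa < 0:
        return "below the listed range"  # assumption: not covered by the listed bands
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    if kappa < 1.0:
        return "excellent"
    return "perfect"


if __name__ == "__main__":
    # Invented example ratings for eight discs (not study data).
    rater_1 = [0, 0, 1, 1, 2, 2, 0, 1]
    rater_2 = [0, 0, 1, 2, 2, 2, 0, 0]
    k = cohen_kappa(rater_1, rater_2)
    print(percent_agreement(rater_1, rater_2), round(k, 2), agreement_label(k))
```

In the invented example the raters agree on 6 of 8 discs (observed agreement 0.75), the chance agreement is 21/64, and κ is therefore 27/43, or about 0.63, which falls in the substantial band.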
3. Results
3.1. Patient demographics and disc grades
Three hundred lumbar discs from 60 consecutive patients were reviewed. The mean patient age was 52 ± 17.6 years (range 19 to 91). There were 28 males and 32 females. Reviewer 1 graded 156 (52%) discs as grade zero, 50 (17%) discs grade one, 57 (19%) discs grade two, 26 (9%) discs grade three, 10 (3%) discs grade four, and one (0.3%) disc grade five (Table 2). Reviewer 2 graded 143 (48%) discs as grade zero, 77 (26%) discs grade one, 57 (19%) discs grade two, 15 (5%) discs grade three, eight (3%) discs grade four, and zero (0%) discs grade five (Table 2).
3.2. Inter-observer reliability
Inter-observer reliability was calculated between the reviewers for each category within our grading system as well as for the total disc grade (Table 3). During the first assessment the two reviewers had substantial agreement on disc brightness (κ = 0.69), Modic changes (κ = 0.66), and disc height (κ = 0.77). Moderate agreement was observed for HIZ (κ = 0.41) and total grade (κ = 0.57) (Table 3).
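Table 2 Reviewer grading of discs

| Reviewer | Grade 0 | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 156 | 50 | 57 | 26 | 10 | 1 |
| 2 | 143 | 77 | 57 | 15 | 8 | 0 |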
Table 3 Intra-observer and inter-observer reliability using Cohen’s kappa values

| Reviewer | Disc brightness | Modic changes | High intensity zone | Disc height | Total grade |
| --- | --- | --- | --- | --- | --- |
| Reviewer 1 vs Reviewer 2: 1st assessment | 0.69 | 0.66 | 0.41 | 0.77 | 0.57 |
| Reviewer 1: 1st assessment vs 2nd assessment | 0.78 | 0.83 | 0.75 | 0.94 | 0.71 |
| Reviewer 2: 1st assessment vs 2nd assessment | 0.66 | 0.71 | 0.53 | 0.54 | 0.58 |

vs = versus.
3.3. Intra-observer reliability
Intra-observer reliability was calculated for each reviewer for each category within our grading system as well as for the total disc grade (Table 3). Reviewer 1 demonstrated substantial to excellent agreement between the two assessments with κ values of 0.78 for disc brightness, 0.83 for Modic changes, 0.75 for high intensity zone, 0.94 for disc height, and 0.71 for the total grade of the disc. Reviewer 2 demonstrated substantial agreement for disc brightness (κ = 0.66) and Modic changes (κ = 0.71). He demonstrated moderate agreement for HIZ (κ = 0.53), disc height (κ = 0.54), and the total grade (κ = 0.58).
3.4. Agreement percentage
Percentage of agreement was calculated for each reviewer as well as between reviewers (Table 4). During the first assessment Reviewer 1 and Reviewer 2 agreed on the total grade of the disc 71% of the time (213/300). Of the 87 discs on which they disagreed, an absolute difference of one grade occurred in 76 discs (25% of all discs), a difference of two grades occurred in 10 discs (3%), and a difference of three grades occurred in one disc (0.3%). Reviewer 1 agreed on the total grade of the disc between his two assessments 81% of the time (244/300). Of the 56 discs on which his assessments disagreed, an absolute difference of one grade occurred in 50 discs (16%) and a difference of two grades occurred in six discs (2%). Reviewer 2 agreed on the total grade of the disc between his two assessments 72% of the time (216/300). Of the 84 discs on which his assessments disagreed, an absolute difference of one grade occurred in 73 discs (24%) and a difference of two grades occurred in 11 discs (4%).

Table 4 Agreement percentage on total grade within and between reviewers

| Reviewer | Percentage agreement | Absolute difference in grade |
| --- | --- | --- |
| Reviewer 1 vs Reviewer 2: 1st assessment | 213/300 (71%) | Δ 1 grade: 76/300 (25%); Δ 2 grades: 10/300 (3%); Δ 3 grades: 1/300 (0.3%) |
| Reviewer 1: 1st assessment vs 2nd assessment | 244/300 (81%) | Δ 1 grade: 50/300 (16%); Δ 2 grades: 6/300 (2%) |
| Reviewer 2: 1st assessment vs 2nd assessment | 216/300 (72%) | Δ 1 grade: 73/300 (24%); Δ 2 grades: 11/300 (4%) |

vs = versus.

4. Discussion
4.1. Reliability of this classification system
For a classification system to be considered useful and reliable it should be comprehensive and encompass the major features of
the disease process. In addition, it should be easily reproducible by the same reviewer as well as by other clinicians. Finally, it should be quick and easy to apply. It is challenging to create a system that is comprehensive and not cumbersome. Adding too much detail to a system may decrease its utility if it becomes too difficult and time-consuming to apply. We tried to strike a balance and incorporate several radiographic parameters associated with disc degeneration and possibly pain into a system that can easily be completed. Reliability and reproducibility are major driving forces for the utility of any classification system. Inter- and intra-observer reliability demonstrated substantial to excellent agreement for the majority of the grading system parts and the overall total grade (Table 3). When a disagreement occurred over the total grade of the disc it was a difference of one grade the vast majority of the time (Table 4). Inter-observer reliability similarly provided good results with moderate to substantial agreement between the two reviewers on both their first and second assessments. Once again, when a disagreement occurred between reviewers the vast majority were a difference of one grade (Table 4). These results are comparable to previous grading systems of lumbar disc degeneration such as the Pfirrmann and modified Pfirrmann classification systems (Table 5) [1,2]. Furthermore, these agreement values are also comparable to a previous study grading gross lumbar disc degeneration specimens [30].
4.2. Possible utility of this classification system
We think this system can be useful for research studies evaluating lumbar disc degeneration. Many studies have identified radiographic parameters associated with back pain with the ultimate goal of predicting which patients would benefit from surgery. Currently spinal fusions for degenerative discs and back pain alone generally have a less reliable outcome than other spinal procedures such as discectomies [31–33].
Although many radiographic variables have been correlated with post-fusion improvement in back pain, the indications remain highly variable. The lack of a comprehensive grading system may be one of the limitations in our current evaluation and we hypothesize that our grading system may help bridge this limitation. This classification system could also potentially provide a standardized measure of overall disc degeneration to aid in evaluating outcomes after treatments for lumbar disc degeneration. Several studies have retrospectively evaluated whether Modic changes, loss of disc height, or HIZ correlates with outcomes after fusion for discogenic back pain [3,10–14,34]. These studies tend to evaluate each factor in isolation. We are not aware of a study that was able to combine all these factors and correlate outcomes with all the features of a degenerated disc. This classification system could potentially be used for that purpose. Most experts cite only four high quality prospective outcomes studies in the literature that investigate the utility of lumbar fusion for the treatment of low back pain [10–13]. Their radiographic definitions of lumbar disc degeneration are surprisingly lacking in detail and are not standardized. Fritzell et al. [10] included patients with “[d]egenerative changes at L4-L5 and/or L5-S1 (‘spondylosis’) on plain radiographs and/or computed tomography (CT), and or magnetic resonance imaging (MRI).” Brox et al. [11] and Brox et al. [12] included patients with “[d]egeneration at L4-L5 and/or L5-S1 (spondylosis) on plain radiographs.” Fairbank et al. [13] did not provide any radiographic definition of disc degeneration in their prospective, randomized, multicenter trial comparing lumbar fusion with an intensive rehabilitation program. They included “[p]atients who were candidates for surgical stabilization of the spine [who] were eligible if the clinician and patient were uncertain which of the study treatment strategies was best. Patients had to be aged between 18 and 55, with more than a 12 month history of chronic low back pain.” Unfortunately, limited conclusions can be drawn from these studies as they do not stratify the degree of lumbar disc degeneration with a classification system. Our classification system is very easy to apply and could potentially be used in these studies.
While we have demonstrated the reliability of our system, we have not validated it clinically. This is a major limitation of the current study. The goal of the current manuscript is to demonstrate that the system is reliable within and between reviewers. The current data that we have on this patient cohort do not allow us to relate the grading system to patient symptoms or outcomes. This, however, is an important future step in validating this grading system. Future prospective studies utilizing patient symptoms as well as outcome grading scales will have to be conducted to prove the usefulness of this grading system.

Table 5 Comparison of grading systems

Current manuscript (total score 0–5)

| Category | Description | Score |
| --- | --- | --- |
| Disc structure and brightness | Presence of a distinct annulus fibrosus and nucleus pulposus; nucleus pulposus T2-weighted signal isointense to CSF | 0 |
| Disc structure and brightness | Lack of a distinction of annulus fibrosus and nucleus pulposus; nucleus pulposus T2-weighted signal hypointense to CSF but not completely black | 1 |
| Disc structure and brightness | Lack of a distinction of annulus fibrosus and nucleus pulposus; nucleus pulposus T2-weighted signal completely hypointense (black or dark disc) | 2 |
| Modic changes | No Type I or Type II changes | 0 |
| Modic changes | Type I or Type II changes present | 1 |
| High intensity zone | Absent | 0 |
| High intensity zone | Present | 1 |
| Disc height | Greater than or equal to 5 mm | 0 |
| Disc height | Less than 5 mm | 1 |

Pfirrmann (grades 1–5)

| Grade | Criteria |
| --- | --- |
| 1 | 1.) Homogeneous, bright white structure 2.) Clear nucleus and anulus 3.) Hyperintense, isointense to CSF 4.) Normal disc height |
| 2 | 1.) Inhomogeneous with or without horizontal bands 2.) Clear nucleus and anulus 3.) Hyperintense, isointense to CSF 4.) Normal disc height |
| 3 | 1.) Inhomogeneous, gray 2.) Unclear nucleus and anulus 3.) Intermediate to CSF 4.) Normal to slightly decreased disc space |
| 4 | 1.) Inhomogeneous, gray to black 2.) Lost nucleus and anulus 3.) Intermediate to hypointense to CSF 4.) Normal to moderately decreased disc space |
| 5 | 1.) Inhomogeneous black 2.) Lost nucleus and anulus 3.) Hypointense to CSF 4.) Collapsed disc space |

Modified Pfirrmann (grades 1–8)

| Grade | Criteria |
| --- | --- |
| 1 | 1.) Uniformly hyperintense, equal to CSF 2.) Distinct anulus 3.) Normal disc height |
| 2 | 1.) Hyperintense 2.) Distinct anulus 3.) Normal disc height |
| 3 | 1.) Hyperintense but less than presacral fat 2.) Distinct anulus 3.) Normal disc height |
| 4 | 1.) Mildly hyperintense 2.) Indistinct anulus 3.) Normal disc height |
| 5 | 1.) Hypointense 2.) Indistinct anulus 3.) Normal disc height |
| 6 | 1.) Hypointense 2.) Indistinct anulus 3.) <30% reduction in disc height |
| 7 | 1.) Hypointense 2.) Indistinct anulus 3.) 30–60% reduction in disc height |
| 8 | 1.) Hypointense 2.) Indistinct anulus 3.) >60% reduction in disc height |

CSF = cerebrospinal fluid.
5. Conclusions
We present a novel lumbar disc degeneration classification system and demonstrate its reliability within and between reviewers. This system may demonstrate clinical and research benefit in the future.
Conflicts of Interest/Disclosures
The authors declare that they have no financial or other conflicts of interest in relation to this research and its publication.
References
[1] Pfirrmann CW, Metzdorf A, Zanetti M, et al. Magnetic resonance classification of lumbar intervertebral disc degeneration. Spine (Phila Pa 1976) 2001;26:1873–8.
[2] Griffith JF, Wang YX, Antonio GE, et al. Modified Pfirrmann grading system for lumbar intervertebral disc degeneration. Spine (Phila Pa 1976) 2007;32:E708–12.
[3] Djurasovic M, Carreon LY, Crawford 3rd CH, et al. The influence of preoperative MRI findings on lumbar fusion clinical outcomes. Eur Spine J 2012;21:1616–23.
[4] Kettler A, Wilke HJ. Review of existing grading systems for cervical or lumbar disc and facet joint degeneration. Eur Spine J 2006;15:705–18.
[5] Modic MT, Steinberg PM, Ross JS, et al. Degenerative disk disease: assessment of changes in vertebral body marrow with MR imaging. Radiology 1988;166:193–9.
[6] Emch TM, Modic MT. Imaging of lumbar degenerative disk disease: history and current state. Skeletal Radiol 2011;40:1175–89.
[7] Jensen TS, Karppinen J, Sorensen JS, et al. Vertebral endplate signal changes (Modic change): a systematic literature review of prevalence and association with non-specific low back pain. Eur Spine J 2008;17:1407–22.
[8] Peng B, Hou S, Wu W, et al. The pathogenesis and clinical significance of a high-intensity zone (HIZ) of lumbar intervertebral disc on MR imaging in the patient with discogenic low back pain. Eur Spine J 2006;15:583–7.
[9] Schuler TC, Burkus JK, Gornet MF, et al. The correlation between preoperative disc space height and clinical outcomes after anterior lumbar interbody fusion. J Spinal Disord Tech 2005;18:396–401.
[10] Fritzell P, Hagg O, Wessberg P, et al. 2001 Volvo award winner in clinical studies: lumbar fusion versus nonsurgical treatment for chronic low back pain: a multicenter randomized controlled trial from the Swedish lumbar spine study group. Spine (Phila Pa 1976) 2001;26:2521–32.
[11] Brox JI, Sorensen R, Friis A, et al. Randomized clinical trial of lumbar instrumented fusion and cognitive intervention and exercises in patients with chronic low back pain and disc degeneration. Spine (Phila Pa 1976) 2003;28:1913–21.
[12] Brox JI, Reikeras O, Nygaard O, et al. Lumbar instrumented fusion compared with cognitive intervention and exercises in patients with chronic back pain after previous surgery for disc herniation: a prospective randomized controlled study. Pain 2006;122:145–55.
[13] Fairbank J, Frost H, Wilson-MacDonald J, et al. Randomised controlled trial to compare surgical stabilisation of the lumbar spine with an intensive rehabilitation programme for patients with chronic low back pain: the MRC spine stabilisation trial. BMJ 2005;330:1233.
[14] Berg L, Neckelmann G, Gjertsen O, et al. Reliability of MRI findings in candidates for lumbar disc prosthesis. Neuroradiology 2012;54:699–707.
[15] Braithwaite I, White J, Saifuddin A, et al. Vertebral end-plate (Modic) changes on lumbar spine MRI: correlation with pain reproduction at lumbar discography. Eur Spine J 1998;7:363–8.
[16] Weishaupt D, Zanetti M, Hodler J, et al. Painful lumbar disk derangement: relevance of endplate abnormalities at MR imaging. Radiology 2001;218:420–7.
[17] Kokkonen SM, Kurunlahti M, Tervonen O, et al. Endplate degeneration observed on magnetic resonance imaging of the lumbar spine: correlation with pain provocation and disc changes observed on computed tomography diskography. Spine (Phila Pa 1976) 2002;27:2274–8.
[18] Sandhu HS, Sanchez-Caso LP, Parvataneni HK, et al. Association between findings of provocative discography and vertebral endplate signal changes as seen on MRI. J Spinal Disord 2000;13:438–43.
[19] Fayad F, Lefevre-Colau MM, Rannou F, et al. Relation of inflammatory Modic changes to intradiscal steroid injection outcome in chronic low back pain. Eur Spine J 2007;16:925–31.
[20] Mitra D, Cassar-Pullicino VN, McCall IW. Longitudinal study of vertebral type 1 end-plate changes on MR of the lumbar spine. Eur Radiol 2004;14:1574–81.
[21] Kuisma M, Karppinen J, Niinimaki J, et al. A three-year follow-up of lumbar spine endplate (Modic) changes. Spine (Phila Pa 1976) 2006;31:1714–8.
[22] Fayad F, Lefevre-Colau MM, Drape JL, et al. Reliability of a modified Modic classification of bone marrow changes in lumbar spine MRI. Joint Bone Spine 2009;76:286–9.
[23] Aprill C, Bogduk N. High-intensity zone: a diagnostic sign of painful lumbar disc on magnetic resonance imaging. Br J Radiol 1992;65:361–9.
[24] Chen JY, Ding Y, Lv RY, et al. Correlation between MR imaging and discography with provocative concordant pain in patients with low back pain. Clin J Pain 2011;27:125–30.
[25] Willems PC, Staal JB, Walenkamp GH, et al. Spinal fusion for chronic low back pain: systematic review on the accuracy of tests for patient selection. Spine J 2013;13:99–109.
[26] Wichman HJ. Discography: over 50 years of controversy. WMJ 2007;106:27–9.
[27] Carragee EJ, Alamin TF. Discography. A review. Spine J 2001;1:364–72.
[28] Buirski G, Silberstein M. The symptomatic lumbar disc in patients with low-back pain. Magnetic resonance imaging appearances in both a symptomatic and control population. Spine (Phila Pa 1976) 1993;18:1808–11.
[29] Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics 1977;33:159–74.
[30] Thompson JP, Pearce RH, Schechter MT, et al. Preliminary evaluation of a scheme for grading the gross morphology of the human intervertebral disc. Spine (Phila Pa 1976) 1990;15:411–5.
[31] Chou R, Baisden J, Carragee EJ, et al. Surgery for low back pain: a review of the evidence for an American Pain Society Clinical Practice Guideline. Spine (Phila Pa 1976) 2009;34:1094–109.
[32] Lavelle W, Carl A, Lavelle ED. Invasive and minimally invasive surgical techniques for back pain conditions. Med Clin North Am 2007;91:287–98.
[33] van Tulder MW, Koes B, Seitsalo S, et al. Outcome of invasive treatment modalities on back pain and sciatica: an evidence-based review. Eur Spine J 2006;15:S82–92.
[34] Ohtori S, Kinoshita T, Yamashita M, et al. Results of surgery for discogenic low back pain: a randomized study using discography versus disco block for diagnosis. Spine (Phila Pa 1976) 2009;34:1345–8.