Pain Sensitivity and Autonomic Factors Associated With Development of TMD: The OPPERA Prospective Cohort Study

Pain Sensitivity and Autonomic Factors Associated With Development of TMD: The OPPERA Prospective Cohort Study

The Journal of Pain, Vol 14, No 12 (December), Suppl. 2, 2013: pp T63-T74 Available online at www.jpain.org and www.sciencedirect.com Pain Sensitivit...

868KB Sizes 0 Downloads 19 Views

The Journal of Pain, Vol 14, No 12 (December), Suppl. 2, 2013: pp T63-T74 Available online at www.jpain.org and www.sciencedirect.com

Pain Sensitivity and Autonomic Factors Associated With Development of TMD: The OPPERA Prospective Cohort Study Joel D. Greenspan,* Gary D. Slade,y,z,x Eric Bair,x,jj,{ Ronald Dubner,* Roger B. Fillingim,** Richard Ohrbach,yy Charles Knott,zz Luda Diatchenko,x,{,xx,jjjj Qian Liu,jj and William Maixnerx,{,{{ *Departments of Oral and Maxillofacial Surgery and Neural and Pain Sciences, and Brotman Facial Pain Center, University of Maryland School of Dentistry, Baltimore, Maryland. Departments of yDental Ecology, zEpidemiology, jjBiostatistics, {Endodontics, and {{Pharmacology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina. x Regional Center for Neurosensory Disorders, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina. **Department of Community Dentistry and Behavioral Science, University of Florida, College of Dentistry, and Pain Research and Intervention Center of Excellence, Gainesville, Florida. yy Department of Oral Diagnostic Sciences, University at Buffalo, Buffalo, New York. zz Battelle Memorial Institute, Durham, North Carolina. xx Carolina Center for Genome Sciences, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina. jjjj Department of Anesthesia, and Alan Edwards Centre for Research on Pain, McGill University, Montreal, Quebec, Canada.

Abstract: Multiple studies report that individuals with chronic temporomandibular disorder (TMD) have enhanced sensitivity to experimental pain. Additionally, chronic TMD cases show altered autonomic function, including elevated heart rate and reduced heart rate variability. However, causal inferences regarding the association between TMD and pain sensitivity and autonomic function cannot be drawn from these cross-sectional observations. The prospective Orofacial Pain: Prospective Evaluation and Risk Assessment (OPPERA) study examines whether measures of pain sensitivity or cardiac autonomic function provide predictive value in TMD incidence. A cohort of 2,737 initially TMD-free participants was followed for up to 5.2 years, during which time 260 developed first-onset TMD. Fourteen of 39 experimental pain measures produced significant hazard ratios, such that greater pain sensitivity was associated with greater TMD incidence. A single autonomic measure—heart rate at rest—was also associated significantly with greater TMD incidence. In contrast, using the same measures of pain sensitivity and cardiac autonomic function, we previously reported a larger group of variables that was significantly associated with chronic TMD in the OPPERA case-control study. Future studies should investigate whether premorbid pain sensitivity or autonomic function more specifically predicts risk of developing chronic TMD than first-onset TMD. Perspective: Our previous case-control studies showed that associations with both pain sensitivity and cardiac autonomic function are profound in chronic TMD cases. Here we show that some measures of enhanced pain sensitivity contribute modestly to the risk of developing TMD whereas autonomic dysregulation appears to confer little or no risk for TMD incidence. ª 2013 by the American Pain Society Key words: Quantitative sensory testing, temporomandibular disorder, orofacial pain, heat pain, pressure pain, cardiovascular measures.

Publication of this supplement was made possible with support of the National Institutes of Healthgrant U01DE017018. The OPPERA program also acknowledges resources specifically provided for this project by the participating institutions: Battelle Memorial Institute; University at Buffalo; University of Florida; University of Maryland; University of North Carolina at Chapel Hill. R.B.F. and G.D.S. are consultants and equity stock holders, and W.M. and L.D. are cofounders and equity stock holders in Algynomics, Inc, a company providing research services in personalized pain medication and diagnostics. Other authors declare no conflicts of interest.

Supplementary data accompanying this article are available online at www.jpain.org and www.sciencedirect.com. Address reprint requests to Joel D. Greenspan, University of Maryland, School of Dentistry, 650 W. Baltimore Street, 8th Floor South, Baltimore, MD 21201. E-mail: [email protected] 1526-5900/$36.00 ª 2013 by the American Pain Society http://dx.doi.org/10.1016/j.jpain.2013.06.007

T63

T64

The Journal of Pain

C

onsiderable evidence has accumulated showing that physiological factors, including altered pain sensitivity and autonomic dysfunction, are associated with several chronic pain conditions, including temporomandibular disorder (TMD).15 Specifically, in case-control studies, individuals with chronic TMD have been found to have greater pain sensitivity to experimental pain than TMD-free controls, even when experimental pain was evaluated at asymptomatic body sites.7,16,17,23,26 This enhanced pain sensitivity has been documented for different pain domains (ie, both cutaneous and deep tissue evoked pain) and for different types of measures (ie, threshold and suprathreshold measures of pain). Additionally, individuals with chronic TMD have been found to have dysregulated autonomic system function, including higher cardiac output and lower total vascular resistance,11 as well as higher cardiosympathetic tone both at rest and under stress.18 Although such case-control studies are informative about chronic pain associations, they cannot address the critical causal questions. Specifically, does the heightened pain sensitivity and autonomic dysregulation precede the development of chronic TMD, or do these characteristics develop as a consequence of having TMD for some period of time? Evidence to address these questions comes from prospective cohort studies that first measure pain sensitivity and autonomic function in people without TMD, and then monitor them prospectively to document those who develop TMD. To our knowledge, only 1 such study has been reported. The study found that heightened pain sensitivity was a predictive factor for TMD among a cohort of 171 women followed for 3 years.25 The current report describes results from the first large-scale prospective cohort study of TMD, evaluating the contribution of enhanced pain sensitivity and autonomic dysregulation to the incidence of TMD. The aim of the study described here was to identify which, if any, measures of experimental pain sensitivity and autonomic function would serve to predict the development of painful TMD, when assessed prior to TMD onset. The studies reported here are a part of the larger parent project Orofacial Pain: Prospective Evaluation and Risk Assessment (OPPERA), which also included a previously reported case-control study5 in which heightened pain sensitivity and autonomic dysregulation were found to be associated with chronic TMD.

Methods This section summarizes the study methods that are explained in detail in the Supplementary Materials section and elsewhere in this issue.2 Study participants were verbally informed of the study and provided signed consents. The OPPERA project was reviewed and approved by institutional review boards at each of 4 study sites and at the data coordinating center, Battelle Memorial Institute.

Study Design, Setting, and Participants This paper reports findings from the OPPERA prospective cohort study of 2,737 participants who were enrolled

QST and Autonomic Risk Factors for First-Onset TMD in 2006 to 2008 and followed for up to 5.2 years, during which time 260 people developed painful TMD. When enrolled, the sample of community-based volunteers at 4 U.S. study sites was aged 18 to 44 years and did not have TMD when examined using OPPERA’s adaptation of a restricted set of research diagnostic criteria for TMD (RDC/TMD).19 At enrollment, study participants also completed questionnaires, autonomic function was measured, sensitivity to sensory stimuli was evaluated, and a blood sample was collected for genotyping. This paper focuses on baseline measurements of pain sensitivity and autonomic function, which have been described previously.7,18 Detailed specifications of measurement protocols can be found in the Supplementary Materials accompanying this article. Briefly, pain sensitivity tests encompassed 3 stimulus modalities: 1) pressure pain thresholds (PPTs) on 5 body sites—the center of the temporalis muscle, the center of the masseter muscle, overlying the temporomandibular joint, the center of the trapezius muscle, and overlying the lateral epicondyle; 2) cutaneous mechanical pain sensitivity, involving measures of threshold, ratings of suprathreshold stimuli, temporal summation, and aftersensations, assessed on the hand; and 3) heat pain sensitivity, involving threshold, tolerance, ratings of suprathreshold stimuli, temporal summation, and aftersensations, assessed on the forearm. Autonomic measures included arterial blood pressure, heart rate, and heart rate variability measures taken under rest and during 2 provocative conditions: an orthostatic challenge and a Stroop protocol. At 3-month intervals after enrollment, study participants were asked to complete a screening questionnaire that asked about TMD pain symptoms. Those reporting symptoms were invited to study clinics for a follow-up examination that determined presence or absence of TMD using the same adaptation of RDC/TMD criteria. Specifically, 260 incident cases satisfied 2 criteria for TMD: 1) symptoms of orofacial pain reported for $5 days/month and 2) examiner findings of TMD myalgia, arthralgia, or both.

Statistical Analysis For descriptive purposes, the average annual incidence of first-onset TMD was calculated as the number of people with first-onset TMD divided by the sum of follow-up intervals. To test hypotheses about associations between baseline characteristics and TMD incidence, univariate hazard ratios were first computed using Cox proportional hazard regression. For each baseline risk factor, scores were transformed to unit-normal deviates (mean = 0, standard deviation [SD] = 1). Unit-normal deviates were reversed (ie, the sign was changed) for measures of pain threshold, so that higher values represented greater sensitivity to the stimulus. Hazard ratios were computed both with adjustment for study site and with additional adjustment for demographics (age, gender, race/ethnicity, and lifetime U.S. residence). Hazard ratios were also computed using multiple imputation to account for 2 sources of potential bias associated with 1) nonexamination of 243 people with

Greenspan et al symptoms and 2) a higher-than-expected rate of TMD classification for 1 examiner who conducted 75 examinations. Collectively, these are described below as results that are ‘‘imputed for examinations that were not done as intended.’’ A principal limitation of this univariate approach is that since some of the individual variables are correlated, those associations will not be independent. Furthermore, hazard ratios were considered nominally significant if the 95% confidence interval (CI) did not include the value of 1.0. This would equate to a criterion of P < .05. Other group differences were also noted as significant at P < .05. Although this criterion does not account for multiple testing, the statistical results are presented in a full manner in the tables, to allow the reader to consider the significance under other criteria. Note that if we were to apply a conventional Bonferroni adjustment for the 39 variables reported in Tables 1 and 2, the critical P value would be P < .00128. The second stage of the analysis involved a series of multivariable Cox regression models in which the entered variables were derived from a principal components analysis (PCA). The PCAs, performed separately for quantitative sensory testing (QST) and autonomic data, as reported previously, were performed in order to reduce the number of variables by virtue of being highly correlated. The goal of Cox multivariable modeling was to assess independent contributions of each principal component to TMD incidence. In this context, an ‘‘independent’’ contribution is one that is not confounded by the other components. Hazard ratios for the associations were estimated in a single Cox regression model that included all components as predictor variables, together with study site and sociodemographic characteristics. First, all factors were entered into a multivariable Cox regression model simultaneously in order to identify the factors that independently predicted development of TMD. Second, interactions of each factor score with demographic variables in predicting TMD development were examined in Cox regression models. Third, all possible 2-way interactions between factor scores were examined in regression models. This more traditional multivariable approach allowed us to examine whether these components, either alone or via interactions with other factors, predicted TMD onset. The final stage of the analysis involved an alternative multivariable approach, random forest modeling, to analyze potential contributions of all variables, not merely the reduced set of principal components. This cutting-edge analytic approach was used to achieve 2 goals: 1) to identify the most important risk factors for first-onset TMD at the individual variable level and 2) to generate plots depicting adjusted association between each variable and TMD incidence, with adjustment for the effects of other variables and with latitude in generating the plots that permitted departure from a straight-line association. Random forest modeling represents a machine learning technique based on a series of decision tree models.10 Decision trees predict outcomes by recursively partitioning predictor

The Journal of Pain

T65

variables, and these trees are superior to regressionbased models in identifying nonlinear effects and handling missing data. Random forests improve the accuracy of individual decision trees by averaging over a series of decision trees.8 In recent years, random forests have been increasingly applied to classification problems in biomedical research, including predicting several pain-related outcomes.9,22,28 The random forest model was used to compute the expected rate of first-onset TMD that would be observed at several values of the variable after averaging over the values of all other variables in the model. Partial dependence plots were then generated and LOESS smoothing was used to help visualize the association.12 These 2 multivariable strategies were selected in favor of other approaches for several reasons. The first strategy builds on findings from our baseline case-control studies that identified meaningful factors from among the multitude of QST and autonomic measures used in OPPERA.7 The relatively small number of factors meant that all of them could be used in a single Cox regression model, thereby adjusting for potential confounding effects between constructs. This avoided arbitrary choices and potential bias that occur commonly when stepwise procedures and related variable selection methods are used to select a restricted set of potential confounders.6 However, regression using principal component scores can mask effects on TMD incidence of single items that do not fit will within the components. Furthermore, variable selection methods that are needed to exclude variables from a large candidate set of predictors do not provide information about the excluded variables. Also, variable selection regression methods produce P values and CIs that are less reliable.1,6 Thus, a random forest model was used to address these objectives.3,10

Results The cohort of 2,737 initially-TMD-free people was followed for a total of 7,404 person-years (median = 2.8 years per person), during which time 260 people developed first-onset TMD, yielding an annual incidence rate of 3.5%. For those 260 incident cases, there was a range of 0 to 85 days (median, 14 days) in the interval between the quarterly screening questionnaire in which symptoms were reported and the examination where TMD case-classification was determined.

QST Results Based on a univariate analysis, a subset of QST measures showed nominally significant (P < .05) associations with TMD incidence. PPTs at cranial sites showed siteadjusted hazard ratios that ranged from 1.15 (95% CI = 1.00–1.30) to 1.20 (95% CI = 1.05–1.37), signifying relative increases of 15 to 20% in TMD incidence associated with an increase of 1 SD in this measure of pain sensitivity (Table 1). Hazard ratios remained virtually identical after adjustment for demographic characteristics and after imputation for subjects who were not examined as intended. Two measures of mechanical

STIMULUS

IMPUTED ESTIMATES WITH ADJUSTMENT FOR STUDY SITE 1 DEMOGRAPHICSy

N

MEAN (SD) [1ST; 2ND TERCILE]

LOWER

MID

UPPER

STUDY SITEz

STUDY SITE 1 DEMOGRAPHICSx

P VALUEjj

N

HAZARD RATIO (95% CI)

P VALUEk

kPa, 50–600 kPa, 50–600 kPa, 50–600 kPa, 100–600 kPa, 100–600

2,692 2,691 2,657 2,637 2,643

218.7 (85.9) [173.0; 242.2] 201.4 (78.4) [157.7; 222.7] 184.4 (67.4) [148.5; 204.0] 375.6 (146.3) [281.7; 451.5] 391.6 (147.0) [302.7; 481.5]

4.4 4.2 4.0 4.1 3.8

4.0 4.1 3.9 4.3 3.9

3.2 3.1 3.1 3.1 3.5

1.20 (1.05, 1.37) 1.18 (1.03, 1.34) 1.15 (1.00, 1.30) 1.13 (.99, 1.28) 1.04 (.90, 1.19)

1.20 (1.04, 1.37) 1.18 (1.02, 1.35) 1.14 (.99, 1.31) 1.12 (.97, 1.29) 1.02 (.88, 1.17)

.012 .021 .057 .100 .800

2,714 2,714 2,714 2,714 2,714

1.17 (1.01, 1.34) 1.16 (1.01, 1.33) 1.16 (1.00, 1.32) 1.11 (.96, 1.27) 1.05 (.90, 1.21)

.027 .034 .037 .160 .500

mN, 8–512

2,621 260.0 (167.3) [157.6; 337.8]

3.7

3.9

3.4

1.02 (.89, 1.16)

1.06 (.92, 1.20)

.400

2,621 1.12 (.96, 1.28)

.130

Ratings, 0–100 Ratings, 0–100

2,650 2,488

9.0 (13.0) [1.7; 8.2] 14.9 (17.8) [4.0; 15.0]

4.2 3.6

3.2 3.3

3.8 3.9

.99 (.86, 1.12) 1.07 (.94, 1.21)

1.05 (.92, 1.19) 1.11 (.98, 1.25)

.460 .097

2,657 1.07 (.93, 1.21) 2,657 1.13 (.99, 1.27)

.310 .059

Ratings, 0–100 Ratings, 0–100

2,648 2,487

18.5 (20.8) [5.0; 20.0] 31.1 (27.1) [12.0; 38.7]

4.1 3.6

3.5 3.5

3.7 3.9

.95 (.83, 1.08) 1.04 (.91, 1.17)

1.01 (.88, 1.15) 1.08 (.95, 1.23)

.840 .230

2,657 1.05 (.91, 1.19) 2,657 1.13 (1.00, 1.28)

.470 .048

Ratings, 0–100 Ratings, 0–100 Ratings, 0–100 Ratings, 0–100

2,649 2,650 2,482 2,475

– – – –

– – – –

– – – –

1.04 (.91, 1.18) 1.07 (.94, 1.21) 1.09 (.95, 1.25) 1.15 (1.00, 1.31)

1.08 (.95, 1.23) 1.10 (.97, 1.24) 1.12 (.97, 1.29) 1.16 (1.01, 1.32)

.220 .140 .110 .035

2,658 2,658 2,658 2,658

.100 .170 .061 .032

4.2 3.9

3.4 3.0

3.7 4.0

.93 (.81, 1.06) .98 (.86, 1.11)

.97 (.84, 1.11) 1.01 (.88, 1.15)

.670 .840

2,657 1.01 (.88, 1.15) 2,657 1.08 (.95, 1.21)

D Ratings, 6100 2,645 D Ratings, 6100 2,485

.4 (.4) [.0; 1.0] .3 (.4) [.0; .0] .6 (.4) [.0; 1.0] .5 (.5) [.0; 1.0] 9.5 (12.2) [2.2; 9.7] 16.2 (15.8) [6.2; 18.5]

1.12 (.97, 1.27) 1.10 (.96, 1.24) 1.14 (.99, 1.30) 1.16 (1.01, 1.32)

.880 .230

*Adjusted rates computed using Poisson regression controlling for study site (categorical variable, 4 levels). Estimated rate is for reference study site (University at Buffalo). Empty cells in these columns reflect variables for which a tercile is equal to a floor (0) value, thus precluding calculation of incidence rate by tercile. yAs for footnote x, with inclusion of imputed TMD rates for subjects lost to follow-up or with questionable examiner classification. zHazard ratios represent relative increase in rate of TMD associated with an increase of 1 SD in risk factor. Computed using Cox proportional hazard regression model controlling for study site (categorical variable, 4 levels). xAs for footnote z, with additional adjustment for age (in years as a continuous variable), gender (categorical, 2 levels), race/ethnicity (categorical, 5 levels), and lifetime U.S. residence (categorical, 2 levels). kP value evaluating null hypothesis that the hazard ratio equals 1. {Negative z-scores were used for standardized hazard ratios, which therefore represent increase in incidence associated with reduction of 1 SD in threshold.

QST and Autonomic Risk Factors for First-Onset TMD

Pressure pain threshold{ Temporalis Masseter TM joint Trapezius Lateral epicondyle Mechanical cutaneous pain Threshold{ Single stimulus ratings 256-mN probe 512-mN probe Overall ratings of 10 stimuli 256-mN probe 512-mN probe Aftersensation ratings 15 s, 256-mN probe 30 s, 256-mN probe 15 s, 512-mN probe 30 s, 512-mN probe Temporal summation 256-mN probe 512-mN probe

UNITS AND POTENTIAL RANGE

STANDARDIZED HAZARD RATIOS (95% CI) ADJUSTED FOR

The Journal of Pain

DISTRIBUTION OF RISK FACTOR AT ENROLLMENT

SITE-ADJUSTED* TMD INCIDENCE RATE (% OF PEOPLE PER ANNUM) ACCORDING TO TERCILE OF RISK FACTOR

T66

Univariate Associations of Mechanical Pain Measures and Incidence Rates of First-Onset TMD: OPPERA Prospective Cohort Study, 2006–2011

Table 1.

Univariate Associations of Thermal Pain Measures and Incidence Rates of First-Onset TMD: OPPERA Prospective Cohort Study, 2006–2011 DISTRIBUTION OF RISK FACTOR AT ENROLLMENT

STIMULUS

UNITS AND POTENTIAL RANGE

N

MEAN (SD) [1ST; 2ND TERCILE]

STANDARDIZED HAZARD RATIOS (95% CI) ADJUSTED FOR

IMPUTED ESTIMATES WITH ADJUSTMENT FOR STUDY SITE 1 DEMOGRAPHICSy

LOWER

MID

UPPER

STUDY SITEz

STUDY SITE 1 DEMOGRAPHICSx

P VALUEk

3.9 4.2

4.4 4.3

3.7 3.7

1.00 (.86, 1.15) 1.05 (.92, 1.18)

1.00 (.86, 1.16) 1.03 (.91, 1.16)

.97 .620

2,103 1.03 (.88, 1.20) 2,567 1.06 (.92, 1.21)

.720 .410

3.1 3.9 4.2

3.9 4.0 3.8

5.0 4.5 4.2

1.12 (.99, 1.27) 1.05 (.92, 1.19) 1.05 (.92, 1.19)

1.11 (.98, 1.25) 1.04 (.91, 1.17) 1.04 (.91, 1.18)

.094 .590 .550

2,539 1.11 (.97, 1.27) 2,539 1.14 (.99, 1.30) 2,539 1.07 (.93, 1.23)

.120 .062 .320

3.5 2.6 2.2

4.1 3.1 2.9

3.5 3.6 3.1

1.00 (.86, 1.16) 1.10 (.92, 1.29) 1.16 (.94, 1.40)

1.05 (.90, 1.21) 1.14 (.96, 1.34) 1.20 (.98, 1.46)

.530 .130 .076

2,539 1.16 (1.01, 1.32) 2,539 1.22 (1.06, 1.41) 2,539 1.23 (1.05, 1.41)

.033 .006 .006

3.8 3.3 –

3.9 4.1 –

4.6 5.2 –

1.11 (.97, 1.26) 1.12 (.97, 1.28) 1.22 (1.06, 1.39) 1.23 (1.06, 1.41) 1.21 (1.05, 1.39) 1.21 (1.04, 1.40)

.099 .004 .009

2,539 1.16 (1.01, 1.32) 2,539 1.23 (1.06, 1.42) 2,539 1.24 (1.06, 1.44)

.032 .005 .005

– – – – – –

– – – – – –

– – – – – –

1.08 (.94, 1.22) 1.14 (1.00, 1.28) 1.17 (1.01, 1.33) 1.12 (.97, 1.27) 1.13 (.98, 1.29) 1.16 (1.01, 1.33)

1.09 (.95, 1.24) 1.15 (1.00, 1.30) 1.20 (1.04, 1.37) 1.12 (.98, 1.28) 1.14 (.99, 1.31) 1.17 (1.02, 1.34)

.190 .036 .011 .088 .067 .023

2,504 2,504 2,504 2,504 2,504 2,504

1.07 (.93, 1.22) 1.12 (.98, 1.27) 1.17 (1.01, 1.34) 1.11 (.96, 1.26) 1.13 (.98, 1.30) 1.14 (.99, 1.30)

.320 .095 .026 .140 .081 .063

3.7 3.1 2.8

4.1 3.8 2.9

4.1 4.1 4.2

.99 (.86, 1.12) 1.00 (.87, 1.14) 1.12 (.98, 1.28) 1.11 (.96, 1.26) 1.24 (1.07, 1.43) 1.25 (1.07, 1.44)

.980 .130 .003

2,539 1.07 (.94, 1.20) 2,539 1.07 (.94, 1.20) 2,539 1.14 (1.00, 1.28)

.310 .300 .039

4.4 3.2 3.1

3.4 3.4 3.4

4.2 4.6 3.6

.99 (.86, 1.13) .99 (.86, 1.13) 1.12 (.99, 1.26) 1.09 (.97, 1.22) 1.15 (1.01, 1.31) 1.14 (1.00, 1.29)

.860 .140 .050

2,539 1.04 (.91, 1.17) 2,539 1.08 (.96, 1.20) 2,539 1.12 (1.00, 1.26)

.570 .180 .044

N

HAZARD RATIO (95% CI)

P VALUEk

T67

*Adjusted rates computed using Poisson regression controlling for study site (categorical variable, 4 levels). Estimated rate is for reference study site (University at Buffalo). Empty cells in these columns reflect variables for which a tercile is equal to a floor (0) value, thus precluding calculation of incidence rate by tercile. yAs for footnote x, with inclusion of imputed TMD rates for subjects lost to follow-up or with questionable examiner classification. zHazard ratios represent relative increase in rate of TMD associated with an increase of 1 SD in risk factor. Computed using Cox proportional hazard regression model controlling for study site (categorical variable, 4 levels). xAs for footnote z, with additional adjustment for age (in years as a continuous variable), gender (categorical, 2 levels), race/ethnicity (categorical, 5 levels), and lifetime U.S. residence (categorical, 2 levels). kP value evaluating null hypothesis that the hazard ratio equals 1. {Negative z-scores were used for standardized hazard ratios, which therefore represent increase in incidence associated with reduction of 1 SD in threshold.

The Journal of Pain

Heat pain threshold{ C, 32–52 2,103 42.2 (3.1) [41.0; 44.0] Heat pain tolerance{ C, 32–52 2,455 46.2 (2.4) [45.4; 47.2] Single stimulus ratings Rating, 0–100 2,429 36.7 (32.0) [10.0; 50.0] 46 C Rating, 0–100 2,400 45.1 (32.9) [20.0; 60.0] 48 C Rating, 0–100 2,322 57.4 (33.9) [35.0; 80.0] 50 C Ratings of 10 stimuli: area under curve Ratings, 0–1,000 2,030 347.8 (236.4) [185.0; 467.5] 46 C Ratings, 0–1,000 1,701 412.2 (242.2) [255.0; 557.5] 48 C Ratings, 0–1,000 1,258 466.9 (248.7) [330.0; 630.0] 50 C Maximum rating from among 10 stimuli Ratings, 0–1,000 2,534 57.1 (33.7) [35.0; 80.0] 46 C Ratings, 0–1,000 2,509 68.2 (32.2) [53.0; 95.0] 48 C Ratings, 0–1,000 2,420 77.6 (29.5) [75.0; 100.0] 50 C Thermal aftersensations Rating, 0–100 2,498 .5 (.4) [.0; 1.0] 15 s, 46 C Rating, 0–100 2,500 .4 (.4) [.0; 1.0] 30 s, 46 C Rating, 0–100 2,501 .6 (.4) [.0; 1.0] 15 s, 48 C Rating, 0–100 2,502 .5 (.4) [.0; 1.0] 30 s, 48 C Rating, 0–100 2,405 .7 (.4) [.0; 1.0] 15 s, 50 C Rating, 0–100 2,401 .5 (.4) [.0; 1.0] 30 s, 50 C Temporal summation: highest minus first rating D Ratings, 0–100 2,302 20.2 (26.2) [10.0; 30.0] 46 C D Ratings, 0–100 2,207 25.2 (25.8) [10.0; 30.0] 48 C D Ratings, 0–100 1,922 25.5 (25.4) [10.0; 30.0] 50 C Temporal summation: slope of line for first 3 ratings b coefficient 2,304 6.7 (10.2) [2.5; 7.5] 46 C b coefficient 2,211 8.1 (10.9) [2.5; 10.0] 48 C b coefficient 1,922 9.0 (12.1) [2.5; 10.0] 50 C

SITE-ADJUSTED* TMD INCIDENCE RATE (% OF PEOPLE PER ANNUM) ACCORDING TO TERCILE OF RISK FACTOR

Greenspan et al

Table 2.

T68

QST and Autonomic Risk Factors for First-Onset TMD

The Journal of Pain

Standardized Hazard Ratios for Rate of First-Onset TMD From Univariate and Multivariable Models of QST Principal Components: OPPERA Prospective Cohort Study, 2006–2011

Table 3.

MULTIVARIABLEy

UNIVARIATE* PRINCIPAL COMPONENT Heat pain ratings Heat pain aftersensations and tolerance Mechanical cutaneous pain sensitivity Pressure pain thresholds (reverse coded) Heat pain temporal summation

N

PEOPLE

2,737 2,737 2,737 2,737 2,737

HR (95% CL)

P VALUEz

HR (95% CL)

P VALUEz

1.09 (.95, 1.24) 1.06 (.95, 1.18) 1.06 (.93, 1.20) 1.09 (.95, 1.24) 1.11 (.98, 1.25)

.180 .290 .370 .180 .088

1.12 (.98, 1.28) 1.06 (.95, 1.18) 1.02 (.90, 1.16) 1.14 (1.00, 1.31) 1.06 (.94, 1.19)

.101 .292 .736 .051 .357

Abbreviation: CL, confidence limit. *Hazard ratios (95% confidence limits) from separate Cox models for each individual principal component. Each model adjust for study site (3 dummy variable), age (in years as a continuous variable), gender (categorical, 2 levels), race/ethnicity (categorical, 5 levels), and lifetime U.S. residence (categorical, 2 levels). Models use imputed TMD rates for subjects lost to follow-up or with questionable examiner classification. yHazard ratios (95% confidence limits) from 1 multivariable Cox model of all 5 principal components. The model additionally adjust for study site (3 dummy variable), age (in years as a continuous variable), gender (categorical, 2 levels), race/ethnicity (categorical, 5 levels), and lifetime U.S. residence (categorical, 2 levels). Models use imputed TMD rates for subjects lost to follow-up or with questionable examiner classification. zP value evaluating null hypothesis that the hazard ratio equals 1.

cutaneous pain—overall ratings of 10 stimuli and 30-second aftersensation ratings with the 512-mN probe—were also nominally significant predictors of TMD incidence. The imputed and fully adjusted hazard ratios were 1.13 for overall ratings of 10 stimuli, and 1.16 for aftersensation pain ratings. Other responses to mechanical stimuli had hazard ratio 95% CIs that crossed the value of 1.00, and were thus considered statistically nonsignificant (Table 1). Among the heat pain measures, 9 variables demonstrated significant site- and demographic-adjusted hazard ratios when imputed data were included in the analysis: 1) to 3) ratings of the series of 10 heat stimuli (area under the curve) for all 3 temperatures; 4) to 6) maximum pain rating among the 10 stimuli for all 3 temperatures; 7) aftersensation pain rating 15 seconds following a train of 48 C stimuli; 8) heat pain temporal summation at 50 C, based on the change of ratings between the first pulse and the highest rated pulse in the series; and 9) heat pain temporal summation at 50 C, based on the slope derived from the ratings of the first 3 heat pulses (Table 2). Hazard ratios for these variables ranged from 1.12 (95% CI = 1.00–1.26) to 1.24 (95% CI = 1.06–1.44), signifying relative increases of 12 to 24% in TMD incidence associated with an increase of 1 SD of the respective measure (Table 2). Components derived from a PCA analysis of QST variables7 were also evaluated for associations with TMD incidence (Table 3). When assessed individually, none of the 5 factors produced significant hazard ratios. Multivariable analysis of these components revealed a marginally significant hazard ratio of 1.14 (95% CI = 1.00–1.31) for the component related to PPTs on TMD incidence. Hazard ratios for QST PCA components did not vary significantly among demographic groups (Supplementary Table 1). However, when pairs of QST components were evaluated for potential interactions, the effect of heat pain temporal summation varied significantly across 3 levels of heat pain ratings (Supplementary Table 2). Specifically, the hazard ratio for the heat pain temporal summation principal component was 1.54 (95% CI = 1.21–1.94) in

participants with the lowest tercile of the heat pain ratings principal component. In other terciles, the hazard ratio rounded to the null value of 1.0. The nature of the interaction is depicted in Fig 1 using rates from an equivalent Poisson regression model. The highest incidence rate was seen among subjects in the lowest tercile of the heat pain rating

Figure 1. Imputed TMD incidence rates and hazard ratios for heat pain temporal summation stratified to tercile of heat pain ratings. Incidence rates of first-onset TMD were computed using multivariable Poisson regression models with multiple imputation to account for subjects who were not examined as intended. Covariates were study site (categorical variable, 4 levels), age (in years, with rates estimated for 3 selected age groups: 20, 30, and 40 years), race/ethnicity (5 categories: rates not shown for ‘‘other/unknown’’ race category), and lifetime U.S. residence (2 categories), heat pain rating principal component, heat pain temporal summation principal component, and the interaction of the 2 principal components. Hazard ratios (HR) represent effect of 1 SD increase in the degree of temporal summation on TMD incidence. They were calculated using multivariable Cox regression models with the same covariates described for the Poisson models.

Greenspan et al component who had above average (11 SD) temporal summation component scores. In comparison, the rate was less than half among subjects within that same tercile who had below average (–1 SD) temporal summation component scores. In other terciles of the heat pain ratings component, temporal summation component scores were not related to incidence rates (Fig 1). This same pattern was observed in stratified analysis of individual thermal measures. Specifically, when analysis was restricted to those subjects who gave a rating between 0 and 20 for the first heat pain stimulus in a series, there were statistically significant hazard ratios for heat pain temporal summation for both 48 C and 50 C stimuli. In contrast, when the first heat pain rating was above 20, hazard ratios based on temporal summation values were not statistically significant (Table 4). The random forest model provided numeric and graphic results indicating the association of any single QST variable with TMD incidence, independent of the influence of other QST and demographic variables. In general, the QST variables showed weak or no association with TMD incidence, similar to the results of the univariate analysis described above. However, the graphic results provided for an evaluation of the extent of linear and nonlinear effects, the latter of which was not obtainable from the previous analyses. Within the mechanical domains, PPT for the temporomandibular joint and masseter muscle sites (Figs 2A and 2B) showed a similar pattern, in that below a certain threshold level, PPTs had a monotonic relationship to TMD incidence (ie, decreasing thresholds becoming more likely to indicate incident TMD cases), whereas above that level, there was relatively low incidence, which did not vary according to PPT value. The PPT for the epicondyle site (Fig 2C) showed a weak trend of greater TMD incidence with lower thresholds, but the confidence error range indicates a nonsignificant effect overall. The univariate analysis for epicondyle PPT also failed to show a significant hazard ratio (Table 1). Considering the mechanical cutaneous pain measures, the threshold showed a similar profile as epicondyle PPT (Fig 2D) and was also nonsignificant in the univariate analysis (Table 1). Both the mechanical pain temporal summation (Fig 2E) and aftersensation (Fig 2F) associated with the most intense cutaneous stimulus demonstrated a largely linear relationship with TMD incidence. Among the 3 heat pain measures depicted, each shows a different pattern, including a largely linear function (Fig 2H), and 2 different nonlinear patterns (Figs 2G and 2I).

Autonomic Results Among the autonomic measures, only electrocardiograph-derived heart rate, which was determined on a beat-by-beat basis during the 20-minute rest period, showed a nominally significant hazard ratio (hazard ratio = 1.01, P = .023; Supplementary Table 3), with higher resting heart rate associated with a greater incidence of TMD. When considering analysis of hazard ratios based on site- and demographically adjusted data, and including imputed estimates, none were statis-

The Journal of Pain

T69

tically significant for autonomic responses to orthostatic challenge (Supplementary Table 3) or for the Stroop procedures (Supplementary Table 4). Three measures within the Stroop color-word protocol had marginal statistical significance based on unimputed data analysis (Supplementary Table 4). These results were considered spurious, given both the marginal level of significance and the lack of any apparent link among these 3 measures. Components derived from a PCA analysis of autonomic variables18 were also evaluated in terms of hazard ratios for TMD onset (Supplementary Table 5). None of the 5 components produced a significant hazard ratio, based on either univariate or multivariable analyses. Finally, an analysis of the influence of demographic factors (age, gender, and race/ethnicity) failed to show any significant effects upon hazard ratios for any of the autonomic PCA components (Supplementary Table 6).

Discussion In this prospective study of first-onset TMD, the following key findings emerged: 1. Some, but not all, measures of pain sensitivity assessed at study entry were associated with increased risk of developing TMD. The measures associated with increased TMD incidence were 1) lower PPTs measured at cranial sites, 2) greater pain ratings for a series of either mechanical or heat stimuli, 3) greater maximum pain ratings of heat stimuli, 4) greater temporal summation of heat pain at 50 C, and 5) greater aftersensation ratings associated with either mechanical cutaneous or heat pain stimuli. 2. When individual QST measures were subjected to PCA, none of the 5 derived components was a significant univariate predictor of TMD incidence, either for the overall cohort or in demographic subgroups. However, there was evidence of an interaction between the 2 principal components assessing heat pain responses: greater temporal summation of heat pain was associated with increased incidence of TMD among people with relatively low ratings of single-heat stimuli, but not for people who had moderate or high ratings of single-heat stimuli. 3. Multivariable random forest models suggested that the relationship between QST variables and TMD incidence can be linear or nonlinear, depending on the specific pain sensitivity measure. 4. In contrast to the QST results, measures of autonomic function at study entry were largely not associated with incidence of first-onset TMD.

QST Measures as Predictors of TMD Incidence Several studies,7,16,17,23 including the OPPERA baseline case-control study,5 have reported that people with chronic TMD show enhanced sensitivity to experimental pain, even at nonsymptomatic body sites. However, those studies do not enable assessment of the causal

T70

DISTRIBUTION OF RISK FACTOR AT ENROLLMENT

STIMULUS

UNITS AND POTENTIAL RANGE

Temporal summation: highest minus first rating D Ratings, 0–100 46 C D Ratings, 0–100 48 C D Ratings, 0–100 50 C

N

963 938 870

D Ratings, 0–100 D Ratings, 0–100 D Ratings, 0–100

928 890 786

46 C 48 C 50 C

D Ratings, 0–100 D Ratings, 0–100 D Ratings, 0–100

411 379 266

LOWER

MID

Subjects with a first-pulse rating of 0–19 19.4 (23.9) [5.0; 20.0] 2.3 3.7 26.2 (26.1) [10.0; 30.0] 2.7 2.7 28.5 (26.8) [10.0; 40.0] 2.3 2.2 Subjects with a first-pulse rating of 20–59 27.5 (26.2) [15.0; 40.0] 4.7 3.8 29.3 (25.0) [20.0; 40.0] 3.5 3.7 27.0 (23.7) [15.0; 35.0] 3.1 4.2 Subjects with a first-pulse rating of 60–100 5.6 (24.8) [5.0; 20.0] 11.3 8.4 12.8 (23.0) [9.0; 20.0] 6.2 9.2 11.0 (19.9) [5.0; 15.0] 5.2 3.7

IMPUTED ESTIMATES WITH ADJUSTMENT FOR STUDY SITE 1 DEMOGRAPHICSy

STANDARDIZED HAZARD RATIOS (95% CI) ADJUSTED FOR

UPPER

STUDY SITEz

STUDY SITE 1 DEMOGRAPHICSx

P VALUEk

N

HAZARD RATIO (95% CI)

P VALUEk

3.4 3.6 4.5

1.14 (.93, 1.39) 1.25 (1.03, 1.50) 1.35 (1.11, 1.63)

1.14 (.92, 1.38) 1.23 (1.01, 1.48) 1.33 (1.10, 1.61)

.210 .039 .003

964 964 964

1.18 (.96, 1.42) 1.15 (.94, 1.38) 1.27 (1.06, 1.52)

.100 .170 .009

4.0 4.5 4.1

1.04 (.82, 1.30) 1.15 (.90, 1.47) 1.19 (.91, 1.54)

1.07 (.84, 1.34) 1.16 (.90, 1.47) 1.24 (.94, 1.61)

.580 .230 .120

934 934 934

1.15 (.90, 1.46) 1.13 (.90, 1.40) 1.21 (.96, 1.51)

.240 .300 .096

6.7 7.9 5.8

.83 (.63, 1.08) .91 (.63, 1.30) .94 (.55, 1.58)

.89 (.68, 1.16) .90 (.62, 1.28) .94 (.56, 1.55)

.420 .560 .800

641 641 641

.96 (.72, 1.27) 1.05 (.78, 1.40) 1.00 (.68, 1.44)

.770 .740 .990

*Adjusted rates computed using Poisson regression controlling for study site (categorical variable, 4 levels). Estimated rate is for reference study site (University at Buffalo). yAs for footnote x, with inclusion of imputed TMD rates for subjects lost to follow-up or with questionable examiner classification. zHazard ratios represent relative increase in rate of TMD associated with an increase of 1 SD in risk factor. Computed using Cox proportional hazard regression model controlling for study site (categorical variable, 4 levels). xAs for footnote z, with additional adjustment for age (in years as a continuous variable), gender (categorical, 2 levels), race/ethnicity (categorical, 5 levels), and lifetime U.S. residence (categorical, 2 levels). kP value evaluating null hypothesis that the hazard ratio equals 1.

QST and Autonomic Risk Factors for First-Onset TMD

46 C 48 C 50 C

MEAN (SD) [1ST; 2ND TERCILE]

SITE-ADJUSTED* TMD INCIDENCE RATE (% OF PEOPLE PER ANNUM) ACCORDING TO TERCILE OF RISK FACTOR

The Journal of Pain

Table 4. Univariate Associations of Thermal Pain Temporal Summation and Incidence Rates of First-Onset TMD, Stratified According to FirstPulse Response: OPPERA Prospective Cohort Study, 2006–2011

Greenspan et al

The Journal of Pain

T71

Figure 2. Partial dependence plots for several of the QST measures. The plots depict the estimated TMD incidence rate that would be observed at several values of the variable after averaging over the values of all other variables in the model. TMD incidence rates, expressed as cases per 100 person-years, were generated from random forest models that predicted TMD onset using study site, demographic variables, and QST variables presented in Table 1. Predicted values (C) are plotted together with LOESS-smoothed estimates (- - -) and their 95% CIs (.....). relationship between pain sensitivity and the clinical pain. It has been proposed that a more ‘‘pain sensitive’’ individual is more susceptible to developing a clinical pain condition, considering that a lesser degree of provocation could lead to such a condition.4 Indeed, 1 prospective TMD study showed that greater experimental pain sensitivity was associated with a higher TMD incidence rate.25 Alternatively, development of a chronic pain condition could alter a person’s nociceptive processing system, leading to an amplified representation of nociceptive signals within the central nervous system, frequently referred to as central sensitization.29 The results of this study, along with those from previous studies evaluating chronic TMD cases, suggest that both processes occur. Fourteen of the 39 QST variables evaluated in this study showed statistically significant associations with

TMD incidence. This was true for variables measured at cranial sites that eventually became symptomatic, as well as those measured at other body sites. One may expect that PPTs derived from the cranial sites were most likely to show significant hazard ratios, given that palpation pain from these same sites are part of the diagnostic criteria for TMD. Indeed this was found to be true, although several other pain measures derived from the forearm also provided significant hazard ratios. These observations support the theory that higher premorbid pain sensitivity contributes to risk of developing clinical pain, even when pain sensitivity is assessed beyond the site of clinical pain. Another prospective study evaluating predictors of factory work-related musculoskeletal pain also found that premorbid pressure pain sensitivity was significantly greater for individuals who developed

T72

The Journal of Pain

such pain conditions versus those in the same environment who did not.13 Also in line with the current study, Madeleine et al found such significant differences only for a subset of experimental pain measures.13 Thus, enhanced premorbid pain sensitivity appears to be somewhat selective, rather than global. Yet, based on the current study’s results, the selectivity is not strictly based on anatomic location, stimulus modality, or specific types of pain measures. Those QST variables showing statistically significant associations with TMD incidence showed marginal effect sizes, with hazard ratios ranging from 1.12 to 1.24. Only 4 of these hazard ratios were statistically significant at the P < .01 level, and none were significant at the Bonferroni-adjusted criterion of P < .00128. These results contrast dramatically with the OPPERA baseline casecontrol study of chronic TMD, which used these same pain sensitivity measures.7 For example, nearly all QST measures had highly significant associations, as signified by P values less than .001. For each measure of pain sensitivity, hazard ratios in this prospective cohort study were uniformly lower than corresponding odds ratios in the case-control study, although hazard ratios and odds ratios cannot be directly compared. Nonetheless, these distinct and sizable differences in strength of association of pain sensitivity with chronic TMD versus first-onset TMD support the theory that the induction of chronic pain further enhances a person’s overall pain sensitivity. Furthermore, these findings suggest the possibility that the biological mechanisms contributing to the enhanced pain sensitivity with chronicity may be the same critical processes that contribute, at least in part, to the transition from acute to chronic TMD. None of the QST PCA components individually revealed a statistically significant hazard ratio. This is consistent with the observation that only a minority of QST measures showed statistically significant hazard ratios in univariate analyses, and these significant measures were not concentrated within a single PCA component. However, those mechanical cutaneous and heat pain measures that did produce significant hazard ratios could be characterized as involving ratings of 1) multiply applied stimuli and 2) the more intense stimuli. Unexpectedly, an interaction between 2 PCA components revealed a significant hazard ratio: for those cases in which ratings of heat pain intensity were lower, greater temporal summation of heat pain was predictive of TMD incidence. Specifically, for the lowest tercile of the PCA component related to heat pain sensitivity, the PCA component related to temporal summation produced a hazard ratio of 1.6—the highest seen for any QST measure. This result does not correspond with any previously described model. Enhanced heat pain temporal summation was not generally found for chronic TMD cases in our previous report.7 However, a more detailed analysis of those results demonstrates that heat pain temporal summation can distinguish TMD cases from controls for instances in which initial stimuli are reported as only mildly painful (Rothwell et al, in prep.). These 2 congruent sets of observations show that while all groups demonstrate some degree of heat pain temporal

QST and Autonomic Risk Factors for First-Onset TMD summation, differences between healthy controls and either incipient or chronic TMD cases are only demonstrable when an initial stimulus is rated as weakly painful. In that situation, a large degree of temporal summation is possible. When initial stimuli are rated as more strongly painful, the range of possible values for temporal summation is less, and it is more similar between groups. This may be due to a ceiling effect in heat pain ratings per se (eg, reporting 100 routinely), although analysis of chronic TMD case-control data would suggest otherwise (Rothwell et al, in prep). Instead, it may reflect a greater ability for healthy controls to restrict the degree of temporal summation to a certain range, which is only evident when initial pain intensity is low and a large range of temporal summation is possible. More exploration of this topic is clearly warranted. Another factor that demands consideration is that only some of the incident TMD cases in this study are likely to become chronic TMD cases. Based on other studies,5,20 one would predict that less than half of the incident cases will become chronic TMD cases. If premorbid experimental pain sensitivity is related only to the development of chronic TMD, then this relationship could be obscured in the current analysis, as we cannot differentiate between those incident cases that will become chronic versus those that will resolve and thus only be considered as acute cases. If the analyses were limited to those who develop chronic TMD, one would predict that a closer relationship between the incident hazard ratios and case-control odds ratios would emerge. Furthermore, as noted in a companion paper,24 the annual incidence of 3.5% observed here exceeds the rate reported in most other prospective cohort studies of TMD, although it is similar to the rate found in 1 prospective cohort study of young women.21

Autonomic Measures as Predictors of TMD Incidence One of the autonomic measures produced a marginally significant hazard ratio for TMD incidence: heart rate during rest. This measure also produced a significant odds ratio of 1.3 for chronic TMD in the OPPERA casecontrol study.18 The observation that increased heart rate is associated with increased incidence of TMD is consistent with the hypothesis that augmented sympathetic activity, which results from a central impairment in baroreceptor function and is reflected by an increase in resting heart rate, contributes to the onset and chronicity of musculoskeletal pain conditions such as TMD and fibromyalgia.14 Furthermore, a systematic increase in the incidence rate was observed from the lowest to the highest tercile of resting heart rate. Thus, it may be that individuals in the highest tercile are more likely to develop chronic TMD compared to individuals in the lower 2 tertiles.11,27 It should also be appreciated that the current level of analyses does not fully account for the potential importance of autonomic variables in predicting the onset of TMD. More sophisticated

Greenspan et al multivariable analyses that incorporate variables across several domains (eg, clinical, psychosocial, and genetic) are needed prior to concluding that measures of autonomic function fail to predict and to contribute to the incidence of TMD.

Conclusions This study’s findings should be interpreted in light of a few limitations. First, multiple analyses were performed on a large number of variables, yet we presented and interpreted the results individually, making no formal adjustments for multiple testing. Accordingly, hazard ratio statistics are presented in as full a fashion as possible, for both univariate and multivariable analyses, so readers can review and interpret these results. Second, although we examined some of the potential interactions among factors (eg, demographic measures and PCA components), many other possible interactions were not tested (eg, among individual QST measures) because of the exceedingly large number of statistical tests this would require. Third, the current analyses are restricted to the QST and autonomic data and do not examine potentially important associations and interactions between these 2 sets of data, or among other phenotypic measures. However, cross-domain analyses presented elsewhere in this issue explore TMD incidence risk in a fuller context.24 Finally, the current analyses examine only pain sensitivity and autonomic risk factors for first-onset TMD, and do not speak to predictors of transition from acute to chronic TMD, which is a

References 1. Adams J: A computer experiment to evaluate regression strategies. Proceedings of the American Statistical Association Section on Statistical Computing, 1991 2. Bair E, Brownstein NC, Ohrbach R, Greenspan JD, Dubner R, Fillingim RB, Maixner W, Smith SB, Diatchenko L, Gonzalez Y, Gordon SM, Lim P-F, RibeiroDasilva M, Dampier D, Knott C, Slade GD: Study protocol, sample characteristics, and loss to follow-up: The OPPERA prospective cohort study. J Pain 14:T2-T19, 2013 3. Breiman L: Random forests. Machine Learn 45:5-32, 2001 4. Diatchenko L, Nackley AG, Slade GD, Fillingim RB, Maixner W: Idiopathic pain disorders—Pathways of vulnerability. Pain 123:226-230, 2006 5. Garofalo JP, Gatchel RJ, Wesley AL, Ellis E III: Predicting chronicity in acute temporomandibular joint disorders using the research diagnostic criteria. J Am Dent Assoc 129: 438-447, 1998 6. Greenland S: Invited commentary: Variable selection versus shrinkage in the control of multiple confounders. Am J Epidemiol 167:523-529, 2008 7. Greenspan JD, Slade GD, Bair E, Dubner R, Fillingim RB, Ohrbach R, Knott C, Mulkey F, Rothwell R, Maixner W: Pain sensitivity risk factors for chronic TMD: Descriptive data and empirically identified domains from the OPPERA case control study. J Pain 12: T61-T74, 2011

The Journal of Pain

T73

question of high clinical and mechanistic significance. Such analyses are planned for future papers from this project. These limitations notwithstanding, the current findings reveal that enhanced pain sensitivity is a risk factor for first-onset TMD, although demonstrable in only a subset of pain measures. In general, QST-derived risk factors predicted TMD incidence similarly across demographic categories. In contrast, cardiovascular autonomic measures provided little evidence as risk factors for TMD incidence. Future analyses will examine associations of these variables with transition from acute to chronic pain, and changes in both pain sensitivity and autonomic function that accompany onset of TMD will also be explored.

Acknowledgments The authors would like to thank the OPPERA research staff at each of 4 study sites and the data coordination center for their invaluable contributions to this work. In addition, we express our gratitude to the participants who have devoted time and effort in support of this research.

Supplementary Data Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/ j.jpain.2013.06.007.

8. Hastie T, Tibshirani R, Friedman J: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York, Springer, 2009 9. Hu YJ, Ku TH, Jan RH, Wang K, Tseng YC, Yang SF: Decision tree-based learning to predict patient controlled analgesia consumption and readjustment. BMC Med Inform Decis Mak 12:131, 2012 10. Ishwaran H, Kogalur UB, Blackstone EH, Lauer MS: Random survival forests. Ann Appl Stat 2:841-860, 2008 11. Light KC, Bragdon EE, Grewen KM, Brownley KA, Girdler SS, Maixner W: Adrenergic dysregulation and pain with and without acute beta-blockade in women with fibromyalgia and temporomandibular disorder. J Pain 10: 542-552, 2009 12. Loader C: Local Regression and Likelihood. New York, Springer, 1999 13. Madeleine P, Lundager B, Voigt M, Arendt-Nielsen L: The effects of neck-shoulder pain development on sensorymotor interactions among female workers in the poultry and fish industries. A prospective study. Int Arch Occup Environ Health 76:39-49, 2003 14. Maixner W: Biopsychosocial and genetic risk factors for temporomandibular joint disorders and related conditions, in Graven-Nielsen T, Arendt-Nielsen L, Mense S (eds): Fundamentals of Musculoskeletal Pain. Seattle, WA, IASP Press, 2008, pp 263-279 15. Maixner W: Temporomandibular joint disorders, in Mayer EA, Bushnell MC (eds): Functional Pain Syndromes:

T74

The Journal of Pain

Presentation and Pathophysiology. Seattle, IASP Press, 2009, pp 55-70 16. Maixner W, Fillingim R, Booker D, Sigurdsson A: Sensitivity of patients with painful temporomandibular disorders to experimentally evoked pain. Pain 63:341-351, 1995 17. Maixner W, Fillingim R, Sigurdsson A, Kincaid S, Silva S: Sensitivity of patients with painful temporomandibular disorders to experimentally evoked pain: evidence for altered temporal summation of pain. Pain 76:71-81, 1998 18. Maixner W, Greenspan JD, Dubner R, Bair E, Mulkey F, Miller V, Knott C, Slade GD, Ohrbach R, Diatchenko L, Fillingim RB: Potential autonomic risk factors for chronic TMD: descriptive data and empirically identified domains from the OPPERA case-control study. J Pain 12:T75-T91, 2011 19. Ohrbach R, Bair E, Fillingim RB, Gonzalez Y, Gordon SM, Lim P-F, Ribeiro-Dasilva M, Diatchenko L, Dubner R, Greenspan JD, Knott C, Maixner M, Smith SB, Slade GD: Clinical orofacial characteristics associated with risk of firstonset TMD: The OPPERA prospective cohort study. J Pain 14:T33-T50, 2013 20. Ohrbach R, Dworkin SF: Five-year outcomes in TMD: relationship of changes in pain to changes in physical and psychological variables. Pain 74:315-326, 1998 21. Plesh O, Gansky SA, Curtis DA: Chronic pain in a biracial cohort of young women. Open Pain J 5:24-31, 2012 22. Riddle DL, Kong X, Jiranek WA: Two-year incidence and predictors of future knee arthroplasty in persons with symptomatic knee osteoarthritis: preliminary analysis of longitudinal data from the osteoarthritis initiative. Knee 16: 494-500, 2009

QST and Autonomic Risk Factors for First-Onset TMD 23. Sarlani E, Grace EG, Reynolds MA, Greenspan JD: Evidence for up-regulated central nociceptive processing in patients with masticatory myofascial pain. J Orofac Pain 18: 41-55, 2004 24. Slade GD, Bair E, Greenspan JD, Dubner R, Fillingim RB, Diatchenko L, Maixner W, Knott C, Ohrbach R: Signs and symptoms of first-onset TMD and sociodemographic predictors of its development: The OPPERA prospective cohort study. J Pain 14:T20-T32.e3, 2013 25. Slade GD, Diatchenko L, Bhalang K, Sigurdsson A, Fillingim RB, Belfer I, Max MB, Goldman D, Maixner W: Influence of psychological factors on risk of temporomandibular disorders. J Dent Res 86:1120-1125, 2007 26. Svensson P, List T, Hector G: Analysis of stimulus-evoked pain in patients with myofascial temporomandibular pain disorders. Pain 92:399-409, 2001 27. Tchivileva IE, Lim PF, Smith SB, Slade GD, Diatchenko L, McLean SA, Maixner W: Effect of catechol-O-methyltransferase polymorphism on response to propranolol therapy in chronic musculoskeletal pain: a randomized, double-blind, placebo-controlled, crossover pilot study. Pharmacogenet Genomics 20:239-248, 2010 28. Wolfe F, Clauw DJ, Fitzcharles MA, Goldenberg DL, Katz RS, Mease P, Russell AS, Russell IJ, Winfield JB, Yunus MB: The American College of Rheumatology preliminary diagnostic criteria for fibromyalgia and measurement of symptom severity. Arthritis Care Res (Hoboken) 62: 600-610, 2010 29. Woolf CJ: Central sensitization: implications for the diagnosis and treatment of pain. Pain 152:S2-S15, 2011

Greenspan et al

T74.e1

Supplementary Appendix

Mechanical Cutaneous (Pinprick) Pain

The Orofacial Pain: Prospective Evaluation and Risk Assessment (OPPERA) study is a prospective cohort study designed to investigate the etiology of first-onset TMD. Institutional review boards at each study site approved study procedures, and participants provided signed, informed consent. Full details of enrollment, follow-up, and statistical analyses are provided elsewhere in this issue.1 Aspects of specific importance to this paper are elaborated below.

Pricking pain sensitivity was assessed using a set of weighted probes, manufactured locally, matching those described by the German Neuropathic Pain Network.11 Stimuli were applied to the dorsum of the digits 2 to 4. Measures included pain threshold, ratings of pain intensity in response to the 2 largest stimulus intensities (256 mN and 512 mN), temporal summation of pain with these same probes, and ratings of aftersensations following the temporal summation protocol. Threshold was determined using a standard single staircase protocol. After threshold determination, suprathreshold cutaneous mechanical pain sensitivity was assessed using a protocol similar to that of the German Neuropathic Pain Network.11 Participants judged the pain intensity evoked by suprathreshold stimuli, verbally reporting a number between 0 and 100, without a visual reference. Participants were instructed that 0 represented no pain, whereas 100 represented the most intense pain imaginable. Participants reported pain intensity after a single stimulus (applied for approximately .5 seconds), and then again after a series of 10 stimuli was applied at 1-second intervals. For the series of 10 stimuli, participants were asked to report an overall pain intensity for the series of stimuli. At 15 and 30 seconds after the series of 10 stimuli was administered, participants were asked to rate the pain intensity of any residual sensation at the stimulated finger. In the course of this testing, if the participant reported 100, the protocol was halted. The participant was then offered the option of continuing with the next series, or omitting the rest of this set of tests. The participant was also instructed that she/he could stop testing at any time by telling the examiner to stop.

Recruitment, Eligibility Criteria, and Enrollment Between May 2006 and November 2008, potential study participants were recruited using advertisements, e-mails, and flyers at 4 U.S. study sites: Baltimore, MD; Buffalo, NY; Chapel Hill, NC; and Gainesville, FL. Eligibility criteria were age 18 to 44 years, good health, no history of facial injury or surgery, no significant symptoms of TMD pain, no previous diagnosis of TMD, and an absence of TMD myalgia and TMD arthralgia on clinical examination. On enrollment, participants completed a telephone interview and self-administered questionnaires assessing hypothesized risk factors for TMD. During a 3-hour clinical visit, autonomic function was monitored and quantitative sensory tests measured sensitivity to painful stimuli. Study examiners recorded clinical characteristics of muscles and joints of the head, neck, and body and they verified absence of TMD.

Baseline Measurements of Pain Sensitivity: Quantitative Sensory Testing QST was conducted in 3 sensory domains, in the following order: pressure pain, mechanical cutaneous (pricking) pain, and heat pain. These protocols are the same as those used previously.6

Pressure Pain PPTs were assessed using a commercially available pres€ rby, Sweden). Five body sites sure algometer (Somedic; Ho were tested, bilaterally, in the following order: 1) the center of the temporalis muscle, 2) the center of the masseter muscle, 3) overlying the temporomandibular joint, 4) the center of the trapezius muscle, and 5) overlying the lateral epicondyle. The protocol involved manual application of the algometer, with which the examiner would increase pressure at a steady rate (30 kPa/s) until the participant indicated first pain sensation by pressing a button. If no response was given at the point the stimulus reached 600 kPa, a value of 600 was used as the threshold value. The examiners from each of the 4 study sites participated in a reliability exercise at an early stage of this project. PPTs were assessed on a group of 16 subjects (not part of the main study), each of whom was tested by 2 examiners at each of the 5 body sites. The overall intraclass correlation was .91, ranging from .87 to .94. We considered this an acceptable level of interexaminer reliability.

Heat Pain Heat pain sensitivity was assessed using a commercially available thermal stimulator (Pathway; Medoc, Ramat Yishai, Israel). Stimuli were applied on the ventral forearm. Heat pain measures included threshold, tolerance, ratings of single suprathreshold stimuli, temporal summation of heat pain, and aftersensations following the temporal summation protocol. Heat pain threshold was determined using a protocol similar to that for PPT. The MEDOC ATS thermode (2.56 cm2) was manually placed in contact with the skin on the right ventral forearm at a temperature of 32 C. After a few seconds, the temperature increased at a rate of .5 C/s until the participant pushed a button indicating she/he just then felt a pain sensation. The temperature of the thermode at the time of the button press was recorded as a threshold estimate. This was repeated 4 times, moving the thermode to a new site on the forearm each time. Following this, pain tolerance was estimated using the same protocol. The sole difference was that the participant was instructed to press the button when she/he could no longer tolerate the pain. This was repeated 4 times, moving the thermode for each trial. For both threshold and tolerance testing, a ceiling temperature was set at 52 C, which was entered as the threshold or tolerance estimate if the participant failed to press the button on a given trial.

T74.e2 Following heat pain tolerance testing, participants judged the pain intensity evoked by suprathreshold heat stimuli applied to the left ventral forearm, verbally reporting a number between 0 and 100. As with pricking pain ratings, participants were instructed that 0 represented no pain, whereas 100 represented the most intense pain imaginable. Participants were told that they would receive 10 heat stimuli in a row and would be verbally cued to report their peak pain intensity after each stimulus. Practice trials were administered to provide the participant a sense of the timing of stimulus delivery, and to verify understanding of the protocol. The MEDOC CHEPS thermode (5.73 cm2) was manually placed on the skin at a temperature of 38 C, and then a series of 10 temperature pulses was given at 2.4- to 2.5-second interstimulus intervals. For the first series of stimuli, the peak temperature was 46 C, with a ramp rate of 20 C/s and a hold time of 750 ms at the peak temperature. The participant was cued to report when the temperature just started to decline after reaching the peak. At intervals of 15 and 30 seconds after the 10th thermal pulse, the participant was asked to rate the pain intensity of any lingering sensation using the same 0 to 100 scale. Following this, the thermode was moved to another location on the forearm, and the same protocol was conducted with a peak temperature of 48 C. This was followed by another test series with a 50 C peak temperature at a third location. In the course of the testing, if the participant reported 100, the thermode was removed and that series of heat stimuli was halted. The participant was then offered the option of continuing with the next series, or omitting the rest of this set of tests. The participant was also instructed that she/he could stop testing at any time by telling the examiner to stop. An inadvertent protocol variation at one of the study sites—using a different baseline temperature (38 C) for heat pain threshold and tolerance protocols—led to separate evaluation of these data. Comparison of threshold data from the one site versus the others revealed a clear difference in the distributions, indicating that the values from the one site should not be combined with the other sites’ data, and they were thus excluded from analysis. Comparison of tolerance data showed little systematic difference among the sites, indicating that these data could be combined without a bias imposed by the procedural variation.

Missing Data Imputation for QST Data Imputation for missing values used an expectationmaximization method (as described in the Methods paper by Bair et al1) that finds maximum likelihood estimates using a parametric model for incomplete data. For most summary measures, the criteria for imputation on QST items were based on having at least 50% of the data for any type of measure for a given participant. Specifically, data were imputed for individuals with at least 8 valid values among the 16 measurements from the 256mN and 512-mN probes on single stimulus and series of 10 stimulus measurements, whereas data are not

QST and Autonomic Risk Factors for First-Onset TMD imputed for individuals with fewer than 8 valid values. Similarly, individuals with at least 8 of the 16 aftersensation measurements at 15 and 30 seconds for 4 trials each were included in imputation for poststimulus ratings. For PPT data, imputation was conducted for individuals with at least 10 of the 20 measures on the 5 body —temporalis, masseter, temporomandibular joint, trapezius, and lateral epicondyle—each measured twice bilaterally. For thermal measures, data were imputed for individuals with at least 4 of the combined 8 threshold and tolerance ratings from temperatures 1 to 4. Similarly, imputation was done for individuals with at least 3 of the 6 aftersensation measurements at both 15 and 30 seconds from temperatures 46, 48, and 50 C. An exception to this 50% criterion was made for thermal ratings, whereby data were imputed for individuals with at least 1 of the 30 measurements from the thermal trains at 46, 48, and 50 C. The rationale for this criterion stems from the data collection protocol, which required a halt in the procedure when participants gave a rating of 100, with the option to continue at subsequent temperatures. Thus, an individual with only 1 rating at 46 C is informative, and exclusion would certainly bias results.

Baseline Measurement of Autonomic Function Autonomic Profiling Protocols During the 3-hour baseline clinic visit, autonomic profiles were measured for all participants in 5 periods: 1) a 20-minute rest period that followed the clinical examination; 2) a 5-minute orthostatic challenge period that followed Period 1; 3) a 10-minute rest period that followed assessments of thermal and mechanical pain sensitivity; 4) a period of approximately 5 minutes following Period 3, during which the traditional Stroop color-word protocol was administered; and 5) a period of approximately 5 minutes following Period 4, during which the Stroop pain-affect test was administered. At the beginning of the first period, a blood pressure (BP) cuff was placed on the upper left arm and was inflated with a Datascope Accutorr Plus blood pressure monitor (Datascope Inc, Mahwah, NJ). Heart rate variability (HRV) was assessed with a 3-lead BioCom model 3000 Heart Rhythm Scanner (Biocom Technologies, Poulsbo, WA) with the electrical leads positioned at the following locations: left 2nd rib (ground), right 2nd rib, and lower left torso. Electrocardiograph signals were checked for impedance, collected, filtered for artifacts, and displayed in real-time on a laboratory computer. Participants could not see the instruments’ displays. Participants were placed in a zero-gravity exam chair and fully reclined (supine position) in a room with soft lighting. Once the participant was reclined, he/she was instructed to rest quietly but try to stay awake. During this rest period, blood pressure (BP) (systolic [SBP] and diastolic [DBP]—mean [MAP] pressures) and heart rate measures were assessed every 2.5 minutes for 20 minutes. At the beginning of the second period, participants were asked to stand up as quickly as possible, as the examiner

Greenspan et al helped to return the chair to its upright position. BP and heart rate were recorded within 30 seconds of standing, and BP and heart rate measures were obtained at 1minute intervals for the next 5 minutes. HRV was assessed over the duration of the 5-minute orthostatic period. For Periods 3, 4, and 5, participants were positioned in a chair facing a computer with 4 buttons colored yellow, green, blue, and red. Following the 10-minute rest period (Period 3) and a training period with the computer, participants were exposed to a 5-minute computerized version of the Stroop color-word protocol13 (Period 4). Participants were told, ‘‘During this procedure, the words YELLOW, GREEN, BLUE, or RED appear on the screen in the colors yellow, green, blue or red in a randomly assigned manner.’’ The participants were instructed to press the button corresponding to the color of the word displayed on the screen (eg, if the word RED appeared in yellow the correct response was to depress the yellow key). During Period 5, the Stroop pain-affect protocol was administered.10 Words with different psychological valances (ie, neutral words [garden, armchair] vs words associated with pain and affect [throbbing, miserable]) appeared in the colors yellow, green, blue, or red and the participant’s task was to press the button corresponding to the color of the word displayed on the screen (eg, if the word BLOOD appeared in green the correct response was to depress the green key). The sequence of word presentation and tabulation of results was done using DirectRT hardware and software (Empirisoft, New York, NY). Throughout Periods 3, 4, and 5, BP and heart rate values were assessed at 1minute intervals. HRV was recorded continuously during each of the 5-minute Stroop procedures. The autonomic measures were averaged over the 5-minute color and the 5-minute pain-affect Stroop epoch.

Derived Autonomic Measures Measures of Arterial Blood Pressure and Heart Rate. BP values were recorded in millimeters of mercury (mmHg) and MAP was derived using the following formula: MAP = (SBP 1 2*DBP)/3. Heart rate was measured coincidence with measures of arterial BP in beats-perminute (bpm). Measures and Analyses of HRV. For reviews on the measurement and interpretation of HRV variables, see the recommendations provided by the Task Force of the European Society of Cardiology and the North American Society of Pacing and Electrophysiology14 and by Berntson et al.2 Time Domain Measures: Heart rate was measured as interbeat intervals and a reciprocal value was calculated to derive heart rate values as bpm. SDNN (standard deviation of normal-to-normal [N-N] intervals) is the SD of cardiac cycle interbeat intervals, measured in milliseconds, and it reflects all cyclic components of the variability in the recorded series of interbeat intervals. RMSSD (root mean square of the differences between successive N-N intervals) is the square root of the mean squared differences of successive interbeat intervals, measured in milliseconds, and is an estimate of high-frequency variations

T74.e3 in heart rate in short-term recordings that reflects an estimate of parasympathetic regulation of the heart. Frequency Domain Measures: Total power (TP) is a short-term estimate of the total power of power spectral density in the range of frequencies between 0 and .4 Hz. This measure reflects overall (cardiosympathetic and cardioparasympathetic) autonomic activity. Very low frequency (VLF) is a band in the power spectrum ranging between .0033 and .04 Hz. This measure is not well defined in terms of physiological mechanisms; however, activity in this frequency band has been associated with regulation of the renin-angiotensin system and used as an indicator of activity of slow temporal processes regulated by the sympathetic nervous system. Low frequency (LF) is a band of the power spectrum that ranges between .04 and .15 Hz and reflects both sympathetic and parasympathetic activity. It is a strong indicator of sympathetic activity in long-term recordings. Parasympathetic influence is represented in the LF band when the respiration rate is lower than 7 breaths per minute or during a Valsalva procedure. Thus, when a participant is in the state of relaxation with a slow and even breathing, LF values are indicative of increased parasympathetic activity rather than increased sympathetic regulation. Activity in the LF band has also been related to baroreflex sensitivity with higher values associated with greater baroreflex sensitivity. Greater baroreceptor sensitivity produces greater reflex changes in heart rate for a given change in MAP and is associated with enhanced baroreflex cardioparasympathetic responses to a given change in mean arterial pressure. High frequency (HF) is a band of the power spectrum that ranges between .15 and .4 Hz and reflects parasympathetic activity. HF is also known as a respiratory band because it corresponds to the interbeat interval variations influenced by respiratory sinus rhythm. Slower and even breathing causes an increase in cardioparasympathetic activity and the amplitude of the HF peak in the power spectrum. Frequency domain measures are calculated in milliseconds squared (ms2).

Quality Control for Autonomic Data HRV Data. For time domain parameters: Mean heart rate values were excluded if not between 40 and 200 bpm, mean respiratory rate had to be within 400 and 2,000 ms, SDNN ratings were between 10 and 500 ms, and finally RMSSD measures were within the range 10 to 1,000 ms. Frequency domain parameters had the following limits: TP 10 to 10,000 ms2, VLF 10 to 6,000 ms2, LF 10 to 6,000 ms2, and HF 10 to 6,000 ms2. Only high-quality electrocardiograph signals were collected and subjected to artifact detection and cleaning using Biocom Heart Rhythm Scanner software (version 2.0). After filtering for out of range time and frequency domain values, additional data quality checks were performed to ensure that the TP value was greater than the sum of the HF, LF, and VLF values. To diminish the effects of skewness associated with frequency domain measures (TP, HF, LF, VLF), a logarithmic transformation was performed on the absolute units of the spectral bands within the data set.

T74.e4 BP and Heart Rate Data. SBP measurements were excluded if outside the range of 60 to 220 mmHg. DBP measures had to be within 40 to 110 mmHg. Heart rate measures during the reclined period had to be between 30 and 175 bpm. During all other periods, orthostatic, Stroop-baseline, color-word, and pain-affect heart rate values had to be between 40 and 200 bpm. For the Rest Period (1), measures derived from the BP cuff (SBP, DBP, MAP, and heart rate) were averaged over the last 3 recordings assessed at minutes 15, 17.5, and 20 during the rest period. For the Orthostatic Period (2), the changes in BP— DSBP, DDBP, DMAP—and the change in heart rate from the values obtained during the Rest Period were determined by subtracting the first value measured in the orthostatic period from the mean values obtained at minutes 15, 17.5, and 20 during the rest period. For the Stroop periods (4 and 5): mean BP measures (SBP, DBP, MAP, and heart rate) were averaged over the 5-minute data collection period at minutes 1, 2, 3, 4, and 5. Changes in BP—DSBP, DDBP, DMAP—and the change in heart rate from the pre-Stroop values (Period 3) were determined by subtracting the average value measured in the pre-Stroop period at minutes 5, 7.5, and 10 from the mean of the values obtained during the Stroop tasks, with values at minutes 1, 2, 3, 4, and 5 to compute the mean value. The ratio of [heart rate] O MAP was computed as a measure of the relative balance of cardiosympathetic versus cardioparasympathetic (vagal) tone at a given baroreflex set point (MAP). Higher values reflect greater cardiosympathetic versus cardioparasympathetic tone, whereas lower values signify greater cardioparasympathetic versus cardiosympathetic tone at a given baroreceptor set point.14

Follow-Up and Case-Classification of First-Onset TMD From the time of enrollment through May 2011, study participants were asked to complete a quarterly questionnaire that screened for TMD pain symptoms experienced during the preceding 3 months. Those who reported TMD pain symptoms were asked to attend a clinical examination that determined presence or absence of painful TMD according to OPPERA’s adaptation of a restricted set RDC/TMD.12 In this adaptation, classification of first-onset TMD required 2 criteria: 1) $5 days/month of pain in TMD locations specified by examiner; and 2) examiner findings of arthralgia, myalgia, or both. Arthralgia was based on pain in temporomandibular joint(s) during jaw maneuver or digital palpation, and myalgia was based on pain during jaw maneuver or digital palpation in $3 of 8 muscle groups, each assessed bilaterally: temporalis, masseter, lateral pterygoid, submandibular. All examiners underwent annual training and calibration in RDC/TMD. In blinded, replicated examinations, Kappa statistics for interexaminer reliability of TMD case-classification ranged from .84 to 1.0, signifying excellent reliability.

QST and Autonomic Risk Factors for First-Onset TMD

Statistical Analysis The follow-up period for each study participant was computed as the time from enrollment to the first of 3 possible events: 1) examiner-classification of first-onset TMD; 2) loss to follow-up; or 3) the census date used for this analysis (ie, May 2011). The average annual rate of first-onset TMD was calculated as the number of participants with first-onset TMD divided by the sum of follow-up periods, and the result was expressed as the percent of people per annum (equivalent to the number of incident cases per 100 years of follow-up). For descriptive purposes, an adjusted average annual incidence was computed using a Poisson regression model that adjusted for study site using the Buffalo study site as the referent. To test hypotheses about associations between baseline risk factors and TMD incidence, hazard ratios were computed using Cox proportional hazard regression. Hazard ratios represent the relative difference in hazard rates between 2 groups. Although the hazard rate is a theoretical construct, representing the instantaneous probability of an event as the duration of follow-up approaches zero, it is a good approximation of the average incidence rate in a cohort study. Furthermore, the Cox proportional hazards models used to estimate hazard ratios require fewer statistical assumptions than other modeling methods. Hereafter, we use the term ‘‘incidence’’ when referring to the annual incidence rate and the hazard rate. For the Cox models, incident cases were regarded as an event; otherwise people were censored. Each person’s follow-up period was used as the time-to-event. When the baseline risk factor was categorical, 1 category was nominated as the referent and dummy variables represented each of the other categories. For continuous variables, scores were transformed to unit-normal deviates (mean = 0, SD = 1). When calculating univariate hazard ratios and 95% CIs models were initially adjusted only for study site, and subsequently additionally adjusted for demographic characteristics, that is, age in years, gender (male, female), race/ethnicity (white, African American/Black, Hispanic, Asian, and other/unstated), and lifetime U.S. residence (no, and yes/unstated). Because of the numerous baseline measures of potential risk factors, we used 2 strategies for multivariable analysis. The first strategy used results from previously-reported PCA to generate 5 QST factor scores, and 5 autonomic measure factor scores (separately), which were analyzed by conventional methods using Cox models. All Cox models adjusted for study site and the 4 demographic characteristics described above. Potential nonlinear effects of individual factor scores were investigated by creating 3 categories, based on terciles of the factor scores, which were evaluated using 2 dummy variables in Cox models. Then, all continuous factor scores were evaluated simultaneously as first-order effects in a Cox model. Finally, all possible pairwise interactions between continuous factor scores were evaluated, 1 pair at a time, in separate Cox models (separately for QST and autonomic factors). When 1 or more interactions was statistically significant at P <

Greenspan et al .005, the magnitude of the interaction was quantified by calculating hazard ratios for 1 of the factor scores at 3 levels of the other factor score (–1, 0 and 11), and the results were plotted. This threshold for statistical significance represented the Bonferroni correction for 10 possible pairs of interactions. The second strategy used a set of decision trees to create a random forest model. A decision tree predicts an outcome by recursively partitioning the set of predictor variables producing results that can be visualized as a tree diagram.4 Compared to conventional Cox models, decision trees can readily detect nonlinear effects and interactions, and they are robust against missing values. However, decision trees typically have high variance, resulting in inaccurate predictions.7 Random forests attempt to capture the advantages of trees while reducing their variance by averaging over a series of decision trees. Each decision tree is fit using a subset of the data. The final predictions are obtained by averaging over 1,000 decision trees. Only a subset of the predictors is used in each tree to reduce the correlation between the trees.3,7 For a description of how random forests can be applied to censored survival data, see Ishwaran et al.8 The first objective in fitting the random forest model was to identify the most important variables for predicting first-onset TMD. This was assessed by calculating the importance score representing the decrease in the predictive accuracy of the model when the variable is measured incorrectly. By convention, the most important variable is scaled to have an importance score of 100, and all other importance scores have lower values that could range to a negative value if the variable worsened prediction. The second objective for fitting the random forest model was to estimate the association between each variable and first-onset TMD after adjusting for the effects of all other variables. This was assessed by estimating the expected number of cases of first-onset TMD that would be observed at several values of the variable after averaging over the values of all other variables in the model, and the results were plotted. Partial dependence plots were estimated at 25 values of each predictor variable and a LOESS smoother (and associated 95% CI) was calculated to help visualize the association.9 The random forest models were fitted using the ‘‘randomSurvivalForest’’ package in R, version 2.13.1.8

Evaluation of Bias Of the 3,263 study participants enrolled, 526 returned no quarterly screening questionnaire and were therefore lost to follow-up. Loss was greater among males, African Americans, and at the Baltimore study site. (See details in Bair et al1 in this issue.) Also, baseline levels of stress, anxiety, and negative affect were

T74.e5 greater for people lost to follow-up than for people retained in the cohort. Another form of loss to follow-up occurred for 243 people who reported TMD symptoms during quarterly screening questionnaires but who did not receive a follow-up examination. Compared to the 474 symptomatic people who were examined, the nonexamined people were more likely to be from the Baltimore study site and to report no concurrent headaches and no bodily pain symptoms. (See details in Bair et al1 in this issue.) To account for potential bias created by nonexamination of these 243 symptomatic people, case classifications were imputed using a method described below. Quality assessment of data from reexamination of symptomatic people found a higher-than-expected incidence of TMD classification for 1 examiner, who conducted 75 examinations. The higher incidence was not explained fully by characteristics of the study participants or their pattern of pain symptoms. To account for potential examiner bias, the examiner’s findings were discarded, and case classifications were instead imputed. Examination findings for those 75 examinees and for the 243 symptomatic people who were not reexamined were imputed used 3 steps. First, an algorithm predicting probability of TMD was created using binary logistic regression analysis of quarterly screening questionnaire data among people who were reexamined. The algorithm was then used to generate 100 imputed, binary case classifications for each person who was not examined or whose examination findings were discarded. Finally, the imputed case classifications and 100 replicates of observed case classifications were analyzed in a Cox regression model using multiple imputation to estimate an average hazard ratio and corresponding 95% CI. The imputed analysis adjusted for OPPERA study and demographic characteristics described above. When fitting the random forest models, the 318 people used for imputation were given missing censoring indicators and imputed using adaptive tree imputation. (See Ishwaran et al8 for details.)

Sample Size Considerations OPPERA was designed with a target sample size of 3,200 enrolled study participants expected to yield 196 cases of first-onset TMD during a 3-year follow-up period, assuming 30% loss to follow-up. These targets were based on incidence and cohort retention rates observed in a previous study conducted at the North Carolina study site5 and were sufficient to provide statistical power of 80% to risk ratios of at least 1.8 for risk predictors with as few as 15% of people in the high-risk category, consistent with the magnitude of effect seen for genetic predictors seen in the previous North Carolina study.

T74.e6

References 1. Bair E, Brownstein NC, Ohrbach R, Greenspan JD, Dubner R, Fillingim RB, Maixner W, Smith SB, Diatchenko L, Gonzalez Y, Gordon SM, Lim P-F, RibeiroDasilva M, Dampier D, Knott C, Slade GD: Study protocol, sample characteristics, and loss to follow-up: The OPPERA prospective cohort study. J Pain 14:T2-T19, 2013 2. Berntson GG, Bigger JT Jr, Eckberg DL, Grossman P, Kaufmann PG, Malik M, Nagaraja HN, Porges SW, Saul JP, Stone PH, van der Molen MW: Heart rate variability: origins, methods, and interpretive caveats. Psychophysiology 34: 623-648, 1997 3. Breiman L: Random forests. Machine Learn 45:5-32, 2001 4. Breiman L, Friedman J, Olshen R, Stone C: Classification and Regression Trees. New York, Wadsworth, 1984 5. Diatchenko L, Slade GD, Nackley AG, Bhalang K, Sigurdsson A, Belfer I, Goldman D, Xu K, Shabalina SA, Shagin D, Max MB, Makarov SS, Maixner W: Genetic basis for individual variations in pain perception and the development of a chronic pain condition. Hum Mol Genet 14:135-143, 2005 6. Greenspan JD, Slade GD, Bair E, Dubner R, Fillingim RB, Ohrbach R, Knott C, Mulkey F, Rothwell R, Maixner W: Pain sensitivity risk factors for chronic TMD: descriptive data and empirically identified domains from the OPPERA case control study. J Pain 12:T61-T74, 2011 7. Hastie T, Tibshirani R, Friedman J: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York, Springer, 2009

QST and Autonomic Risk Factors for First-Onset TMD 8. Ishwaran H, Kogalur UB, Blackstone EH, Lauer MS: Random survival forests. Ann Appl Stat 2:841-860, 2008 9. Loader C: Local Regression and Likelihood. New York, Springer, 1999 10. Pincus T, Fraser L, Pearce S: Do chronic pain patients ‘‘Stroop’’ on pain stimuli? Br J Clin Psychol 37(Pt 1):49-58, 1998 11. Rolke R, Baron R, Maier C, Tolle TR, Treede RD, Beyer A, Binder A, Birbaumer N, Birklein F, Botefur IC: Quantitative sensory testing in the German Research Network on Neuropathic Pain (DFNS): standardized protocol and reference values. Pain 123:231-243, 2006 12. Slade GD, Bair E, By K, Mulkey F, Baraian C, Rothwell R, Reynolds M, Miller V, Gonzalez Y, Gordon S, Ribeiro-Dasilva M, Lim PF, Greenspan JD, Dubner R, Fillingim RB, Diatchenko L, Maixner W, Dampier D, Knott C, Ohrbach R: Study methods, recruitment, sociodemographic findings, and demographic representativeness in the OPPERA study. J Pain 12: T12-T26, 2011 13. Stroop JR: Studies of interference in serial verbal reactions. J Exp Psychol 18:643-662, 1935 14. Heart rate variability. Standards of measurement, physiological interpretation, and clinical use. Task Force of the European Society of Cardiology and the North American Society of Pacing and Electrophysiology. Eur Heart J 17: 354-381, 1996