International Journal of Osteopathic Medicine 9 (2006) 113–119 www.elsevier.com/locate/ijosm
Research report
Algometer reliability in measuring pain pressure threshold over normal spinal muscles to allow quantification of anti-nociceptive treatment effects
Louise Potter a,*, Christopher McCarthy b, Jacqueline Oldham a
a Centre for Rehabilitation Science, University of Manchester, Central Manchester and Manchester Children’s University Hospital’s NHS Trust, Oxford Road, Manchester M13 9WL, UK
b Medical School Building, University of Warwick, Coventry, CV4 7AL, UK
Received 12 March 2006; received in revised form 20 November 2006; accepted 23 November 2006
Abstract
Background: Algometry has been shown to be an effective way of quantifying pressure pain threshold (PPT), although its reliability in assessing spinal muscle pain (excluding trigger points) has not been robustly analysed.
Objectives: Intra-rater PPT assessment by algometry over the belly of four pairs of spinal muscles (iliocostalis, multifidus, gluteus maximus and trapezius) in a healthy sample was analysed.
Methods: Healthy subjects had their PPT measured twice (within 5 min) on three occasions (separated by a week). Intra-class correlation coefficients and the smallest detectable difference were calculated to analyse the reliability of the measurements, and 95% limits of agreement plots were drawn to assess systematic difference.
Results: Assessments revealed good within-session reliability (80 assessments) (ICC > 0.91) and good between-session reliability (ICC > 0.87), with a moderate measurement error (approximately 3 kg/cm²) and no systematic difference within-session or between-sessions.
Conclusions: PPT assessment by algometry is a reliable measure of a subject’s pain, both within-session and between-sessions. This study provides further validity to the use of this measure as a suitable, convenient method of monitoring treatment effects.
© 2006 Elsevier Ltd. All rights reserved.
Keywords: Algometry; Pain pressure threshold; Reliability; Treatment effect
* Corresponding author. Tel./fax: +44 0161 276 6672. E-mail address: [email protected] (L. Potter). URL: http://www.medicine.manchester.ac.uk/crs/.
1746-0689/$ - see front matter © 2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.ijosm.2006.11.002
1. Introduction
Back problems are hugely problematic and costly, and as yet there is no clear consensus on treatment approach.1 It is widely accepted that back pain is a multi-factorial, heterogeneous problem and that it has biopsychosocial components,2 making it difficult to assess individual treatment effects. Elements of the physical component of back pain are physiological tissue changes which provoke nociceptive impulses and cause the perception of pain. One aim of manipulative interventions is to attenuate the nociceptive component of back pain;3 thus, establishing a valid and reliable method of assessing treatment-induced changes in nociceptive pain would be advantageous. Nociceptive pain is that evoked by a noxious stimulus and as such can be assessed by an externally applied noxious stimulus, as is used in algometry. Algometry has long been used to measure soft tissue pain associated with trigger points4–6 and has been used to assess the effectiveness of treatments which attempt to alleviate these
specific tender spots.7 The algometer has been shown to be an effective way of quantifying the pressure pain threshold (PPT),6 relating well to other clinical pain measures, and there is evidence to support the reliability of algometry to measure the PPT of trigger points.4 However, there is little evidence for the test-retest reliability of the algometer to record the PPT over muscle not having trigger points.
Reeves’ (1986) early study showed the algometer to be reliable in determining the pain pressure threshold of active trigger points in the head and neck, in a small group of headache patients. The study by Fischer (1987) is much cited as providing further evidence of the reliability of the algometer; however, his study collected only one set of measurements for each subject and the results were said to demonstrate reliability on the basis that there was no significant difference detected in the PPT taken from one side of the body compared to the other. Other studies have assessed reliability of measuring trigger points in shoulder and neck problems.4,8 However, there is little evidence for the reliability of algometry to measure PPT in spinal muscles that are not exhibiting trigger points.
If the algometer is to be used to assess the effect of an intervention it must first be shown to be unequivocally reliable, with the instrument making equivalent recordings on the same subject on repeated occasions. Thus the purpose of this study was to assess the within-session and between-session intra-rater reliability of the algometer in the measurement of pain pressure thresholds over non-trigger but consistent muscle points in healthy subjects. A robust analysis of algometry measurement reliability of these muscles has not been conducted previously.
2. Methods
2.1. Subjects
This study was conducted on a convenience sample of university staff (n = 10). One-half of the sample were male and one-half female (a grouping that would control for gender differences in PPT).9 A total of eight muscles were tested per person, which generated a sample of 80 assessments. Ethical approval for the study was granted by the North Manchester Local Research and Ethics Committee. Each volunteer gave informed consent. An asymptomatic subject group (subjects with no current or recent history (<6 months) of spinal pain) was chosen for this study. The subjects attended for three sessions separated by a period of at least a week. At each session the experimental procedure was repeated twice to allow the determination of within-session and between-session reliability. Testing was performed by one examiner (a registered osteopath with 15 years of experience). Muscles were tested in a random order to reduce the influence of the memory of initial recordings on subsequent recordings. A research assistant made note of each measurement and the rater was blinded to the previous measurements.
2.2. Equipment
The pressure algometer used was a pain threshold meter, model PTH-AF 2, commercially available through the Pain Diagnostic and Treatment Corporation (Great Neck, NY 11021, USA). The device is a force gauge fitted with a disc-shaped rubber tip bearing a surface of exactly 1 cm². The range of the gauge is 0–10 kg, divided into one-tenths of a kilogram. All readings are expressed as kilograms per square centimetre (kg/cm²).

Fig. 1. Surface electrodes attached over iliocostalis and the superimposed crosses show the point that the algometer was applied to the subject.

Fig. 2. Algometer being applied to a subject to record the PPT.

Table 1
Mean of all tests and visits for individual muscle testing points

Muscle                   Mean (kg/cm²)   Standard deviation (kg/cm²)
Iliocostalis right       8.28            3.33
Multifidus right         8.99            3.65
Gluteus maximus right    8.52            3.27
Gluteus maximus left     7.92            3.11
Multifidus left          9.45            3.64
Iliocostalis left        9.38            3.85
Trapezius left           3.94            1.87
Trapezius right          4.23            1.96

2.3. Procedure
This experiment was part of a pilot for a multifaceted electromyography randomised controlled trial, for which each subject had four sets of surface electrodes attached over the following superficial muscle pairs: upper trapezius, iliocostalis, multifidus and gluteus maximus. The electrode positions were recorded on acetates along with other surface markers, such as moles or scars, to enable exact replication of electrode placement at each visit. The test sites for the algometry measurements were just medial to the joining point of the two electrodes (see Fig. 1), except for the trapezius muscle electrode where the testing point was just superior. This ensured that testing was approximately in the centre of the muscle belly and produced a method for ensuring that measurements were taken over a consistent point on each testing occasion. This method ensured exact replication of testing site both intra-session and between-session. Once the point to be tested was identified the pressure gauge was applied perpendicular to the surface of the subject’s body (see Fig. 2). Pressure was applied at a constant rate of approximately 1 kg/s.6 Subjects were
instructed to say ‘yes’ as soon as the sensation of pressure changed to pain. It was carefully explained to each subject that the purpose of the experiment was to test his or her pain threshold and that it was not a measure of his or her pain tolerance. The subjects attended on three occasions (hereafter referred to as visit 1, 2 or 3) and at each visit the muscles were tested in the same order on two occasions (hereafter referred to as test 1 and 2) 5 min apart. Subjects had a practice session on the back of each hand before the main testing commenced. Overall each subject had six tests of eight muscles, making for a total of 48 measurements per subject.
2.4. Data analysis
Intra-class correlation coefficients (ICCs) were calculated to examine the strength of agreement between tests. The ICC represents the proportion of variance in the data that is due to the true score variance.10 ICCs were calculated using a two-way random effects model,11 as each subject was assessed by the same rater. This allows for the generalisation of the results to other raters. ICC model (2,1) was used for within-session analysis and (2,k) for between-session analysis, where an average of scores across trials was used. ICCs were calculated using the Statistical Package for the Social Sciences (SPSS v11, Chicago, Illinois, USA). The ICC, which is a relative measure of reliability, can be affected by between-subject variability, so the retest reliability was also considered in the units of measurement (kg/cm²) by calculating the standard error of measurement (SEM), which is an absolute index of reliability, Eq. (1).

$$\mathrm{SEM} = \mathrm{SD}\,\sqrt{1 - \mathrm{ICC}} \qquad (1)$$

where SD is the standard deviation of the grand mean. The smallest detectable difference (SDD) was derived from the SEM and described as a percentage of the parameter’s grand mean ($\mathrm{SDD} = 1.96 \times \sqrt{2} \times \mathrm{SEM}$). The SDD provides a clinically relevant assessment of the change that is required in a measurement to be sure (with 95% certainty) that the change is due to true difference and not attributable to measurement error. Finally, to check for systematic error, 95% limits of agreement plots were drawn.

Table 2
Intra-rater, within-session reliability of PPT measurements of individual muscle points

Muscle                   ICC (CI), test 1/2 at visit 1   ICC (CI), test 1/2 at visit 2   ICC (CI), test 1/2 at visit 3
Iliocostalis right       0.85 (0.52–0.96)                0.90 (0.66–0.97)                0.80 (0.42–0.95)
Multifidus right         0.92 (0.74–0.98)                0.70 (0.20–0.91)                0.90 (0.67–0.97)
Gluteus maximus right    0.91 (0.70–0.98)                0.93 (0.76–0.98)                0.88 (0.61–0.97)
Gluteus maximus left     0.93 (0.77–0.98)                0.89 (0.65–0.97)                0.92 (0.71–0.98)
Multifidus left          0.90 (0.68–0.97)                0.95 (0.82–0.99)                0.92 (0.74–0.98)
Iliocostalis left        0.80 (0.41–0.95)                0.87 (0.57–0.96)                0.98 (0.92–0.99)
Trapezius left           0.84 (0.50–0.96)                0.96 (0.86–0.99)                0.90 (0.67–0.97)
Trapezius right          0.96 (0.86–0.99)                0.93 (0.74–0.98)                0.96 (0.85–0.99)

This table shows the ICC and 95% confidence intervals of the individual muscle points’ algometry measurements.

Table 3
Intra-rater, within-session reliability of PPT measurements of the pooled data

Comparison                        ICC(2,1)   CI          SEM (kg/cm²)   SDD (kg/cm²)   SDD % grand mean
Within-session test 1/2 visit 1   0.91       0.87–0.94   0.89           2.48           38
Within-session test 1/2 visit 2   0.91       0.87–0.94   1.13           3.12           39
Within-session test 1/2 visit 3   0.93       0.90–0.96   1.02           2.83           34

This table shows the intra-rater, within-session ICCs with confidence intervals, the calculated SEM, SDD and the SDD presented as a percentage of the parameter’s grand mean, of PPT measurements using a handheld algometer.

Table 5
Intra-rater, between-session reliability of PPT measurements of the pooled data

Comparison                                          ICC(2,k)   CI          SEM (kg/cm²)   SDD (kg/cm²)   SDD % grand mean
Between-session visit 1/2                           0.87       0.79–0.91   1.22           3.38           47
Between-session visit 1/2, excluding iliocostalis   0.91       0.85–0.95   1.01           2.80           40
Between-session visit 2/3                           0.95       0.93–0.97   0.82           2.29           28
Between-session visit 2/3, excluding iliocostalis   0.96       0.93–0.98   0.76           2.12           28

This table shows the intra-rater, between-session ICCs with confidence intervals, the calculated SEM, SDD and the SDD presented as a percentage of the parameter’s grand mean, of PPT measurements using a handheld algometer.
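The reliability statistics described in Section 2.4 can be illustrated with a short sketch. The study itself used SPSS; the Python below is only an assumed, minimal re-implementation of the two-way random effects ICC(2,1) and ICC(2,k) in the mean-squares form usually attributed to Shrout and Fleiss,11 together with the SEM and SDD formulas given in the text. The function names and the small ratings matrix are hypothetical and are not data from this study.

```python
import math


def icc_two_way_random(scores):
    """Return (ICC(2,1), ICC(2,k)) for a subjects-by-trials table of scores.

    Two-way random effects model, computed from the ANOVA mean squares:
    BMS (between subjects), JMS (between trials) and EMS (residual).
    """
    n = len(scores)       # number of subjects (rows)
    k = len(scores[0])    # number of repeated trials (columns)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)
    jms = ss_cols / (k - 1)
    ems = ss_error / ((n - 1) * (k - 1))

    icc_single = (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)
    icc_average = (bms - ems) / (bms + (jms - ems) / n)
    return icc_single, icc_average


def sem_from_icc(sd, icc):
    """Standard error of measurement, Eq. (1): SD * sqrt(1 - ICC)."""
    return sd * math.sqrt(1.0 - icc)


def sdd_from_sem(sem):
    """Smallest detectable difference: 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2.0) * sem


# Hypothetical ratings: 6 subjects (rows) by 4 trials (columns).
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc21, icc2k = icc_two_way_random(ratings)
print(round(icc21, 2), round(icc2k, 2))   # 0.29 0.62

# SEM of 1.22 kg/cm2 is the between-session value reported in Table 5.
print(round(sdd_from_sem(1.22), 2))       # 3.38
```

Feeding the SEM reported in Table 5 for visits 1/2 (1.22 kg/cm²) into the SDD formula returns the tabulated 3.38 kg/cm². Small discrepancies for some other rows are to be expected when starting from an SEM that has already been rounded to two decimals.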
3. Results
All recruited subjects (n = 10) completed the trial, attending for all three sessions. The mean PPT for each muscle pair is presented in Table 1. The test-retest reliability results, within-session, are presented in Table 2. As the range of ICCs for the individual muscle points (0.80–0.99), presented in Table 2, signified excellent reliability, the individual muscle point data were pooled for each trial, producing the overall picture presented in Table 3. ICC values show good (>0.91)12 within-session reliability. SDDs (which ranged from 2.48 kg/cm² to 3.12 kg/cm²) are also presented in Table 3, showing that a PPT change of approximately 3 kg/cm² (i.e. 35–40%), as measured by the algometer, could be attributed to real change with 95% confidence.
Similarly, the between-session analysis showed good reliability, with an ICC of 0.87 between sessions 1 and 2 and 0.95 between sessions 2 and 3. The calculated SDD showed that a change in PPT of approximately 3.00 kg/cm², similar to that of the within-session measures, could be attributed to real change with 95% confidence. Table 4 displays the individual muscle points’ ICCs, revealing a slightly wider range (0.66–0.97), although only two of the 16 scores (those for the muscle iliocostalis) are outside of the range classified as demonstrating good reliability (>0.75). Because iliocostalis as an individual muscle point was less reliable, data from this point were excluded and the analysis was re-run. The result is also presented in Table 5 and shows improved reliability and a smaller measurement error (<3 kg/cm²).

Table 4
Intra-rater, between-session reliability of PPT measurements of individual muscle points

Muscle                   ICC (CI), mean test 1/2 for visit 1/2   ICC (CI), mean test 1/2 for visit 2/3
Iliocostalis right       0.66 (0.27–0.92)                        0.91 (0.67–0.98)
Multifidus right         0.81 (0.29–0.95)                        0.91 (0.65–0.98)
Gluteus maximus right    0.82 (0.34–0.96)                        0.96 (0.86–0.99)
Gluteus maximus left     0.88 (0.56–0.97)                        0.97 (0.89–0.99)
Multifidus left          0.89 (0.57–0.97)                        0.90 (0.62–0.97)
Iliocostalis left        0.68 (0.20–0.92)                        0.93 (0.72–0.98)
Trapezius left           0.93 (0.74–0.98)                        0.97 (0.88–0.99)
Trapezius right          0.91 (0.66–0.98)                        0.97 (0.90–0.99)

Fig. 3 shows the within-session differences (tests 1 and 2) plotted against the mean score of the two tests, separately for the week 1, 2 and 3 assessment visits. There is no systematic difference observable between tests. The differences between sessions (visits 1 and 2, and visits 2 and 3), plotted against the mean score of the two visits (using the mean of tests 1 and 2 calculated for each visit), are presented in Fig. 4. Again, there is no systematic difference between visits. It should be noted from Figs. 3 and 4 that the data displayed a heteroscedastic distribution, with the degree of variability tending to become far greater for the higher PPT scores. This has implications for the interpretation of variability statistics (SD), which may be an overestimate for low scores and an underestimate for high scores.
The PPT values for muscles of the lower spine were higher than those for muscles more cranially, as shown in Table 1. T-tests revealed a statistically significant (p = 0.01) difference between the values for the upper trapezius and iliocostalis muscle testing sites, on both the left and right sides.
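The limits of agreement analysis behind Figs. 3 and 4 can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors’ procedure: the function names and the paired readings are hypothetical, the default multiplier of 1.96 is the conventional choice (the figure annotations are labelled ±2 SD, so the multiplier is left as a parameter), and the correlation between the size of each difference and the pair mean is only one rough way of looking for the heteroscedasticity noted in the text.

```python
import math
import statistics


def limits_of_agreement(first, second, multiplier=1.96):
    """Return (bias, lower, upper) for paired measurements.

    bias is the mean of the paired differences; the limits are
    bias -/+ multiplier * SD of the differences.
    """
    diffs = [a - b for a, b in zip(first, second)]
    bias = statistics.mean(diffs)
    spread = multiplier * statistics.stdev(diffs)  # sample SD (n - 1)
    return bias, bias - spread, bias + spread


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs)
                    * sum((y - my) ** 2 for y in ys))
    return num / den


def spread_vs_magnitude(first, second):
    """Correlate |difference| with the pair mean.

    A clearly positive value suggests the scatter grows with the size
    of the measurement, i.e. heteroscedastic data.
    """
    means = [(a + b) / 2.0 for a, b in zip(first, second)]
    abs_diffs = [abs(a - b) for a, b in zip(first, second)]
    return pearson(means, abs_diffs)


# Hypothetical paired PPT readings (kg/cm2), not data from this study.
test1 = [4.0, 8.5, 9.0, 3.5, 7.0]
test2 = [4.5, 8.0, 10.0, 3.0, 7.5]

bias, lower, upper = limits_of_agreement(test1, test2)
trend = spread_vs_magnitude(test1, test2)
print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f}), trend r={trend:.2f}")
```

A bias close to zero relative to the width of the limits is what "no systematic difference" would look like in this sketch, while a positive trend value corresponds to the widening scatter at higher PPT scores.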
Fig. 3. The intra-session differences between algometry measurements for each week, plotted against the mean of the measurements. The mean is represented by the solid line and 95% limits of agreement by the dashed lines (n = 80). Panels: intra-visit algometry test 1/2 at visit 1 (mean difference −0.379, +2 SD 2.109, −2 SD −2.868), visit 2 (mean difference −0.073, +2 SD 3.197, −2 SD −3.344) and visit 3 (mean difference −0.199, +2 SD 2.729, −2 SD −3.127). Horizontal axis: mean of test 1 and test 2 (kg/cm²); vertical axis: difference between test 1 and test 2 (kg/cm²).

Fig. 4. The differences between algometry measurements for the average of the tests for visits 1/2 and 2/3, plotted against the mean of the measurements. The mean is represented by the solid line and the 95% limits of agreement by the dashed line (n = 80). Panels: between-session visit 1/2 (mean difference −1.51, +2 SD 2.42, −2 SD −5.44) and visit 2/3 (mean difference 0.36, +2 SD 3.65, −2 SD −2.92). Horizontal axis: mean of the two visits (kg/cm²); vertical axis: difference between the two visits (kg/cm²).

4. Discussion
This study shows that algometry is a stable method for evaluating the PPT over spinal muscles in a healthy population and that it has a relatively small measurement error. No previous studies have robustly analysed the test-retest reliability or the measurement error of the algometer for evaluating the PPT over spinal muscles, and so this study provides new evidence that adds support to earlier reliability studies.4,13 However, this study is in disagreement with that of Hogeweg et al.,14 who reported a significant difference in healthy volunteers’ PPT within-session, but not between-session, after measuring paraspinal muscle points. The authors suggested that it might be due to sensitisation or habituation, which is not supported by the results of our study.
Based on the results of this study a change in PPT of approximately 3 kg/cm² (95% CI 2.28–3.38) would be necessary to be sure that the outcome was due to a treatment effect and could not be attributable to chance or measurement error (p < 0.05). Previous studies have reported a decrease in PPT of between 1.1 kg/cm² and 2.6 kg/cm² as representing a significant change between groups in intervention studies.6,7,15 These studies have tended to look at myofascial points in the head and neck, but even so the results should be interpreted with caution, as it is possible that the observed changes in PPT might have been due to measurement error alone. This work would suggest that the amount of PPT change required to be confident of measuring a true change, and not simply measurement error, may be slightly higher than previously cited but still of a level that does not render the measure clinically unhelpful.
This work has found an increase in PPT from lower values cranially to higher values over more caudal muscle points. The PPT of iliocostalis and gluteus maximus
points were significantly higher (p < 0.05) than the upper trapezius points, consistent with the findings of Fischer (1987) and Hogeweg et al. (1992).13,14 Therefore, this study may have a degree of criterion-related validity via its similarity in findings with these previous studies. The explanation for this regional difference is unclear. One plausible theory is that this phenomenon may be due to the higher number of proprioceptors in the cervical region as compared with the lumbar region.8 This might also account for the poorer PPT reliability observed in the iliocostalis muscle points. It would be expected that greater reliability of the algometer would be achieved in a symptomatic population, as higher, more variable PPT readings would be less likely to be recorded.
There was no demonstrable systematic difference seen either between tests or between sessions, indicating that there was no training effect. This suggests that the algometer could be used as a single pre-test and post-test measure.
Normally mechanical nociceptive impulses are generated from intense pressure being applied through the
skin activating small diameter, thinly myelinated Aδ fibres, which conduct impulses to laminae I and V in the dorsal horn of the spinal cord.16 Tenderness occurs when a non-noxious stimulus is sufficient to stimulate nociceptive impulses in muscle mechanoreceptors in which thresholds have been lowered. It is thought that segmental dysfunction causes a lowering in the threshold of nociceptors,17 thus enabling a normally non-noxious stimulation to pass threshold level and cause a nociceptive impulse to be generated. Manual treatments are thought to alter the mechanoreceptor activity in the soft tissues and thereby decrease tissue tenderness.18 Since PPT is synonymous with tenderness, evaluating PPT provides a method for measuring the anti-nociceptive effect of manual treatments.
Only two papers report using algometry to measure treatment effects for therapeutic interventions in low back pain.19,20 Hsieh and Lee19 conducted a study comparing the effect of one-shot percutaneous electrical nerve stimulation and transcutaneous electrical nerve stimulation using changes in VAS and body surface score in addition to PPT. Côté et al.20 assessed the PPT pre- and post-spinal manipulation. Neither of these studies reported a significant change in PPT following the intervention. Despite this, algometry does appear to offer a simple, practical method of analysing treatment effect.
In conclusion, within-day and between-day reliability of the algometer in quantifying the PPT of normal spinal muscles was good, with moderate measurement error. Furthermore, the technique is easy to perform, which makes it an ideal tool for obtaining an objective measure of a subject’s pain, allowing researchers and clinicians to monitor treatment effects.

References
1. Koes BW, van Tulder MW, Ostelo R, Kim BA, Waddell G. Clinical guidelines for the management of low back pain in primary care: an international comparison. Spine 2001;26:2504–13.
2. Waddell G. The biopsychosocial model. Chapter 14. In: The back pain revolution. London: Churchill Livingstone; 2004. p. 265–81.
3. Kramis RC, Roberts WJ, Gillette RG. Non-nociceptive aspects of persistent musculoskeletal pain. J Orthop Sports Phys Ther 1996;24:255–67.
4. Reeves JL, Jaeger B, Graff-Radford SB. Reliability of the pressure algometer as a measure of myofascial trigger point sensitivity. Pain 1986;24:313–21.
5. Ohrbach R, Gale EN. Pressure pain thresholds, clinical assessment, and differential diagnosis: reliability and validity in patients with myogenic pain. Pain 1989;39:157–69.
6. Fischer AA. Application of pressure algometry in manual medicine. J Manag Med 1990;5:145–50.
7. Hong CZ, Chen YC, Pon CH, Yu J. Immediate effects of various physical medicine modalities on pain threshold of an active myofascial trigger point. J Musculoskelet Pain 1993;1:37–53.
8. Vanderweeën L, Oostendorp RAB, Vaes P, Duquet W. Pressure algometry in manual therapy. Man Ther 1996;1:258–65.
9. Chesterton LS, Barlas P, Foster NE, Baxter GD, Wright CC. Gender differences in pressure pain threshold in healthy humans. Pain 2003;101:259–66.
10. Weir JP. Quantifying test-retest reliability using the intraclass correlation coefficient and the SEM. J Strength Cond Res 2005;19:231–40.
11. Shrout PE, Fleiss JL. Intraclass correlations: uses in assessing rater reliability. Psychol Bull 1979;86:420–8.
12. Portney LG, Watkins MP. Foundations of clinical research: applications to practice. Norwalk: Appleton & Lange; 1993.
13. Fischer AA. Pressure algometry over normal muscles. Standard values, validity and reproducibility of pressure threshold. Pain 1987;30:115–26.
14. Hogeweg JA, Langereis MJ, Bernards AT, Faber JA, Helders PJ. Algometry. Scand J Rehabil Med 1992;24:99–103.
15. Pratzel HG. Application of pressure algometry in Balneology for evaluation of physical therapeutic modalities and drug effects. J Musculoskelet Pain 1998;6:111–37.
16. Kandel ER, Schwartz JH, Jessell TM. Principles of neural science. 4th ed. London: McGraw-Hill Companies Inc.; 2000.
17. Fryer G. Somatic dysfunction: updating the concept. Aust J Osteopathy 1999;10:14–9.
18. Sterling M, Jull G, Wright A. Cervical mobilisation: concurrent effects on pain, sympathetic nervous system activity and motor activity. Man Ther 2001;6:72–81.
19. Hsieh RL, Lee WC. One-shot percutaneous electrical nerve stimulation vs. transcutaneous electrical nerve stimulation for low back pain: comparison of therapeutic effects. Am J Phys Med Rehabil 2002;81:838–43.
20. Côté P, Mior SA, Vernon H. The short-term effect of a spinal manipulation on pain/pressure threshold in patients with chronic mechanical low back pain. J Manipulative Physiol Ther 1994;17:364–8.