Journal of Electromyography and Kinesiology Vol. 4, No. 1, pp 47-59, 1994
A Study of Various Normalization Procedures for Within Day Electromyographic Data
Loretta M. Knutson’, Gary L. Soderberg’, Bryon T. Ballantyne2 and William R. Clarke3
1Department of Physical Therapy, Creighton University, 2500 California Plaza, Omaha; 2Physical Therapy Graduate Program, The University of Iowa, 2600 Steindler Building, Iowa City; 3Department of Preventive Medicine and Environmental Health, The University of Iowa, 2821 Steindler Building, Iowa City, USA
Summary: Normalization of electromyographic (EMG) data has been described in the scientific literature as crucial for comparisons between subjects and between muscles. The reference value used in the normalization equation has, however, varied across reports. Comparison between studies could be facilitated by use of a common value. We propose that the best way to select the common value is through a reliability approach. Accordingly, the purpose of this study was to identify which of three EMG normalization values provided the most reproducible data set. The gastrocnemius EMG results from 20 normal persons and 20 individuals with anterior cruciate deficiency who were participating in a larger study were normalized to a maximum voluntary isometric contraction (MVIC) EMG, peak dynamic EMG, and mean dynamic EMG. Values were then subjected to evaluation using four statistical measures: inter- and intrasubject coefficients of variation (CV), variance ratio (VR), and intraclass correlation coefficient (ICC). The CV measures, while not reflective of reliability, were included for comprehensive consideration in view of other literature. The intersubject CV, which measures group variability, and the intrasubject CV, which measures precision, were lower for the dynamic conditions; however, the VR and ICC suggested reproducibility was best with EMG from the MVIC. Given that other studies have advocated normalizing EMG by taking data from the dynamic event, reconsideration may be warranted if high reproducibility is desired. Interpretations of the findings given the population, muscle and condition studied are discussed.
Key Words: Electromyography (EMG)-Reliability-Normalization-Reproducibility.
Received November 23, 1992. Accepted November 17, 1993. Address correspondence and reprint requests to Loretta M. Knutson, PhD, Physical Therapy Department, Creighton University, 2500 California Plaza, Omaha, NE 68178, USA.
INTRODUCTION
Normalization of electromyographic (EMG) measurements is necessary to allow comparisons between subjects, days, muscles, or studies. The most commonly employed normalization procedure has been to use EMG data taken from the maximum voluntary isometric contraction (MVIC)2,8,19,24,33,34. In the past decade investigators have suggested using alternative normalization values, particularly when the event of interest is dynamic rather than static. Suggestions have included using (a) the EMG associated with a percentage of the MVIC34,35, (b) the peak EMG during a dynamic activity15,16,35 or (c) the mean EMG during a dynamic activity32,35. Because use of a common reference value would facilitate comparison between studies, efforts to select the best value are warranted. We propose using a reliability approach to select the best value because sample data that are reproducible will more accurately reflect the population. Only one study has used a reliability focus to compare contraction types and comment on which was best for normalization. In that study values were taken from static events, MVIC vs. submaximal MVIC34. No similar studies of dynamic events have been reported. From a perspective other than reliability, Yang and Winter suggest that reference values from dynamic contractions are preferred over those from isometric contractions because the former produce lower intersubject coefficients of variation (CV)35. The relationship of intersubject CV to reliability was not addressed. Studies of EMG reliability date back to Lippold’s 1952 publication showing variability in the EMG-tension relationship across 10 experiments18. Since that time, a number of studies have addressed or purported to address the issue of EMG reliability. A summary of these works is shown in Table 1. Comparison between studies is difficult because investigators used a variety of statistical measures to interpret their data, including the CV12,15,26,28,30,32,34,35, the variance ratio (VR)14, the intraclass correlation coefficient (ICC)1,2,34, and other correlation coefficients1,5,9,10,12,15,17,27,28. Additionally, some investigators normalized their data, while others did not.
To answer questions about EMG normalization and reproducibility for a dynamic activity, a comprehensive study using various normalization procedures and different statistical measures was indicated. Our purpose was to determine which of three normalization values would be associated with the highest reproducibility for a given data set. To accomplish this purpose, surface EMG recordings of the gastrocnemius muscle were made during a cyclic activity in normal subjects and subjects with anterior cruciate deficient knees. The three EMG normalization reference values used in the analysis were derived from MVIC, peak of the dynamic contraction (peak-d), and mean of the dynamic contraction (mean-d). The four statistical values calculated for each condition of normalization were the intersubject CV, intrasubject CV, VR, and ICC.
REVIEW OF LITERATURE
Reliability Terminology Applied in EMG Research
Researchers have applied multiple terms to define reliability. Viitasalo and Komi defined reliability as the reproducibility of measurements within a test session and constancy as the reproducibility of measurements between test days27. This terminology was retained by Hershler and Milner in their 1978 study, which used the VR as the statistical criterion to identify the optimal envelope processor for full wave rectified EMG signals11. Kadaba et al. also retained this terminology in their study on the effect of electrode types14. Gollhofer et al. recommended replacing the term reliability with reproducibility to cover both stability and the aspects of linear changes and scattering of data in repeated measurements9. Their term stability appears to be the same as constancy used by other investigators11,14,27. Although investigators have chosen to use a variety of terms to discuss reliability, from a statistical point of view the terms reliability, reproducibility, repeatability, and consistency are synonymous. Attempts have been made by electromyographers and others to assign different definitions to each of these terms, yet these definitions have not been universally accepted. The inconsistent use of terms can be confusing. Table 2 specifies terms and definitions as given in the dictionary and statistical literature. We use the term reproducibility because this term best reflects the central question of ability to achieve similar results on repeated testing. Precision is not the same as reproducibility. The assessment of both reproducibility and precision requires repeat testing, but precision focuses on the magnitude of measurement error, whereas reproducibility addresses the true difference in measured values. The formula to calculate precision has been misapplied to connote reproducibility15,26,28,30,32,34. The formulae for each are addressed in the next section.
TABLE 1. Summary of studies addressing electromyographic (EMG) reliability or variability

| Researcher and year | Ref. | Focus of reliability or variability component | n | Muscle* | Electrode type† | Mechanical condition‡ | Normalized (method) | Analytical technique | Results |
|---|---|---|---|---|---|---|---|---|---|
| Jonsson & Reichman 1968 | 13 | Within day, between day and electrode position change reproducibility | 6 | BR | FW; bipolar IM; unipolar IM | Isometric; isotonic | No | Subjective 0-3 grading | Subjective report of greater differences between than within day |
| De Vries 1968 | 5 | Between day reliability of ‘efficiency of electrical activity’ | 15 | Not stated | S | SMVIC | No | Pearson product moment correlation | 0.92 |
| Komi & Buskirk 1970 | 17 | Within day, between day, electrode type, and contraction type reliability | 37 | BB | S & FW | SMVIC; MVIC; isotonic | No | Reliability coefficient | Within day isometric 0.88 (S), 0.82 (FW); between day isometric 0.69 (S), 0.22 (FW); between day isometric 0.93 (S) |
| Viitasalo & Komi | 27 | Within day ‘reliability’ and between day ‘constancy’ | 12 | RF | S | SMVIC; MVIC | No | Reliability coefficients | Within day 0.77-0.92; between day 0.34-0.88 |
| Hershler & Milner 1978 | 11 | Effects of speed, electrode position and processing on footstep repeatability | 6 | VL, RF | S | Gait | No | Variance ratio | Variability present; no statistical results given |
| Graham 1979 | 10 | Between day reliability | 8 | TB | S | SMVIC | No | Pairwise correlation | 0.87-0.99 |
| Viitasalo et al. | 28 | Within day reproducibility | 29 | RF, VL, VM | S | MVIC | No | Correlation coefficient | 0.96-0.98 |
| Yang & Winter 1983 | 34 | Effects of contraction levels on within and between day reliability | 9 | TB | S | SMVIC; MVIC | Yes | Intraclass correlation coefficient (ICC) | At 30% MVIC 0.80-0.95; at 50% MVIC 0.78-0.93; at 100% MVIC 0.52-0.81 |
| Winter 1984 | 30 | Demonstrate computer averaged EMG patterns applied to diagnose pathology; emphasize need for reliable profiles | 11 | SO, TA, BF, VL, RF, Gmax | S | Gait | No | Coefficient of variation (CV) (formula not given) | Within one patient 25-38%; between normals 41-91% |
| Yang & Winter 1984 | 35 | Determine effect of normalization method on intersubject variability | 11 | SO, TA, BF, VL, RF | S | Gait | No and yes (50% MVIC, peak dynamic and mean dynamic) | CV (formula indicates intersubject CV) | Unnormalized 49-128%; normalized to 50% MVIC 52-197%; normalized to peak dynamic 35-56%; normalized to mean dynamic 32-56% |
| Kadaba et al. 1985 | 14 | Influence of electrode type on repeatability of phasic activity across cycles, runs, and days | 10 | G, TA, MH, VL, RF | S & FW | Gait | No | Variance ratio | Cycles 0.17-0.23 (S), 0.17-0.28 (FW); runs 0.17-0.27 (S), 0.21-0.36 (FW); days 0.48-0.58 (S), 0.52-0.67 (FW) |
| Arsenault et al. 1986 | 1 | Within day repeatability related to validating a ‘normal’ EMG profile | 8 | SO, TA, BF, VM, RF | S | Gait | Yes (MVIC) | ICC | 0.84-0.99 |
| Arsenault et al. 1986 | 2 | Within day reliability influenced by number of strides analysed | 8 | SO, TA, BF, VM, RF | S | Gait | Yes (MVIC) | ICC | 3 strides, 3 subjects 0.96-0.99; 10 strides, 8 subjects 0.99+ |
| Winter & Yack 1987 | 32 | Detailed consideration of variability between subjects | 10-19 | 16 muscles | S | Gait | No and yes (mean during stride) | CV (formula indicates intersubject CV) | Normalization reduces variability; some muscles more variable than others |
| Di Fabio 1987 | 6 | Between day reliability of EMG onset by computerized analysis and examiner judgement | 154 | G, TA, MH, VL | S | Perturbed standing | No | Per cent agreement, Pearson correlation coefficient, ICC | Computerized analysis more reliable (1.00) than raters (0.78-0.82) |
| Horstmann et al. 1988 | 12 | Between day reproducibility | 12 | G, SO, TA | S | Stance with and without perturbation | No | Pearson correlation coefficient, analysis of variance (ANOVA) | 0.76-0.97 |
| Kadaba et al. 1989 | 15 | Within and between day intrasubject repeatability | 40 | 10 lower extremity muscles | S | Gait | Yes (max in cycle) | ‘Coefficient of multiple correlation’ | Within day 0.76-0.90; between day 0.66-0.88 |
| Giroux & Lamontagne 1990 | 8 | Within and between day reliability influenced by electrode type and contraction | 6 | Trap, MD, AD | S & FW | Isometric & dynamic work-task | Yes (MVIC) | ANOVA, Pearson correlation coefficient and t-test | Unable to interpret per results |
| Gollhofer et al. 1990 | 9 | Between day reproducibility during four types of stretch/shortening contractions | 12 | G, SO | S | Dynamic | No | Spearman-Brown reliability coefficient | 0.63-0.97 |
| Veiersted 1991 | 26 | Within day ‘reproducibility’ in submaximal arm raising | 12 | Trap | S | Isometric contractions | Yes (MVIC) | CV | 23% (intersubject) |

*Muscle key: AD = anterior deltoid; BB = biceps brachii; BF = biceps femoris; BR = brachioradialis; G = gastrocnemius; Gmax = gluteus maximus; H = hamstrings; MD = middle deltoid; MH = medial hamstrings; RF = rectus femoris; SO = soleus; TA = tibialis anterior; TB = triceps brachii; Trap = trapezius; VL = vastus lateralis; VM = vastus medialis.
†Electrode type key: S = surface; FW = fine wire; IM = intramuscular.
‡Mechanical condition: SMVIC = submaximal voluntary contraction; MVIC = maximal voluntary contraction.
TABLE 2. Definitions of commonly used terms

| Term | Definition |
|---|---|
| Reliability | The consistency with which a measure assesses a given trait3. The extent to which an experiment, test or measuring procedure yields the same results on repeated trials29. The consistency or repeatability of measurements; the degree to which measurements are error free and the degree to which repeated measurements will agree25. |
| Replicability | Duplicate or repeat (a statistical experiment); one of several identical experiments, procedures, or samples24. |
| Reproducibility | The ability to produce again29. |
| Repeatability | The ability to make appear again; reproduce29. |
| Consistent | Tending to be arbitrarily close to the true value of the parameter estimated as the sample becomes large29. |
| Constancy | Steadfastness of mind under duress, freedom from change29. |
| Precision | The quality or state of being precise; exactness. The degree of refinement with which an operation is performed or a measurement stated29. |

Measurement of Intersubject Variability, Precision and Reproducibility
As with using multiple terms to describe reproducibility, using a variety of statistics can complicate data interpretation. While not a measure of reproducibility, the intersubject CV is included because this value can hold an inverse relationship with reproducibility and because lowering the value has been reported as a criterion for advocating one normalization value. The intersubject CV is obtained without test replication and is a descriptive variable on one data set. Specifically, the value reflects dispersion of data around the mean of a single data set and is calculated as the square root of the sample variance, that is, the standard deviation (SD), divided by the mean $\bar{x}$ (intersubject CV = $\sqrt{SD^{2}}/\bar{x} = SD/\bar{x}$). Neither high nor low values of the CV are considered good or bad. Some degree of variability is needed to demonstrate reproducibility. In the extreme case, a data set void of variability makes consideration of reproducibility a moot point. By contrast, high intersubject variability lends itself to a greater chance of finding reproducible results with study replication. On the other hand, low intersubject CV suggests group homogeneity, an attribute that can be desirable from the viewpoint of creating a diagnostic window or template against which later evaluations can be judged. This is the basis on which the mean dynamic EMG has been advocated as the normalization value when studying dynamic events35.
The intrasubject CV is more related to reproducibility than is the intersubject CV because the intrasubject CV is based on subject repeat measures, thus estimating the magnitude of pure measurement error. Although researchers have used the intrasubject CV as an index of reproducibility14,30, the value speaks more to precision and thus should not be considered indicative of reproducibility. Intrasubject CV is the square root of the mean squared error (MSE) across trials divided by the mean ($\bar{x}$) of all observations (intrasubject CV = $\sqrt{MSE}/\bar{x}$). A low intrasubject CV value is preferable to a high value because a low value represents less error or greater consistency between repeat measures.
Appropriate approaches to evaluating reproducibility include the VR and two types of correlation coefficients, Pearson’s correlation coefficient (r) and the ICC. Pearson’s correlation coefficient deals with association only. The ICC additionally addresses agreement because it is affected by systematic changes in the measurements. This matter can best be demonstrated with a manufactured data set (Table 3). Comparisons are shown for five data sets when scores are identical (columns A vs. B), when one set is incremented by a constant (columns A vs. C), and when subject values have been alternately incremented (columns D vs. E) from the original data set (A). When the two sets are the same (A vs. B) the Pearson’s correlation value r and the ICC are 1.0. The result is perfect association and agreement. When the two sets of data are uniformly separated by 15 points (A vs. C), they will be highly correlated or associated according to Pearson’s r, but because the mean values are not in agreement, reliability will be poor as measured by the ICC. Because of this situation, the ICC is preferred over Pearson’s r. When subject values have been alternately incremented (D vs. E) from an original data set (A), such that neither association nor agreement is seen, both correlation coefficients demonstrate poor reliability.
The VR and the ICC are functionally related because the VR is approximately equivalent to one minus the ICC. Both measures can be derived from components of an analysis of variance (ANOVA) procedure. See Appendix A for more detail. Basically, the ICC is a measure of similarity among trials relative to difference among subjects1. The formula for the ICC is the ratio of the adjusted between subject variability (true variance) to the between subject variance plus the appropriate error term, where the appropriate error term is defined by the type of ANOVA procedure employed. Thus, the ICC is the between variance divided by the total variance. By contrast, the VR is the error (within) variance divided by the total variance, and the value reflects the proportion of total variability that can be explained by error. Low values of VR are desirable. In the case of EMG, when waveforms are similar, the VR tends toward zero, and when the waveforms are dissimilar, the VR tends toward one.
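The four measures described in this section can be illustrated with a minimal Python sketch. It is not the authors' computation: Appendix A is not reproduced here, so the sketch assumes a one-way random-effects ANOVA with subjects as the grouping factor, in which the between-subject variance component is (MS_between - MS_within)/k and the error variance is MS_within. The function names and the five-subject, two-trial data set are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def intersubject_cv(values):
    """Intersubject CV for one trial: sample SD divided by the mean."""
    return stdev(values) / mean(values)


def repeated_measures_summary(trials):
    """Intrasubject CV, ICC and VR from an n-subjects x k-trials table.

    Assumption: one-way random-effects ANOVA (subjects as groups).
    """
    n = len(trials)            # number of subjects
    k = len(trials[0])         # number of trials per subject
    grand = mean(v for row in trials for v in row)
    subject_means = [mean(row) for row in trials]

    ss_between = k * sum((m - grand) ** 2 for m in subject_means)
    ss_within = sum((v - m) ** 2
                    for row, m in zip(trials, subject_means) for v in row)
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))      # mean squared error (MSE)

    var_between = (ms_between - ms_within) / k  # adjusted between-subject variance
    var_total = var_between + ms_within
    return {
        "intrasubject_cv": sqrt(ms_within) / grand,
        "icc": var_between / var_total,         # between variance / total variance
        "vr": ms_within / var_total,            # error variance / total variance
    }


# Hypothetical normalized EMG values: one row per subject, one column per trial.
example_trials = [[10, 12], [20, 18], [30, 33], [40, 39], [50, 52]]
summary = repeated_measures_summary(example_trials)
trial_2_cv = intersubject_cv([row[1] for row in example_trials])
```

Under these assumptions the VR equals one minus the ICC exactly, which is the functional relationship noted in the text; other ANOVA designs change the error term and therefore the values.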
Normalization Procedures
The literature contains numerous examples of techniques for normalizing EMG data. Whereas MVIC is common, as seen in works by Woods and Bigland-Ritchie33, Neumann and Cook19, and Soderberg et al.24, a number of investigators have used submaximal MVIC contractions34,35 or a value taken from the dynamic event under study4,16,20,35. Knutsson and Richards normalized their data to peak dynamic integrated EMG for describing variations in the gait pattern of subjects with hemiplegia and differences between the subjects with hemiplegia and those with normal motor control16. In a related study on improving the sensitivity of electromyography as a diagnostic tool in gait analysis, Yang and Winter advocated, of four normalization factors, either the peak or mean of the subject ensemble average because these values produced lower intersubject CVs than data normalized to 50% MVIC or to a mean EMG per unit of isometric moment of force calibration method35.
TABLE 3. Fictitious data sets and comparison of analytic techniques

| Subject | A1 | B2 | C3 | D4 | E5 |
|---|---|---|---|---|---|
| 1 | 5 | 5 | 20 | 20 | 5 |
| 2 | 7 | 7 | 22 | 7 | 22 |
| 3 | 9 | 9 | 24 | 24 | 9 |
| 4 | 11 | 11 | 26 | 11 | 26 |
| 5 | 12 | 12 | 27 | 27 | 12 |
| 6 | 14 | 14 | 29 | 14 | 29 |
| 7 | 16 | 16 | 31 | 31 | 16 |
| 8 | 22 | 22 | 37 | 22 | 37 |
| 9 | 24 | 24 | 39 | 39 | 24 |
| 10 | 23 | 23 | 38 | 23 | 38 |
| 11 | 21 | 21 | 36 | 36 | 21 |
| 12 | 19 | 19 | 34 | 19 | 34 |
| 13 | 17 | 17 | 32 | 32 | 17 |
| 14 | 15 | 15 | 30 | 15 | 30 |
| 15 | 26 | 26 | 41 | 41 | 26 |
| 16 | 19 | 19 | 34 | 19 | 34 |
| 17 | 14 | 14 | 29 | 29 | 14 |
| 18 | 9 | 9 | 24 | 9 | 24 |
| 19 | 15 | 15 | 30 | 30 | 15 |
| 20 | 21 | 21 | 36 | 6 | 21 |

Comparisons

| Comparison | Mean of first set | Mean of second set | SD6 of first set | SD6 of second set | r | t | ICC |
|---|---|---|---|---|---|---|---|
| A vs. B | 15.95 | 15.95 | 5.93 | 5.93 | 1.0 | NS | 1.0 |
| A vs. C | 15.95 | 30.95 | 5.93 | 5.93 | 1.0 | sig | 0.23 |
| D vs. E | 22.70 | 22.70 | 10.43 | 9.31 | -0.21 | NS | -0.23 |

1Original trial data, replication 1. 2Original trial data, replication 2. 3Values resulting from adding 15 to each column A score. 4Values resulting from alternately adding 15 to column A scores, starting with subject 1. 5Values resulting from alternately adding 15 to column A scores, starting with subject 2. 6SD is standard deviation. r = Pearson’s correlation coefficient. t = Student’s t test. ICC = Intraclass correlation coefficient. NS = not significant; sig = significant.
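The contrast between association and agreement in Table 3 can be checked numerically. The following Python sketch is offered as an illustration under stated assumptions, not as the authors' procedure: the form of the ICC used for Table 3 is not stated, so a two-way random-effects, single-measure ICC is assumed here, and the helper names `pearson_r` and `icc_two_way` are hypothetical. Columns D and E are entered so that each has the tabled mean of 22.70.

```python
from math import sqrt
from statistics import mean

# Fictitious data sets of Table 3 (20 subjects).
A = [5, 7, 9, 11, 12, 14, 16, 22, 24, 23, 21, 19, 17, 15, 26, 19, 14, 9, 15, 21]
B = list(A)                  # identical replication
C = [a + 15 for a in A]      # constant offset of 15
D = [20, 7, 24, 11, 27, 14, 31, 22, 39, 23, 36, 19, 32, 15, 41, 19, 29, 9, 30, 6]
E = [5, 22, 9, 26, 12, 29, 16, 37, 24, 38, 21, 34, 17, 30, 26, 34, 14, 24, 15, 21]


def pearson_r(x, y):
    """Pearson product moment correlation: association only."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def icc_two_way(x, y):
    """Single-measure ICC from a two-way ANOVA (assumed form) for two trials."""
    n, k = len(x), 2
    grand = mean(x + y)
    row_means = [(a + b) / 2 for a, b in zip(x, y)]   # subject means
    col_means = [mean(x), mean(y)]                    # trial means
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for v in x + y)
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)


results = {
    "A vs. B": (pearson_r(A, B), icc_two_way(A, B)),
    "A vs. C": (pearson_r(A, C), icc_two_way(A, C)),
    "D vs. E": (pearson_r(D, E), icc_two_way(D, E)),
}
```

With this assumed ICC form, A vs. B gives r and ICC of 1.0, A vs. C gives r of 1.0 with an ICC of about 0.24, and D vs. E gives r of about -0.21 with an ICC of about -0.23, which is close to the values in Table 3. The constant offset in column C leaves r untouched but is penalized by the ICC through the trial mean square.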
METHODS
Subjects
Data were obtained from 20 normal subjects and 20 subjects with anterior cruciate ligament (ACL) dysfunction who were participating in a larger study addressing differences in neural control mechanisms employed during a balance board activity. All subjects signed consent forms approved by the institutional review committee. Subject descriptive characteristics are shown in Table 4.

TABLE 4. Subject descriptive statistics

| Characteristic | ACL deficient (n = 20) mean | ACL deficient range | Normal (n = 20) mean | Normal range |
|---|---|---|---|---|
| Male (n = 13) | | | | |
| Age (yrs) | 26.3 | 20-38 | 27.2 | 20-47 |
| Height (in) | 69.8 | 65-74 | 71.6 | 67-77 |
| Weight (lbs) | 174.3 | 140-245 | 180.6 | 150-210 |
| LOD* (yrs) | 7.1 | 1.75-19 | | |
| Female (n = 7) | | | | |
| Age (yrs) | 26.3 | 19-43 | 25.9 | 23-37 |
| Height (in) | 67.7 | 63-70 | 65.4 | 63-69 |
| Weight (lbs) | 139.6 | 110-165 | 132.0 | 116-165 |
| LOD* (yrs) | 6.6 | 0.33-27 | | |

*LOD = Length of deficiency.

Data Collection and Processing
Two silver-silver chloride electrodes, 8 mm in diameter with 22 mm between centres, embedded in a plastic mounting measuring 17 × 33 × 10 mm and containing circuitry for preamplification with a gain of 35, were placed over the medial gastrocnemius muscle. Placement was determined by marking 33% of the distance from the fibular head to the distal edge of the lateral malleolus, measuring the leg circumferentially, then placing the electrode medially from the tibial crest at 33% of the circumference. The electrode lead was connected to a main amplifier (model GCS 67, Therapeutics Unlimited, Iowa City, IA) which provided selectable gain from 500 to 10 000, a bandwidth from 40 Hz to 6 kHz, and a common mode rejection ratio of 87 dB at 60 Hz. Input impedance was greater than 15 megohms at 100 Hz. From the main amplifier raw EMG signals were sent to an FM tape recorder (model 3986A, Hewlett Packard, San Diego, CA) for later transfer, at 1000 Hz per channel, to an AST computer (AST Research, Taiwan). Transfer was facilitated by an analogue to digital MetraByte board and a signal transfer software program, Streamer (MetraByte Corporation, Taunton, MA). Five channels of EMG data were recorded; however, only the data from the gastrocnemius muscle were included in this study.
The device used for the activity was a computerized platform supporting a balance board (Figure 1). We selected this activity because of its clinical relevance to patient rehabilitation of lower extremity pathologies. The cyclic activity yielded data appropriate to the purposes of this study. Prior to performing the activity on the balance board each subject performed three ankle plantar flexion MVICs. These were completed in long sitting with the ankle at 90° as the subject held each end of a towel that passed under the foot. An investigator verbally encouraged the subject and also applied maximum resistance to the subject’s foot. Each subject then completed the task of rotating the board counter-clockwise with the board raised to a height of 7.5 cm. Multiple board revolutions, as many as 10, were completed for each of two trials. All revolutions were completed within 15-30 s. The second set of revolutions, trial 2, was completed after a 2 min rest. One investigator provided the subject with instructions for standing and rotating on the board and followed this with verbal cues as needed so that similar postures were maintained between revolutions and trials.
FIG. 1. Subject performing on the balance board prior to EMG electrode application. The backs of hands on the vertical supports are allowed for maintaining balance.

A complete revolution was defined when the board crossed a switch mounted on the supporting surface and produced a voltage change in the switch output. Data produced from the two trials were used for analysis. Signal processing was performed with the DATAPAC II (Run Technologies, Laguna Hills, CA) software program. After full wave rectifying the EMG data from the MVIC, a computer function was used to obtain the mean of a 2 to 3 s portion of the contraction, and the highest value of the three contractions was selected for the reference measure. For the task related EMG records, signals were full wave rectified, marked according to the cyclic revolutions on the balance board, and averaged across all revolutions within the trial. To accomplish the averaging, cyclic revolutions were time normalized to 100%, and 20 points representing the mean within each 5% interval were retained for use when averaging across cycles and forming the subject ensemble averages.
Data Analysis
Data from the MVIC tests and the cyclic activity were transferred from the personal computer to an IBM mainframe computer. The peak and mean dynamic (peak-d and mean-d) EMG reference values were then identified, with the peak-d being the highest and the mean-d being the mean EMG value found during the ensemble average. The next step was to amplitude normalize the data using a computer subroutine which divided dynamic EMG values from each 5% interval consecutively by the three normalization values. Amplitude normalized subject files were then used to form group means for each trial for both normal and ACL deficient groups. Descriptive statistics were determined on each trial and group and included means, standard deviations, and intersubject CVs for each 5% interval of the ensemble average. Also for each 5% interval, one-way ANOVAs were conducted comparing the two trials for both normal and ACL deficient groups for each normalization procedure. The resulting ANOVA tables yielded the variance components necessary to calculate the VR, the ICC, and the intrasubject CVs. The values of importance to the purpose of this study were the intersubject CV, intrasubject CV, VR and ICC. The reported intersubject CV was based only on the second of the two consecutive trials and addresses the variability between individuals in the group for that trial. The other values represent the contrast between the two similar trials completed within the same day.
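The processing and normalization steps described above can be sketched in Python as follows. This is an illustrative outline only, not the DATAPAC II or mainframe routines used in the study. It assumes each revolution is available as a list of raw samples with at least 20 samples per revolution, and the function names, the synthetic signals and the MVIC reference value are hypothetical.

```python
from statistics import mean


def rectify(signal):
    """Full wave rectification: absolute value of every sample."""
    return [abs(v) for v in signal]


def time_normalize(cycle, bins=20):
    """Reduce one revolution to `bins` points, each the mean of a 5% interval."""
    n = len(cycle)  # assumed to be at least `bins` samples
    return [mean(cycle[b * n // bins:(b + 1) * n // bins]) for b in range(bins)]


def ensemble_average(cycles, bins=20):
    """Average the time normalized, rectified revolutions point by point."""
    profiles = [time_normalize(rectify(c), bins) for c in cycles]
    return [mean(points) for points in zip(*profiles)]


def amplitude_normalize(profile, reference):
    """Express each 5% interval as a percentage of the reference value."""
    return [100.0 * v / reference for v in profile]


# Hypothetical raw signals for two revolutions of different duration.
cycles = [[(i % 50) - 25 for i in range(200)],
          [(i % 40) - 20 for i in range(160)]]
profile = ensemble_average(cycles)

mvic_ref = 60.0              # hypothetical MVIC reference (mean rectified EMG)
peak_d = max(profile)        # peak of the dynamic ensemble average
mean_d = mean(profile)       # mean of the dynamic ensemble average

by_mvic = amplitude_normalize(profile, mvic_ref)
by_peak = amplitude_normalize(profile, peak_d)
by_mean = amplitude_normalize(profile, mean_d)
```

By construction the peak-d normalized profile has a maximum of 100% and the mean-d normalized profile has a mean of 100%. Values normalized to the MVIC depend on how the dynamic activity compares with the isometric reference.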
RESULTS
Data normalized to MVIC, peak-d and mean-d for the two trials and averaged across all normal subjects are shown in Figure 2. Similar data are shown in Figure 3 for subjects with ACL deficient knees. The group values for the three normalization techniques vary from less than 20% for MVIC to almost 140% for mean-d due to values of the denominators used in the normalization equation. Although data points were obtained for each of the 20 intervals, focus for presentation of results is directed to the three most representative values, the median, the lowest value, and the highest value across the intervals for both the normal and ACL deficient groups. The median was chosen over the mean because the median is less sensitive to extreme values. Table 5 shows the intersubject CV results for one trial and Table 6 shows the results from the precision and reproducibility focus across two trials.

FIG. 2. EMG normalized by three different techniques for two trials for the normal subjects (panels: MEAN-D, PEAK-D and MVIC; horizontal axis: percent of cycle). Means are grand means for all subjects. Standard deviation lines are shown only unidirectionally. Light lines directed upwards are for trial 1 data. Darker lines directed downwards are for trial 2 data.

To determine whether or not the ICC values reported in Table 6 were significantly different, a Friedman test for nonparametric data was performed for each subject group. This test, based on EMG values for all 20 intervals of the gait cycle, found significant differences between the three normalization values for normal subjects (χ² = 24.70, 2 df, P < 0.001) and for ACL subjects (χ² = 13.30, 2 df, P < 0.001). Follow-up tests for normal subjects showed the ICC value was significantly greater (P < 0.05) for the data normalized to MVIC vs. peak-d or mean-d. No significant difference in the correlation values was found between mean-d and peak-d. Follow-up tests for ACL subjects showed the ICC values for data normalized to MVIC were significantly greater (P < 0.05) than for data normalized to mean-d. Additionally, the ICC for peak-d was significantly greater than the ICC for mean-d. No significant difference was seen in the correlation values for MVIC vs. peak-d.
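For readers unfamiliar with the Friedman test, a minimal Python sketch of the statistic is given below. It is an illustration, not the software used in the study. It treats the 20 intervals as blocks and the three normalization values as conditions, uses average ranks for ties without a tie correction, and relies on the fact that for 2 degrees of freedom the chi-square upper tail probability is exp(-χ²/2). The function names and the example data are hypothetical.

```python
from math import exp


def average_ranks(row):
    """Rank the values in one block from 1..k, averaging ranks over ties."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of the tied rank positions
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks


def friedman_chi2(blocks):
    """Friedman chi-square for n blocks x k conditions (no tie correction)."""
    n, k = len(blocks), len(blocks[0])
    rank_sums = [0.0] * k
    for row in blocks:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


def p_value_2df(chi2):
    """Upper tail probability of a chi-square variate with 2 df."""
    return exp(-chi2 / 2.0)


# Hypothetical extreme case: the first condition is highest in all 20 intervals.
blocks = [[3.0, 2.0, 1.0] for _ in range(20)]
chi2_max = friedman_chi2(blocks)   # equals n * (k - 1) = 40 in this case
```

Applied to the reported statistics, `p_value_2df(24.70)` is roughly 4e-6 and `p_value_2df(13.30)` is roughly 0.0013.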
210 220 200
u
z
160
9
160
r 2 g
Y s
140 120 100 60
% 60
40 20 0
5 101520253035404550556065707560659095100 PERCENT
OF CYCLE
PEAK-D 240
ACL
-
DISCUSSION
220
PERCENT
M”,C 240
OF CYCLE
-
ACL
-
220 200 ;
1(10-
2
160-
?
140
? 0 =
120.
Y 5
so-
%
60
100
20 tiiiiiitii,,*,,,.,f
10 0 0
5
IO 15 20 25 30 35 40 45 50 55 50 65 70 75 e.0 85 90 95100 PERCENT
OF CYCLE
FIG. 3. EMG normalized by three different techniqlies for two trials for the anterior cruciate ligament (ACL) deficient subjects. Standard deviation lines are shown only unidirectionally. Light lines directed upwards are for trial 1 data. Darker lines directed downwards are for trial 2 data.
The EMG values normalized to the dynamic contractions are higher than those normalized to the MVIC (Figures 2 and 3) by virtue of the lower number in the denominator of the normalization equation when the dynamic values are used. Further inspection of these figures suggests that reproducibility of the balance board task is highest when the MVIC is used; the standard deviation (represented by the light and dark vertical lines on the plots) is less with the MVIC than with the dynamic values. Although the absolute difference may not be so great when relative differences between values derived from the two trials are considered, the higher ICCs for the MVIC shown in Table 6 (0.80 in normal and 0.58 in ACL subjects) similarly, and more importantly, support the higher reproducibility with the MVIC compared with peak-d or mean-d. The finding in ACL subjects (Table 6) that the peak-d reproducibility value was closer to the value found for MVIC than to the value for mean-d may reflect neural control factors for this group. Less variability may be possible with peak activity than with typical or average performance.
We presented the intersubject CV in a separate table, Table 5, to underscore that intersubject variability is not a statistical measure of reliability. The intersubject CV describes variability of subjects within the group for one data set, in this case the data from trial 2. We included the measure to compare our results to Yang and Winter's study35 aimed at suggesting the best normalization procedure. If we were to limit our recommendation to low intersubject CV, our results would agree with those of Yang and Winter that the mean dynamic EMG is the best criterion for normalization. Similarly, precision, reflected by a low intrasubject CV, is higher when mean-d or peak-d is used (Table 6). However, reproducibility, reflected by VR and ICC, is greatest with MVIC (Table 6). Our results confirm statistical knowledge associating low intersubject variability from one trial (Table 5, mean-d and peak-d) with low reproducibility when the trial is repeated (Table 6, mean-d and peak-d). Thus a recommendation for a reference contraction based on one criterion may have a counterproductive effect from an alternative view.
Chen and Shiavi4 used the mean EMG as the normalization value based on previous recommendations that low intersubject CV was desirable32,35. We suggest the 1984 and 1987 recommendations may not be desirable from a reproducibility viewpoint. Further, we feel caution should be exercised in seeking ways to reduce intersubject CV. The template or window defined by intersubject CV must adequately reflect the population to avoid chances of falsely identifying deviance, i.e. false positives.

TABLE 5. Median and range (in parentheses) of intersubject coefficients of variation (CV) for three methods of normalizing data [maximum voluntary contraction (MVIC), and peak and mean dynamic contractions (peak-d, mean-d)] for one trial of EMG measurements for balance board activity for subjects with normal and anterior cruciate ligament (ACL) deficient knees

Group          Method   CV inter %
Normals        MVIC     91.3 (79.8-105.6)
Normals        Peak-d   41.9 (26.3-52.5)
Normals        Mean-d   37.2 (25.8-54.3)
ACL deficient  MVIC     74.9 (48.9-90.3)
ACL deficient  Peak-d   47.6 (26.6-69.1)
ACL deficient  Mean-d   38.2 (24.5-54.1)

TABLE 6. Median and range (in parentheses) of values for three statistical procedures and three methods of normalizing data [maximum voluntary contraction (MVIC), and peak and mean dynamic contractions (peak-d, mean-d)] for between trial EMG measurements for balance board activity for subjects with normal and anterior cruciate ligament (ACL) deficient knees

Group          Method   CV intra %         VR                ICC
Normals        MVIC     38.1 (27.2-51.5)   0.21 (0.09-0.35)  0.80 (0.66-0.92)
Normals        Peak-d   23.8 (17.4-30.1)   0.50 (0.19-0.81)  ? (0.2?-0.62)
Normals        Mean-d   26.5 (21.5-34.7)   0.35 (0.18-0.93)  0.66 (0.07-0.82)
ACL deficient  MVIC     41.8 (25.6-69.8)   0.43 (0.13-0.80)  0.58 (0.26-0.89)
ACL deficient  Peak-d   31.3 (22.0-51.7)   0.53 (0.31-0.81)  0.52 (0.15-0.69)
ACL deficient  Mean-d   29.8 (21.?-46.0)   0.61 (0.34-0.87)  0.39 (0.09-0.67)
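The effect of the reference value on the size of the normalized EMG, noted at the start of this discussion, can be illustrated with a small sketch. It assumes that normalization expresses each linear-envelope sample as a percentage of the reference value, and that the MVIC EMG exceeds the peak of the task envelope. The function names and the sample values are hypothetical and are not taken from the study data.

```python
from statistics import mean


def normalize(envelope, reference):
    """Express each sample of a linear envelope as a percentage of a reference value."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return [100.0 * sample / reference for sample in envelope]


def normalize_three_ways(envelope, mvic_emg):
    """Normalize one task envelope to the MVIC, peak-d and mean-d reference values."""
    return {
        "MVIC": normalize(envelope, mvic_emg),
        "peak-d": normalize(envelope, max(envelope)),   # peak of the dynamic task
        "mean-d": normalize(envelope, mean(envelope)),  # mean of the dynamic task
    }


# Hypothetical linear-envelope samples (microvolts) and MVIC EMG, for illustration only.
task_envelope = [10.0, 40.0, 25.0, 5.0]
mvic_emg = 80.0

for method, values in normalize_three_ways(task_envelope, mvic_emg).items():
    print(method, [round(v, 2) for v in values])
```

Because the mean of the task envelope is smaller than its peak, and the peak is assumed here to be smaller than the MVIC EMG, the same envelope yields progressively larger percentages as the reference moves from MVIC to peak-d to mean-d.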
The methodological techniques used to collect and analyse the data may have influenced the results. We used, however, accepted procedures for collecting, transferring, rectifying, and normalizing data. These steps allowed creation of linear envelopes, which are now probably the most common form of reduction technique applied to EMG data. Linear envelopes have been used in a wide number of applications4,10,30,35. Because the analysis was completed for only the gastrocnemius, results may be different for other muscles. However, we have no indication of this from a review of the data available for four other muscles. The availability of more than two trials may have strengthened the reliability. For example, three trials have been shown to improve the reliability coefficients34. However, the ICCs ranging from 0.52 to 0.80 (Table 6), with the exception of a 0.39, agree with or are better than previously reported ICCs ranging from 0.52 to 0.61 for within days and across one to an infinite number of trials34. Because we did not choose to collect data so that normalization could be done at other levels of MVIC, there is a possibility that the coefficients derived from this study could be improved further with a method using submaximal MVICs. Other reports have favoured the submaximal MVIC over the MVIC34.
Results favouring the MVIC hold for the ACL group as well as the normal group. Other comparisons between the normal subjects and those with a verified pathology are possible. Evidence is available showing that persons with orthopaedic disorders can have dysfunction manifested by changes in both temporal sequencing and amplitude of muscular activity as measured by electromyography20,22. The present study also confirms differences between the normal group and the ACL group in the magnitude of the intersubject CVs (Table 5) and in the magnitudes and some orders of the intrasubject CVs, VRs and ICCs across the three normalization procedures (Table 6). For the MVIC, the ACL group showed less variability (lower intersubject CV) and subsequently lower reproducibility (lower ICC) than the normal group. Precision also differed between the two groups, with higher intrasubject CVs for the ACL group suggesting less precision than seen with the normal group. This latter finding may differ from that for subjects with central nervous system deficit. Kadaba, in unpublished data, reported that intrasubject CV values were similar or better (lower) for a group of children with cerebral palsy compared to a normal group.
Findings in several areas of EMG analysis which may be unique by type of disability should be of interest to examiners who apply EMG methodology in diagnosis and evaluation of treatment. This study does not address the difficulty which may be encountered in securing a maximal contraction, such as for persons with neurological dysfunction. Further study is needed to determine if the findings from this study can be applied to other groups or if another standard which enhances reproducibility in that group exists.
SUMMARY AND CONCLUSIONS
This study has evaluated an EMG data set collected from the gastrocnemius muscle during the performance of rotation on a balance board. Data from two consecutive trials were normalized with three techniques: the MVIC, and the peak and mean of the dynamic contraction. Analyses were done by means of four procedures: inter and intrasubject coefficients of variation, the variance ratio, and the intraclass correlation coefficient. Based on the reproducibility data from this study, we conclude that for within day performance, the standard for normalization in adult normal and orthopaedically impaired subjects should be the maximum voluntary isometric contraction.
REFERENCES
1. Arsenault AB, Winter DA, Marteniuk RG: Is there a 'normal' profile of EMG activity in gait? Med & Biol Eng & Comput 24:337-343, 1986.
2. Arsenault AB, Winter DA, Marteniuk RG, Hayes KC: How many strides are required for the analysis of electromyographic data in gait? Scand J Rehab Med 18:133-135, 1986.
3. Bartko JJ, Carpenter WT: On the methods and theory of reliability. J Nerv & Ment Dis 163:307-317, 1976.
4. Chen JJ, Shiavi R: Temporal feature extraction and clustering analysis of electromyographic linear envelopes in gait studies. IEEE Trans Biomed Eng 37:295-302, 1990.
5. De Vries HA: 'Efficiency of electrical activity' as a physiological measure of functional state of muscle tissue. Am J Phys Med 47:10-22, 1968.
6. Di Fabio RP: Reliability of computerized surface electromyography for determining the onset of muscle activity. Phys Ther 67:43-48, 1987.
7. Francis K: Computer communication. Phys Ther 66:1140-1143, 1986.
8. Giroux B, Lamontagne M: Comparisons between surface electrodes and intramuscular wire electrodes in isometric and dynamic conditions. Electromyogr Clin Neurophysiol 30:397-405, 1990.
9. Gollhofer A, Horstmann GA, Schmidtbleicher D, Schonthal D: Reproducibility of electromyographic patterns in stretch-shortening type contractions. Eur J Appl Physiol 60:7-14, 1990.
10. Graham GP: Reliability of electromyographic measurements after surface electrode removal and replacement. Perceptual & Motor Skills 49:215-218, 1979.
11. Hershler C, Milner M: An optimality criterion for processing electromyographic (EMG) signals relating to human locomotion. IEEE Trans Biomed Eng 25:413-420, 1978.
12. Horstmann GA, Gollhofer A, Dietz V: Reproducibility and adaptation of the EMG responses of the lower leg following perturbations of upright stance. Electroencephalogr Clin Neurophysiol 70:447-452, 1988.
13. Jonsson B, Reichmann S: Reproducibility in kinesiologic EMG - investigations with intramuscular electrodes. Acta Morph Neerl Scand 7:73-90, 1968.
14. Kadaba MP, Wootten ME, Gainey J, Cochran GVB: Repeatability of phasic muscle activity: Performance of surface and intramuscular wire electrodes in gait analysis. J Orthop Res 3:350-359, 1985.
15. Kadaba MP, Ramakrishnan HK, Wootten ME, Gainey J, Gorton G, Cochran GVB: Repeatability of kinematic, kinetic, and electromyographic data in normal adult gait. J Orthop Res 7:849-860, 1989.
16. Knutsson E, Richards C: Different types of disturbed motor control in gait of hemiparetic patients. Brain 102:405-430, 1979.
17. Komi PV, Buskirk ER: Reproducibility of EMG measurements with inserted wire electrodes and surface electrodes. Electromyography 4:357-367, 1970.
18. Lippold OCJ: The relation between integrated action potentials in a human muscle and its isometric tension. J Physiol 117:492-499, 1952.
19. Neumann DA, Cook TM: Effect of load and carry position on the electromyographic activity of the gluteus medius muscle during walking. Phys Ther 65:305-311, 1985.
20. Shiavi R, Limbird T, Borra H, Edmondstone MA: Electromyography profiles of knee joint musculature during pivoting: Changes induced by anterior cruciate ligament deficiency. J Electromyogr Kinesiol 1:48-57, 1991.
21. Shrout PE, Fleiss JL: Intraclass correlations: Uses in assessing rater reliability. Psychol Bull 86:420-428, 1979.
22. Sinkjaer T, Arendt-Nielsen L: Knee stability and muscle coordination in patients with anterior cruciate ligament injuries: An electromyographic approach. J Electromyogr Kinesiol 1:209-217, 1991.
23. Snedecor GW, Cochran WG: Statistical Methods, Sixth Edition, The Iowa State University Press, Ames, Iowa, 1967.
24. Soderberg GL, Cook TM, Rider SC, Stephenitch BL: Electromyographic activity of selected leg musculature in subjects with normal and chronically sprained ankles performing on a BAPS board. Phys Ther 71:514-522, 1991.
25. Standards for tests and measurements in physical therapy practice. Phys Ther 71:589-622, 1991.
26. Veiersted KB: The reproducibility of test contractions for calibration of electromyographic measurements. Eur J Appl Physiol 62:91-98, 1991.
27. Viitasalo JT, Komi PV: Signal characteristics of EMG with special reference to reproducibility of measurements. Acta Physiol Scand 93:531-539, 1975.
28. Viitasalo JT, Saukkonen S, Komi PV: Reproducibility of measurement of selected neuromuscular performance variables in man. Electromyogr Clin Neurophysiol 20:487-501, 1980.
29. Webster's New Collegiate Dictionary, G & C Merriam Co, Springfield, MA, 1977.
30. Winter DA: Pathologic gait diagnosis with computer-averaged electromyographic profiles. Arch Phys Med Rehabil 65:393-398, 1984.
31. Winter DA, Marteniuk RG: Is there a 'normal' profile of EMG activity in gait? Med & Biol Eng & Comput 24:337-343, 1986.
32. Winter DA, Yack HJ: EMG profiles during normal human walking: Stride-to-stride and inter-subject variability. Electroencephalogr Clin Neurophysiol 67:402-411, 1987.
33. Woods JJ, Bigland-Ritchie B: Linear and non-linear surface EMG/force relationships in human muscles. Amer J Phys Med 62:287-299, 1983.
34. Yang JF, Winter DA: Electromyography reliability in maximal and submaximal isometric contractions. Arch Phys Med Rehabil 64:417-420, 1983.
35. Yang JF, Winter DA: Electromyographic amplitude normalization methods: Improving their sensitivity as diagnostic tools in gait analysis. Arch Phys Med Rehabil 65:517-521, 1984.
APPENDIX A
Statistical Measures of Reliability
The purpose of reliability studies is to estimate the consistency or reproducibility of observations made on the same subject or experimental unit. In the simplest study of this sort, multiple observations are obtained on a set of subjects. If the number of subjects is represented by n and the number of measurements on each subject by k, then each observation can be written mathematically as
yij = xi + eij, where i = 1, 2, ..., n and j = 1, 2, ..., k.
For interval scale variables, xi is the true value for the ith subject and eij is the error for the jth observation on the ith subject. We assume that each xi is independent of the next, that the subjects for whom xi values are observed are randomly selected from a population of subjects, and that the population variance is V(S). This variance is sometimes called the between subjects variance. The error terms eij are also assumed to be independent, with variance V(E) called the error variance or the within subject variance. The goal for any measurement technique is to achieve a small error variance V(E); the ideal would be to have this variance equal to zero. Clearly, this is never possible.
For interval scale variables one measure of reliability is the correlation between repeated observations made on the same subject. It can be shown that, for this simple model, the correlation is given by the formula V(S)/(V(S)+V(E)). This correlation will be 1.0 when V(E) is zero and close to zero when V(E) is large relative to V(S). This correlation is called the intraclass correlation coefficient or ICC. Good reproducibility is indicated when the ICC is near one, and bad reproducibility when it is near zero. The ICC is near one when V(E) is small relative to the total variance (V(S)+V(E)).
The ICC can be estimated in several ways, but one of the simplest methods is to compute a one-way analysis of variance (ANOVA) with subject number as the factor or class designator. This analysis yields the mean square between subjects MS(B) and the mean square for error MS(E) (also called the mean square within subject). The observed MS(E) is an unbiased estimate of the unknown error variance V(E). As shown by Snedecor and Cochran23, the expression (MS(B) - MS(E))/k, where k is the number of replications of the measurement, yields an unbiased estimate of V(S).
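The variance estimates quoted above follow from the expected mean squares of the balanced one-way random effects ANOVA. A brief sketch of that step, using the symbols of this appendix and assuming k measurements on every subject, is:

```latex
\begin{align*}
E[\mathrm{MS}(E)] &= V(E), \\
E[\mathrm{MS}(B)] &= V(E) + k\,V(S), \\
\Rightarrow\quad
E\!\left[\frac{\mathrm{MS}(B) - \mathrm{MS}(E)}{k}\right] &= \frac{\bigl(V(E) + k\,V(S)\bigr) - V(E)}{k} = V(S).
\end{align*}
```

Dividing the difference of the two mean squares by k therefore isolates the between subjects variance, while MS(E) alone estimates the error variance.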
The quantities MS(B) and MS(E) computed from the one-way ANOVA can in fact be used to calculate both the ICC and the VR. The ICC is defined as
ICC = V(S)/(V(S) + V(E))
and is estimated by
ICC = (MS(B) - MS(E))/(MS(B) + (k-1)*MS(E)).
The variance ratio (VR) is defined as
VR = V(E)/(V(S) + V(E))
and is estimated by
VR = (k*MS(E))/(MS(B) + (k-1)*MS(E)).
The VR lies between zero and one; values near zero indicate good reproducibility, values near 1 indicate bad reproducibility. Reproducibility is good when the error variance is small relative to the total variance. Note that the VR = 1 - ICC. Values of VR near zero correspond to values of the ICC near 1. Both the ICC and the VR relate the values of V(S) to V(E) and reproducibility is good when the error variance is small relative to the between subject variance. This suggests that when doing reliability studies the magnitude of the between subject variance is crucial. If there is little between subject variability then by definition the ICC is small (near zero) and the VR is large (near 1). Both these measures can be viewed as quantitating the user’s ability to distinguish between subjects. If there is no variation among subjects then it will be impossible to distinguish between subjects. In planning reliability studies it is thus important to attempt to sample from the full population of interest so that the estimated between subject variance will be a good estimate of the true variance in the population of interest. Poor sample selection can lead to underestimating the reproducibility and thus the usefulness of a measurement technique.
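The estimators in this appendix can be sketched in a few lines of code. The example below is only an illustration under the simple model described here (n subjects, k measurements each, balanced design); the function names and the two small data sets are hypothetical and are not the study data.

```python
from statistics import mean


def one_way_anova(data):
    """Return (MS(B), MS(E), k) for n subjects with k repeated measurements each."""
    n = len(data)
    k = len(data[0])
    if n < 2 or k < 2 or any(len(row) != k for row in data):
        raise ValueError("need at least 2 subjects, each with the same k >= 2 measurements")
    grand_mean = mean(value for row in data for value in row)
    subject_means = [mean(row) for row in data]
    ss_between = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_within = sum((value - m) ** 2 for row, m in zip(data, subject_means) for value in row)
    ms_b = ss_between / (n - 1)        # mean square between subjects
    ms_e = ss_within / (n * (k - 1))   # mean square for error (within subject)
    return ms_b, ms_e, k


def icc_and_vr(data):
    """Estimate the ICC and the VR from the one-way ANOVA mean squares."""
    ms_b, ms_e, k = one_way_anova(data)
    denominator = ms_b + (k - 1) * ms_e
    icc = (ms_b - ms_e) / denominator
    vr = (k * ms_e) / denominator
    return icc, vr


# Hypothetical two-trial data for four subjects (illustration only).
wide_group = [[10, 12], [20, 18], [30, 33], [40, 39]]    # subjects differ a lot
narrow_group = [[24, 26], [27, 25], [25, 28], [29, 27]]  # subjects are similar

for label, group in (("wide", wide_group), ("narrow", narrow_group)):
    icc, vr = icc_and_vr(group)
    print(label, round(icc, 3), round(vr, 3))
```

The two groups have similar trial-to-trial error, but the narrow group has little between subject variance, so its estimated ICC is much lower and its VR much higher. With sample data the estimates can also fall outside the zero to one range when MS(B) happens to be smaller than MS(E).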
The definitions given in this appendix are all based on a very simple model. The basic assumptions are: subjects are randomly selected, their performances are independent of each other, and the experimental and environmental conditions do not change from one measurement to the next. Many reliability studies are not this simple and do not satisfy these basic assumptions. For example, it may be necessary to acquire observations on the same individual on different days or to have different observers obtain them. While the basic concepts remain the same, the models used to estimate the reliability coefficients must take into account the added sources of variation. In most cases the adjustments are easy to make and it is important to make them. Failure to account for these sources of variation will inflate the estimates of the error variance and, in most instances, will result in underestimating the true reproducibility of the measures. The procedures for collecting the data in the study described in this paper involved recording data for two trials on the same day and under the same conditions. Because data were entered directly into the computer, observer bias was considered eliminated and the assumptions described were considered satisfied.