Electroencephalography and clinical Neurophysiology, 86 (1993) 219-223
© 1993 Elsevier Scientific Publishers Ireland, Ltd. 0013-4649/93/$06.00
EEG 92658
Individual reliability of amplitude distribution in topographical mapping of EEG
Adrian Burgess and John Gruzelier
Department of Psychiatry, Charing Cross and Westminster Medical School, Hammersmith, London W6 (UK)
(Accepted for publication: 11 January 1993)
Summary
Whilst there is an accumulation of evidence suggesting that many quantitative EEG parameters show good stability and reliability, no previous study has considered whether the spatial distribution of EEG amplitude is reliable over time within a session. This study reports on the spatio-temporal reliability of EEG using data recorded from 24 subjects in a baseline condition with eyes open and also whilst performing a simple motor task. Both the internal stability and test-retest reliability for electrode parameters were comparable to previously published data. For most individuals, amplitude distribution was stable within each recording condition, but the test-retest reliability after 40 min was less good with the poorest reliability in the delta frequency band. Most subjects showed spatio-temporal reliability of less than 0.7 in at least one frequency band. In contrast, spatio-temporal reliability for the group average was good and exceeded 0.88 in all frequency bands. It is argued that the results indicate that reliability is insufficient to allow topographical comparisons for a single individual, but is more than adequate to allow group comparisons.
Key words: EEG reliability; EEG topography
Correspondence to: Mr. A. Burgess, Dept. of Psychiatry, Charing Cross and Westminster Medical School, St. Dunstan's Road, Hammersmith, London W6 (UK). Tel.: 081 746 5641; Fax: 081 746 5515.
The importance of establishing the reliability of quantitative EEG measures has been increasingly recognised in recent years and a number of reports have shown that absolute and relative power measures are reliable over time. Salinsky et al. (1991) found test-retest reliabilities of greater than 0.8 for both absolute power and relative power in adults with a 12-16 week interval between assessments. Within a single recording session, with a delay of 5 min, most reliabilities exceeded 0.9. Similarly, Pollock et al. (1991) reported mean reliabilities in a group of elderly subjects with a 4-5 month interval of greater than 0.74 for absolute power, though the reliabilities in delta were rather poorer. There is some evidence that reliabilities are maintained over even longer periods of time. Gasser et al. (1985) found mean reliabilities of 0.68 and 0.69 for absolute and relative power respectively in a sample of children with an average 10 month interval between assessments. Again, the lowest reliabilities were in delta, though the reliability of beta 1 absolute power was also relatively low (0.58).
Other reports have addressed the issue of the stability or internal consistency of EEG measures. Oken and Chiappa (1988), for example, found that relatively small variations occur within a single 2 min recording condition, though some individuals showed wide variations which were associated with global power changes, and in particular with increases in alpha.
Whilst these data have established the reliability and stability of EEG recordings, there has been no previous report on the spatio-temporal reliability of EEG recordings. Spatio-temporal reliability may be defined as the extent to which the distribution of EEG amplitude over the scalp at different electrode sites remains constant over time given the same recording conditions. If spatio-temporal reliability is high, then a regional difference in EEG power during a given set of recording conditions should be repeatable, providing that the recording conditions are adequately reproduced.
The simplest way to measure spatio-temporal reliability is to calculate the correlation between amplitude recorded at different times, but under the same conditions, across all scalp electrode sites. In a similar way, the internal consistency, or spatio-temporal stability, may be estimated by, for example, calculating the split-half reliability or Cronbach's α. Although this study addresses only the issue of reliability of EEG amplitude, the approach could also be used with other EEG parameters such as relative power.
Spatio-temporal reliability is of importance in determining whether changes in the scalp distribution of
EEG parameters are due to genuine changes or to random fluctuations. If the spatial distribution of EEG were unreliable, then attempts to localise EEG changes by experimental manipulation, whether through cognitive activation or pharmacological intervention, would not be possible as any change seen might be due to the instability of the EEG signal.
Spatio-temporal reliability might be expected to vary according to the activation task used for recording. An uncontrolled task, such as "eyes open" or "eyes closed," might be expected to show poorer spatio-temporal reliability than a task involving controlled, repetitive activity.
The aim of this study is to determine: (a) whether the spatial distribution of EEG amplitude is stable over time and (b) whether different recording conditions show internally consistent topography.
Method and materials
Subjects
Twenty-four healthy right-handed subjects were recruited from staff and associates of Charing Cross Hospital. The subjects included 12 men and 12 women with an average age of 25.9 (range 18-39). Subjects with a history of psychiatric illness or central nervous system disease were excluded. Subjects were included in this study if at least 20 sec of relatively artifact-free EEG recordings were obtained in both the activation conditions and at least one of the baseline conditions. Twenty seconds of relatively artifact-free EEG was selected as it was judged that this was the minimum length of time which would give reliable estimates of mean amplitude at each electrode site.
Equipment
Recordings were made from 28 scalp sites with a Neuroscience Brain Imager. Electrodes were placed using an electrode cap according to the international 10-20 system plus an additional 8 electrodes: FTC1 and FTC2, equidistant between F7 and C3 and F8 and C4 respectively; TCP1 and TCP2, equidistant between C3 and T5 and C4 and T6 respectively; CP1 and CP2, equidistant between Cz and P3 and Cz and P4 respectively; PO1 and PO2, equidistant between O1 and Pz and O2 and Pz respectively. Signal bandpass was 0.15-40.0 Hz and the digital sampling interval per channel was 2 msec. Reference was to linked ears.
Procedure
Subjects were seated in a quiet, dimly lit room and instructed to relax, avoid movement and to keep alert. In each condition, recordings continued until a minimum of 24 epochs (i.e., 62 sec) of relatively artifact-free EEG had been collected as judged at the time of recording. Each session commenced with a baseline
condition, eyes open (EO1), and was followed by two activation conditions and a final eyes open baseline condition (EO2). The time delay between EO1 and EO2 was typically in the range of 40 min to 1 h.
The baseline conditions involved the subjects sitting with their eyes open with their attention directed to a cross on the wall directly in front of them. Eyes open was used as a baseline condition because many of the cognitive tasks planned for future investigation require the subject to respond to visually presented stimuli.
The activation conditions involved the subjects performing the Luria Finger Apposition Task (Luria 1980) with either the left or the right hand with their eyes open. The order of hands was decided randomly. The Luria Finger Apposition Task involves the subject touching each of the fingers of one hand with the thumb of the same hand in turn, starting with the index finger and ending with the little finger. Subjects were instructed to repeat the task as rapidly as possible, and the rate at which the task was performed and the number of errors made were recorded.
Data analysis
The recorded data were edited after recording to allow more careful artifact rejection. This included exclusion of epochs which showed evidence of drowsiness, eye movement or muscle artifact. Frequency analysis of the remaining epochs was performed using a Fast Fourier Transform which calculated the mean amplitude (i.e., square root of the power) for each epoch in 5 frequency bands: delta (0.5-3 Hz), theta (4-7 Hz), alpha (8-12 Hz), beta 1 (13-16 Hz) and beta 2 (17-30 Hz). The resulting data were transferred to an IBM compatible computer for further analysis.
At least 20 sec of relatively artifact-free EEG was recorded and analysed in all 4 conditions for 16 subjects. After exclusions due to insufficient relatively artifact-free epochs, the sample sizes were 21 for both Luria conditions, 20 for EO1, 17 for EO2 and 16 for the EO1-EO2 comparisons.
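The band-amplitude step described above can be illustrated with a short sketch. It is not the Brain Imager's actual algorithm, which the text does not specify: the function names, the direct DFT (used here only to avoid external libraries), the scaling convention and the 128 Hz sampling rate of the synthetic epoch are all assumptions made for illustration. Only the band edges are taken from the text.

```python
import cmath
import math

# Band edges (Hz) as listed in the text.
BANDS = {
    "delta": (0.5, 3.0),
    "theta": (4.0, 7.0),
    "alpha": (8.0, 12.0),
    "beta1": (13.0, 16.0),
    "beta2": (17.0, 30.0),
}


def amplitude_spectrum(epoch, fs):
    """One-sided amplitude spectrum of a single epoch via a direct DFT."""
    n = len(epoch)
    freqs, amps = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(epoch))
        # Interior bins are doubled to account for the discarded negative frequencies.
        scale = 1.0 if k == 0 or (n % 2 == 0 and k == n // 2) else 2.0
        freqs.append(k * fs / n)
        amps.append(scale * abs(coeff) / n)
    return freqs, amps


def band_amplitudes(epoch, fs):
    """Square root of the summed squared spectral amplitude within each band."""
    freqs, amps = amplitude_spectrum(epoch, fs)
    result = {}
    for name, (lo, hi) in BANDS.items():
        power = sum(a * a for f, a in zip(freqs, amps) if lo <= f <= hi)
        result[name] = math.sqrt(power)
    return result


# Synthetic check: a 10 Hz sinusoid of amplitude 3 should land in the alpha band.
fs = 128.0  # hypothetical sampling rate, chosen so that 10 Hz is an exact DFT bin
epoch = [3.0 * math.sin(2 * math.pi * 10.0 * i / fs) for i in range(256)]
amps = band_amplitudes(epoch, fs)
```

In this toy case essentially all of the amplitude appears in `amps["alpha"]`, and the other bands are close to zero. Real recordings would need windowing and averaging over epochs, which the sketch omits.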
Results
(i) Internal consistency: spatial distribution of amplitude
The internal consistency of each condition for each subject was calculated using Cronbach's α. This measure may be thought of as giving the average of all possible split-half reliabilities (Streiner and Norman 1989) and is calculated by determining the proportion of variance that is not due to differences between epochs. That is, a high α suggests that there is a high degree of spatio-temporal consistency between different epochs.
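As a rough illustration of this measure, the sketch below applies the standard Cronbach formula, treating epochs as the "items" and electrodes as the "cases". That mapping is an assumption inferred from the description above, and the function name and toy data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(data):
    """Cronbach's alpha for data[e][k]: amplitude at electrode e in epoch k.

    Epochs are treated as items and electrodes as cases (an assumption).
    """
    n_epochs = len(data[0])
    # Variance across electrodes within each epoch.
    item_vars = [variance([row[k] for row in data]) for k in range(n_epochs)]
    # Variance across electrodes of the amplitude summed over epochs.
    total_var = variance([sum(row) for row in data])
    return n_epochs / (n_epochs - 1) * (1 - sum(item_vars) / total_var)


# Identical topography in every epoch: perfectly consistent.
alpha_consistent = cronbach_alpha([[1, 1, 1], [2, 2, 2], [4, 4, 4], [3, 3, 3]])

# Topography partly reshuffled between two epochs: lower consistency.
alpha_noisy = cronbach_alpha([[1, 2], [2, 1], [3, 4], [4, 3]])
```

With these toy inputs the first call gives 1.0 and the second gives 0.75, showing how epoch-to-epoch changes in the scalp distribution pull the coefficient down.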
TABLE I
Showing the mean and S.D. of Cronbach's α averaged across all electrodes for each individual for each condition and each frequency band.
Task           Delta           Theta           Alpha           Beta 1          Beta 2
               Mean   S.D.     Mean   S.D.     Mean   S.D.     Mean   S.D.     Mean   S.D.
Eyes open 1    0.92   0.046    0.94   0.042    0.94   0.047    0.92   0.063    0.97   0.020
Luria left     0.90   0.078    0.94   0.038    0.90   0.079    0.89   0.086    0.97   0.022
Luria right    0.89   0.095    0.94   0.051    0.89   0.096    0.91   0.062    0.97   0.024
Eyes open 2    0.92   0.067    0.95   0.034    0.96   0.029    0.93   0.046    0.97   0.018
TABLE II
Showing the standard error of measurement (S.E.M.) as an absolute value (μV) and as a percentage of the mean amplitude averaged across all electrodes for each individual, each condition and each frequency band.
Task           Delta           Theta           Alpha           Beta 1          Beta 2
               S.E.M.  %       S.E.M.  %       S.E.M.  %       S.E.M.  %       S.E.M.  %
Eyes open 1    1.1     5.3     0.5     5.1     0.7     6.4     0.3     5.4     0.2     3.7
Luria left     1.1     5.4     0.4     5.1     0.5     6.4     0.3     6.2     0.2     4.0
Luria right    1.1     5.1     0.4     5.1     0.4     6.1     0.3     5.6     0.2     4.0
Eyes open 2    0.8     4.5     0.5     5.1     0.7     5.8     0.2     3.8     0.5     4.9
The mean and standard deviation of the internal reliability for each condition and each frequency band, as measured by Cronbach's α, are given in Table I. The reliabilities are generally high, with most greater than 0.8 in each condition, suggesting a high degree of internal consistency.
(ii) Internal consistency: amplitude at each electrode
For each individual, for each condition, the standard error of measurement for each electrode was calculated as an absolute value (S.E.M. μV) and as a percentage of the mean amplitude (S.E.M.%). The S.E.M. is the standard deviation of the estimate of the mean and may be used to calculate the confidence intervals within which the true mean is likely to be found. The coefficient of variation (CV; i.e., the standard deviation divided by the mean) was also calculated to give a measure of variability which, unlike the S.E.M., is not dependent upon sample size.
The mean standard errors of measurement (S.E.M.) averaged across all electrodes for each individual, each frequency band and each condition are presented in Table II. None of the mean S.E.M.% exceeded 7%. In each case the highest mean S.E.M.% was in the alpha frequency band.
The mean CV for each frequency band and each task is given in Table III. The average CV values were 0.21, 0.21, 0.26, 0.23 and 0.16 for delta, theta, alpha, beta 1 and beta 2 respectively. The CV did not vary substantially between conditions but there was some variation across frequency bands. The lowest values were consistently in beta 2 and the highest in alpha. These CV values, with the exception of beta 2, are higher than those reported by Oken and Chiappa (1988) for recording with eyes closed and are consistent with John et al.'s (1983) report that there is a greater degree of variability in the eyes open condition.
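A minimal sketch of the two variability measures for a single electrode is given below. It assumes the S.E.M. is computed as the sample standard deviation of the epoch amplitudes divided by the square root of the number of epochs, which matches the description "standard deviation of the estimate of the mean". The function name and the example values are hypothetical.

```python
import math
from statistics import mean, stdev


def electrode_variability(epoch_amplitudes):
    """S.E.M. (absolute and as % of the mean) and CV for one electrode."""
    n = len(epoch_amplitudes)
    m = mean(epoch_amplitudes)
    sd = stdev(epoch_amplitudes)  # sample SD (n - 1 denominator)
    sem = sd / math.sqrt(n)       # shrinks as more epochs are collected
    return {
        "sem": sem,
        "sem_percent": 100.0 * sem / m,
        "cv": sd / m,             # does not depend on the number of epochs
    }


# Four made-up epoch amplitudes (in microvolts) at one electrode.
stats = electrode_variability([8.0, 10.0, 12.0, 10.0])
```

For these values the S.E.M. is about 0.82 μV (8.2% of the mean) and the CV is about 0.16. Collecting more epochs with the same spread would reduce the S.E.M. but leave the CV roughly unchanged.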
TABLE III
Showing the coefficient of variation (CV) averaged across all individuals for each condition and each frequency band.
Task           Delta   Theta   Alpha   Beta 1   Beta 2
Eyes open 1    0.22    0.22    0.27    0.22     0.15
Luria left     0.21    0.19    0.24    0.23     0.16
Luria right    0.21    0.20    0.23    0.22     0.16
Eyes open 2    0.21    0.24    0.28    0.23     0.18
(iii) Test-retest reliability: spatial distribution of amplitude
The spatio-temporal reliability was estimated by calculating the Pearson Product-Moment Correlation Coefficient between amplitudes at each electrode during conditions EO1 and EO2 for each individual. The rationale for this procedure is that if the spatial distribution of amplitude for an individual is reliable then the amplitude at each electrode should be similar on both occasions and, therefore, there should be a high correlation. The spatio-temporal reliability was calculated for each individual and for the group as a whole.
The spatio-temporal reliability coefficients between EO1 and EO2 for each individual are given in Table V. The mean scores for the spatio-temporal reliability in each band are above 0.70 with the exception of delta (mean reliability = 0.67), suggesting generally satisfactory reliability of the spatial distribution of amplitude. However, spatio-temporal reliability for most individuals is quite poor when all frequency bands are considered. Only 2 subjects (nos. 12 and 15) show reliabilities greater than 0.8 in all frequency bands and, even if delta is excluded, only 5 subjects meet this criterion. This suggests that for most individual subjects, 20 sec of EEG is not sufficient to produce a reliable spatial distribution of amplitude. There was a significant correlation between the number of epochs for each subject and the size of the spatio-temporal reliability in alpha (r = 0.50, P < 0.025) but for no other band, suggesting that longer samples of EEG would give higher reliabilities, at least in alpha.
The spatio-temporal reliability of the group as a whole is, as would be expected, much higher than for any single individual. The reliabilities are 0.90, 0.95, 0.99, 0.99 and 0.88 for delta, theta, alpha, beta 1 and beta 2 respectively.
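The correlation procedure for one frequency band can be sketched as follows. The helper names and the toy amplitude maps (2 subjects, 4 electrodes) are hypothetical. Averaging the maps across subjects before correlating is an assumption about how the group coefficient was obtained, since the text does not spell this out.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def group_map(maps):
    """Average amplitude at each electrode across subjects (maps[s][e])."""
    n_subjects = len(maps)
    return [sum(m[e] for m in maps) / n_subjects for e in range(len(maps[0]))]


# Toy amplitude maps for one band: rows are subjects, columns are electrodes.
eo1 = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 5.0]]
eo2 = [[2.0, 1.0, 4.0, 3.0], [3.0, 4.0, 1.0, 6.0]]

# Spatio-temporal reliability per individual: correlate across electrodes.
individual_r = [pearson_r(a, b) for a, b in zip(eo1, eo2)]

# Group reliability: correlate the subject-averaged maps.
group_r = pearson_r(group_map(eo1), group_map(eo2))
```

In this contrived example the individual coefficients are 0.60 and about 0.87, whereas the group coefficient is 1.0, because the subject-level fluctuations cancel in the average. The toy data were chosen to make that cancellation exact; real data would show it only partially.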
TABLE IV
Showing the test-retest reliability for each electrode in each frequency band.
Electrode   Delta   Theta   Alpha   Beta 1   Beta 2
Fp1         0.59    0.88    0.79    0.90     0.83
Fz          0.75    0.86    0.79    0.91     0.94
Cz          0.72    0.85    0.92    0.93     0.87
Pz          0.72    0.81    0.95    0.91     0.91
Oz          0.70    0.85    0.91    0.93     0.85
F3          0.64    0.86    0.81    0.95     0.97
C3          0.72    0.87    0.92    0.96     0.92
P3          0.82    0.83    0.94    0.92     0.88
O1          0.76    0.84    0.86    0.88     0.74
F7          0.73    0.85    0.79    0.88     0.92
T3          0.88    0.90    0.93    0.88     0.90
T5          0.56    0.91    0.91    0.87     0.76
FTC1        0.85    0.88    0.94    0.90     0.85
TCP1        0.73    0.82    0.95    0.93     0.91
CP1         0.38    0.78    0.91    0.80     0.84
PO1         0.71    0.82    0.94    0.92     0.87
Fp2         0.65    0.88    0.78    0.87     0.85
F4          0.76    0.91    0.82    0.90     0.92
C4          0.34    0.85    0.95    0.88     0.92
P4          0.80    0.85    0.96    0.92     0.91
O2          0.61    0.83    0.89    0.91     0.89
F8          0.72    0.88    0.81    0.87     0.90
T4          0.53    0.90    0.93    0.67     0.42
T6          0.87    0.92    0.94    0.90     0.91
FTC2        0.74    0.86    0.92    0.85     0.89
TCP2        0.76    0.94    0.98    0.85     0.84
CP2         0.47    0.75    0.95    0.87     0.85
PO2         0.52    0.77    0.94    0.83     0.78
Mean        0.70    0.86    0.91    0.90     0.88
TABLE V
Showing the topographical reliability for each subject for each frequency band.
Subject no.   Delta   Theta   Alpha   Beta 1   Beta 2
1             0.55    0.83    0.97    0.73     0.77
2             0.72    0.92    0.97    0.88     0.93
3             0.45    0.87    0.78    0.64     0.75
4             0.63    0.66    0.52    0.65     0.93
5             0.42    0.56    0.97    0.57     0.46
6             0.66    0.91    0.95    0.88     0.97
7             0.93    0.93    0.99    0.93     0.50
8             0.46    0.75    0.97    0.89     0.93
9             0.65    0.97    0.93    0.85     0.79
10            0.62    0.97    0.99    0.92     0.88
11            0.96    0.97    0.79    0.71     0.77
12            0.86    0.91    0.85    0.94     0.88
13            0.51    0.44    0.81    0.54     0.37
14            0.68    0.63    0.72    0.38     0.54
15            0.92    0.96    0.99    0.92     0.89
16            0.72    0.75    0.61    0.75     0.60
Mean          0.67    0.81    0.86    0.76     0.75
(iv) Test-retest reliability: amplitude at each electrode
The difference between mean amplitude values during EO1 and EO2 at each electrode for each individual was calculated as a percentage of the mean amplitude of the two conditions. The mean percentage differences, averaged across all electrodes for each individual, were 12.6%, 7.7%, 10.9%, 4.4% and 5.4% for the delta, theta, alpha, beta 1 and beta 2 bands respectively. Three individuals showed mean percentage differences greater than 20%, one in delta and two in alpha. The mean percentage differences for delta, theta and alpha are comparable to the median percentage difference reported by Salinsky et al. (1991) for recording with eyes closed, whilst the beta percentage differences are somewhat lower.
Paired t tests were also performed between electrode amplitudes at EO1 and EO2 for each individual in order to determine whether the differences were statistically significant. With 28 electrodes, 5 frequency bands and 16 subjects, a total of 2240 t tests were performed. In all frequency bands, more statistically significant results were found than would be expected by chance. The percentages of significant t tests at P < 0.01 were 25%, 14%, 19%, 14% and 33% for the delta, theta, alpha, beta 1 and beta 2 frequency bands respectively.
The test-retest reliability of electrode amplitudes for the whole group was calculated for EO1 and EO2 using Pearson's Product-Moment Correlation Coefficient. The test-retest reliability of each electrode is reported in Table IV. The mean values across all electrodes were 0.70, 0.86, 0.91, 0.90 and 0.88 for delta, theta, alpha, beta 1 and beta 2 respectively. As has previously been noted (Salinsky et al. 1991; Pollock et al. 1991), delta shows the lowest levels of reliability. Excluding delta, only one electrode, T4 in the beta bands, showed test-retest reliability of less than 0.7, which most probably reflects some residual muscle activity.
The electrode reliabilities reported here are of similar magnitude to those reported elsewhere (Salinsky et al. 1991). Salinsky et al. (1991) reported
reliabilities using the Spearman correlation coefficient of 0.95, 0.90, 0.91, 0.95 and 0.95 for delta, theta, alpha, beta 1 and beta 2 respectively, for eyes closed with a 5 min interval and slightly lower reliabilities over a 12-16 week interval. Both Gasser et al. (1985) and Pollock et al. (1991) reported somewhat lower reliabilities, albeit over a much longer test-retest interval. This study gives further support to the reliability of quantitative EEG, at least over short periods of time.
Paired t tests were calculated to determine the significance of differences between the group mean amplitudes at each electrode site at EO1 and EO2. For theta, beta 1 and beta 2 there were no statistically significant differences at P < 0.01 and in the case of delta, only two were seen (FTC1 and TCP2). For alpha, 5 electrodes were statistically significantly different at P < 0.01 which, assuming the electrodes are independent, is greater than would be expected by chance. For all bands except delta, there was a strong tendency for the amplitudes to be higher in the second "eyes open" condition with 89% of the electrodes showing greater amplitudes in EO2. For delta, the position was reversed and all but one electrode showed higher delta amplitudes in EO1.
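Two of the calculations in this section can be checked with a short sketch: the percentage difference between two conditions, and the number of significant tests expected by chance. The function names are hypothetical, taking the absolute value of the difference is an assumption, and the binomial calculation relies on the same independence assumption that the text itself makes.

```python
from math import comb


def percent_difference(a, b):
    """Absolute difference as a percentage of the mean of the two values."""
    return 100.0 * abs(a - b) / ((a + b) / 2.0)


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), assuming independent tests."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k, n + 1))


# Example: amplitudes of 10 and 12 (arbitrary units) in the two conditions.
example_diff = percent_difference(10.0, 12.0)

# Individual-level tests: 28 electrodes x 5 bands x 16 subjects at P < 0.01.
n_tests = 28 * 5 * 16
expected_by_chance = n_tests * 0.01

# Group-level alpha result: probability of 5 or more of 28 electrodes
# reaching P < 0.01 if every null hypothesis were true.
p_five_or_more = binomial_tail(28, 5, 0.01)
```

About 22 of the 2240 individual tests (1%) would be expected to reach P < 0.01 by chance alone, compared with the 14-33% reported. Under independence, 5 or more significant electrodes out of 28 would occur with probability well below 0.001. Because neighbouring electrodes are correlated in practice, that figure should be read as a rough lower bound.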
Discussion
Spatio-temporal consistency within conditions for individual subjects was above 0.8 for all bandwidths and was similar for the various conditions. On the other hand, the spatio-temporal retest reliability between the two eyes open conditions was less satisfactory. Whilst many subjects showed good spatio-temporal reliability on at least some frequency bands, few were adequately reliable across the spectrum. Further, there were large numbers of statistically significant differences between electrode amplitudes in each individual case, which showed that EEG amplitude was not stationary (i.e., the mean varied over time). The fact that the baseline condition (i.e., eyes open) is cognitively unstructured may in part be responsible and it will be of interest in the future to investigate cognitive tasks for retest reliability. Longer recordings in each condition or statistical co-variation for global amplitude changes may also help to overcome these difficulties.
Spatio-temporal reliability for the group as a whole, in contrast, was more than acceptable with reliabilities of 0.90 and above for all but delta. This supports the view that localisation of EEG differences under experimental manipulation may be viable for group comparisons.
Turning to amplitude reliability, the internal consistency was similar for the various conditions, and coefficient of variation values were similar to those reported by others, as were bandwidth differences (John et al. 1983; Oken and Chiappa 1988). Within-individual test-retest amplitude reliability for the eyes open conditions, for frequency bands averaged across electrodes, was consistent with the report of Salinsky et al. (1991). When individual electrodes were examined, the proportion of statistically significant differences seen was much higher than the 8% reported by Salinsky et al. (1991) for a test-retest interval of 5 min using the Wilcoxon test, and closer to the 25% they reported in absolute power after 12-16 weeks.
Group retest amplitude reliability of the EEG was broadly comparable with previous reports and confirms that EEG amplitude measures are stable both within conditions and over time. Not only were the spatio-temporal reliabilities good, there were few differences in amplitude over time except in alpha.
Overall, this study gives further support to the idea that EEG is reliable and stable, not only in terms of electrode amplitude, but in terms of the spatial distribution of EEG amplitude across the scalp. However, the spatio-temporal reliability is not sufficiently high, at least using the methods reported in this study, to make topographical comparisons between different cognitive conditions viable for an individual. In contrast, the high spatio-temporal reliability for the group as a whole suggests that group comparisons could be used in this context.
References
Gasser, T., Bächer, P. and Steinberg, H. Test-retest reliability of spectral parameters of the EEG. Electroenceph. clin. Neurophysiol., 1985, 60: 312-319.
John, E.R., Prichep, L., Ahn, H., Easton, P., Fridman, J. and Kaye, H. Neurometric evaluation of cognitive dysfunction and neurological disorders in children. Prog. Neurobiol., 1983, 21: 239-290.
Luria, A.R. Higher Cortical Functions in Man (2nd Edition). Basic Books, New York, 1980.
Oken, B.S. and Chiappa, K.H. Short term variability in EEG frequency analysis. Electroenceph. clin. Neurophysiol., 1988, 69: 191-198.
Pollock, V.E., Schneider, L.S. and Lyness, S.A. Reliability of topographic quantitative EEG amplitude in healthy late-middle-aged and elderly subjects. Electroenceph. clin. Neurophysiol., 1991, 79: 20-26.
Salinsky, M.C., Oken, B.S. and Morehead, L. Test-retest reliability in EEG frequency analysis. Electroenceph. clin. Neurophysiol., 1991, 79: 382-392.
Streiner, D.L. and Norman, G.R. Health Measurement Scales: a Practical Guide to their Development and Use. Oxford University Press, Oxford, 1989.