Optimal smoothing of coherence estimates


Electroencephalography and clinical Neurophysiology, 1991, 80:194-200 © 1991 Elsevier Scientific Publishers Ireland, Ltd. 0168-5597/91/$03.50 ADONIS 016855979100076B


Robert A. Dobie ¹ and Michael J. Wilson

Otolaryngology-Head and Neck Surgery, and Virginia Merrill Bloedel Hearing Research Center, University of Washington, Seattle, WA (U.S.A.)

(Accepted for publication: 23 July 1990)

Summary

We have previously described the usefulness of the magnitude-squared coherence function in the analysis of auditory evoked potentials (AEPs) (Ear Hear., 1989, 10: 2-13). For each frequency of interest, the coherence value can be compared to a 'critical value' to determine whether a response is present. Coherence functions must be smoothed across either multiple subaverages or adjacent frequencies (or both) to be reliable, but there are trade-offs: increasing the degree of smoothing increases computational time and (in the case of frequency smoothing) reduces spectral resolution. Using AEPs to clicks and amplitude-modulated tones, we investigated the effects of variable degrees of smoothing on threshold estimates in 10 normal human subjects. Thresholds were lower for coherence estimates than for visual detection, and were also lower for longer data collection periods. However, there appears to be little if any advantage to segment smoothing beyond 8-16 subaverages. The optimal degree of frequency smoothing is more difficult to specify, since it depends on the spectrum of the AEP being analyzed (especially the rate of change of phase) and on the spectral resolution of the analysis system.

Key words: Auditory evoked potentials; Coherence

¹ This research was supported by a grant from NIDCD (No. NS23116). A preliminary version of this paper was presented at the 13th Midwinter Research Meeting of the Association for Research in Otolaryngology, St. Petersburg, FL, February 4-8, 1990.

Correspondence to: Robert A. Dobie, M.D., Otolaryngology-Head and Neck Surgery, University of Washington, RL-30, Seattle, WA 98195 (U.S.A.).

Magnitude-squared coherence is a measure of the proportion of power (in a system's output) which is attributable to a particular input, as a function of frequency, and is defined as follows:

$$\gamma_{xy}^{2}(f) = \frac{|G_{xy}(f)|^{2}}{G_{xx}(f)\,G_{yy}(f)} \tag{1}$$

where $\gamma_{xy}^{2}(f)$ is the magnitude-squared coherence function relating input x to output y, $G_{xy}(f)$ is the cross-spectral density function relating x and y, and $G_{xx}(f)$ and $G_{yy}(f)$ are the power spectral density functions of x and y respectively. For periodic inputs, coherence can be calculated very easily using only output data; for each frequency, coherence is equal to the power of the grand (ensemble) average divided by the average power in the subaverages. Details regarding computation can be found in equations (10)-(16) of Dobie and Wilson (1989). Coherence ranges from 0 to 1 and is closely related to the signal-to-noise power ratio in the system output; specifically,

$$\gamma^{2}(f) = \frac{\mathrm{SNR}^{2}(f)}{\mathrm{SNR}^{2}(f) + 1} \tag{2}$$
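The output-only calculation described above for periodic inputs (power of the grand average divided by the average power in the subaverages) can be sketched as follows. This is a minimal illustration rather than the authors' software: the function name `msc_from_subaverages`, the simulated 40 Hz sinusoid, the noise level, and the 256-point, 50 msec sweep are all assumptions chosen for the example.

```python
import numpy as np

def msc_from_subaverages(subaverages):
    """Coherence per FFT bin: grand-average power / mean subaverage power."""
    spectra = np.fft.rfft(np.asarray(subaverages, dtype=float), axis=1)
    # The spectrum of the grand average is the mean of the subaverage spectra.
    grand_power = np.abs(spectra.mean(axis=0)) ** 2
    mean_power = (np.abs(spectra) ** 2).mean(axis=0)
    out = np.zeros_like(grand_power)
    np.divide(grand_power, mean_power, out=out, where=mean_power > 0)
    return out

# Hypothetical demo data: q = 16 subaverages of a 50 msec sweep (256 points),
# so bin k lies at k * 20 Hz and a 40 Hz response falls in bin 2.
rng = np.random.default_rng(0)
q, n_points = 16, 256
t = np.arange(n_points) / n_points * 0.050
response = 0.5 * np.sin(2 * np.pi * 40.0 * t)
subaverages = response + rng.standard_normal((q, n_points))

msc = msc_from_subaverages(subaverages)
print("coherence at 40 Hz:", round(float(msc[2]), 3))
# With a single subaverage the estimate is identically 1 at every frequency.
print("single-subaverage coherence at 40 Hz:", msc_from_subaverages(subaverages[:1])[2])
```

The last line reproduces the degenerate case in which only one output wave form is available, which is why some form of smoothing is needed before the estimate carries any information.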

Coherence functions are usually estimated beginning with time-domain measures of system input and output, but if only a single pair of time series is available (an input and an output wave form), estimated coherence for each frequency will always be 1. This spurious result is analogous to calculating a correlation coefficient from a single pair of observations. Useful coherence estimates must be smoothed in one or both of two ways. Segment smoothing requires that multiple output wave forms be collected; in applications such as evoked potentials where response wave forms are typically averaged, this is accomplished by breaking up the total data set into multiple subaverages. Frequency smoothing combines spectral estimates from bands of adjacent frequencies prior to calculation of coherence. The greater the degree of smoothing (the product of the number of segments, q, and the number of frequencies, l), the greater the accuracy of the coherence estimates will be.

We have previously discussed the calculation and statistical distribution of coherence estimates, as well as the application of this method to evoked potential studies (Dobie and Wilson 1989; Tucci et al. 1990). The main use of coherence for evoked potentials, or any other type of evoked response (e.g., otoacoustic emissions), is the detection of non-zero coherence for frequencies of interest. For a given degree of smoothing, critical values can be specified for desired confidence levels; for example, when true coherence = 0 (no signal, only noise), 99% of coherence estimates smoothed with ql = 16 will be less than 0.266. Amos and Koopmans (1963) and Carter et al. (1973) have offered theoretically based distributions of coherence estimates. In an earlier paper (Dobie and Wilson 1989), we empirically tested these against coherence estimates obtained under no-stimulus conditions (true coherence = 0). The distributions of Amos and Koopmans were found to match the empirical data quite well; those of Carter et al., based on an assumption of normal distribution, were slightly too liberal, i.e., their use would lead to more type I statistical errors than expected. At that time, we were unaware of the work of Brillinger (1978), who gave a very simple expression for coherence critical value, which agrees (within < 1%) with Amos and Koopmans' tables:

$$\text{critical value}_{\alpha} = 1 - (1 - \alpha)^{1/(q-1)} \tag{3}$$
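As a worked check of equation (3), the short sketch below evaluates Brillinger's expression with α taken as the confidence level (0.99), which is the reading that reproduces the tabled value of about 0.266 for 16 degrees of smoothing. The function name and the optional `l` argument (which applies the ql product in place of q) are illustrative assumptions, not part of the original software.

```python
def coherence_critical_value(confidence, q, l=1):
    """Equation (3), with the smoothing product q*l used in place of q."""
    ql = q * l
    if ql < 2:
        raise ValueError("at least two degrees of smoothing are required")
    return 1.0 - (1.0 - confidence) ** (1.0 / (ql - 1))

# Segment smoothing only (q = 16), then q = 16 combined with l = 4.
for q, l in ((16, 1), (16, 4)):
    cv = coherence_critical_value(0.99, q, l)
    print(f"q = {q}, l = {l}: critical value = {cv:.4f}")
```

For q = 16 this gives a value within 1% of the 0.266 quoted from the Amos and Koopmans tables.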

Estimates above this critical value can be accepted as indicating genuine system response. In Dobie and Wilson (1989), we also showed empirically that frequency smoothing had the same effect on distributions of coherence estimates as an equivalent amount of segment smoothing, i.e., the critical value was dependent on the ql product. This agrees with Bendat and Piersol (1986), who base their calculations of confidence intervals for coherence on the ql product. Thus, we propose that it is simple and appropriate to use Brillinger's equation (above), with ql substituted for q, to obtain critical values for coherence estimates.

For these studies, we define 'optimal smoothing' as the degree of smoothing (q and/or l) which yields the lowest response threshold for a fixed data collection time. Threshold is the lowest stimulus intensity for which a statistically significant coherence estimate is obtained. For both segment and frequency smoothing, there are trade-offs which make it difficult to specify the optimal degrees of smoothing on theoretical grounds. In the case of segment smoothing, the greater the number of subaverages, the greater the accuracy of estimation will be, and the critical values will decline. However, this desirable outcome is balanced by the fact that, for a fixed total number of responses (or data collection time), each subaverage will contain fewer responses and thus a lower signal-to-noise ratio and a lower coherence value. All other things being equal, a lesser degree of segment smoothing would be preferable, since computational time increases with q.

Increased frequency smoothing also increases accuracy and lowers critical values, but at the expense of frequency resolution. Obviously, for a very narrow-band evoked potential, such as the 40 Hz steady-state response, smoothing across multiple frequencies would be detrimental; the bandwidth of the evoked response being measured sets an upper limit on the smoothing bandwidth.

Frequency smoothing also introduces a potential problem of phase smearing. The numerator of the complex coherence function (see Dobie and Wilson 1989) is the cross-spectral density function. When cross-spectral estimates from a group of adjacent frequencies are vectorially added, the magnitude of the resulting vector will approach zero if the phases are widely dispersed. An evoked potential peak at zero time will have energy within a certain frequency band, with all phases in that band tightly clustered. However, a delay (latency) in the time domain introduces a slope to the phase-frequency function which is proportional to the delay. If one knows the time delay (latency) of the response peak to be identified, one can correct this problem by a compensatory time shift to place the peak at zero latency. However, in a real-world evoked potential setting, this solution is at best partial, since we can only approximate the expected latency.

In speaking of smoothing across groups of l frequencies, we have borrowed the terminology of Bendat and Piersol (1986) for conventional frequency smoothing prior to calculating coherence. Alternatively, we can (post hoc) simply average coherence estimates for groups of 'm' adjacent frequencies, avoiding the phase-smear problem, since coherence estimates are real (not complex) numbers. While 'l-smoothing' reduces the variance of coherence estimates by a factor of l², when true coherence = 0 (Carter et al. 1973), we would expect 'm-smoothing' to reduce variance by a factor of only m, since the variance of a mean is smaller than the variance of a point estimate by a factor equal to the number of estimates averaged.
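The phase-smearing argument can be illustrated numerically. The sketch below vector-averages l unit-magnitude cross-spectral estimates whose phases follow the linear slope produced by a pure delay, and reports the magnitude that survives. It is an idealized illustration only: the function name is hypothetical, noise is ignored, and the bin spacing is assumed to be 1/20.48 msec (about 48.8 Hz), with delays of 1.28, 2.56 and 5.12 msec, borrowed from the simulation described in Methods.

```python
import numpy as np

def smoothed_magnitude(delay_s, l, bin_spacing_hz):
    """Magnitude of the vector mean of l adjacent unit phasors under a pure delay."""
    k = np.arange(l)
    # A delay of tau seconds adds a phase of -2*pi*f*tau at frequency f, so
    # adjacent bins differ in phase by 2*pi*bin_spacing*tau.
    phases = -2.0 * np.pi * k * bin_spacing_hz * delay_s
    return float(np.abs(np.exp(1j * phases).mean()))

bin_spacing_hz = 1.0 / 0.02048  # assumed analysis window of 20.48 msec
for l in (4, 16):
    for delay_ms in (0.0, 1.28, 2.56, 5.12):
        mag = smoothed_magnitude(delay_ms * 1e-3, l, bin_spacing_hz)
        print(f"l = {l:2d}, delay = {delay_ms:4.2f} msec: magnitude = {mag:.3f}")
```

Under these assumptions the phasors span a full cycle, and so cancel, when the product of l, the bin spacing and the delay is a whole number. That happens at 5.12 msec for l = 4 and already at 1.28 msec for l = 16.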
In this paper, we report the results of experiments designed to: (1) determine the optimal number of subaverages (q) for estimating response threshold to the 40 Hz amplitude-modulated following response (AMFR), in normal human adult subjects; (2) simulate the 'phase-smear' problem which makes 'l-smoothing' difficult when a response is delayed in time; (3) demonstrate phase-smearing in coherence functions for auditory brain-stem responses, as well as the efficacy of introducing a compensatory time shift; and (4) empirically test our theoretical predictions of critical values for 'm-smoothed' coherence functions.

Methods

Subjects

Subjects were young adults (21-33 years old) with normal hearing (15 dB HL or better at octave frequencies from 250-8000 Hz, ANSI 1970). Experiment 1 used 10 subjects; one subject provided the demonstration data in experiment 3, while another provided the 'no-stimulus' data in experiment 4.

Stimuli

Clicks and amplitude-modulated tones were digitally synthesized (for details, see Wilson and Dobie 1987). The clicks were produced by outputting positive pulses at the beginning of each stimulus cycle, while AM tones (500 Hz carrier, 100% amplitude-modulated at 40 Hz) were synthesized by adding carrier and sideband sinusoids (see Dobie and Wilson 1989). Evoked potentials were analyzed with a 256-point FFT (fundamental = 20 Hz, fmax = 2540 Hz). The simulation of experiment 2 used an octave-band pseudorandom noise synthesized in the frequency domain by addition of 32 sinusoids of equal amplitude with frequencies between 1611 and 3125 Hz; these were the 33rd through 64th harmonics of a fundamental frequency of 48.8 Hz.

Recording and analysis

Coherence was calculated using equations listed in Dobie and Wilson (1989). Since the stimuli were all periodic, equation (16) was used in most cases (coherence = ratio of ensemble average power to average power in the subaverages). However, when frequency smoothing was carried out prior to coherence computation, equations (10)-(14) were used, requiring complex averaging of cross-spectral estimates. Critical values were obtained from equation (3) in this paper (after Brillinger, with ql substituted for q), except in the case of post hoc frequency smoothing, where a new expression, described in Results, was used.

Subjects in experiments 1, 3 and 4 reclined in an audiometric test booth but remained awake. Gold cup electrodes were attached to alcohol-cleansed skin at vertex (positive), right mastoid (negative) and left mastoid (ground). Impedances were ≤ 2 kΩ at 30 Hz. Scalp potentials were amplified (10⁵) and filtered (10-1000 Hz, 6 dB/octave).

In experiment 1, the AM tone was delivered to all subjects at intensities of 10, 20, 30, and 40 dB (re: 0 dB nHL = 13 dB SPL) for a long data collection period (4096 sweeps of the stimulus period = 204.8 sec/run), and at intensities of 30, 40, 50, and 60 dB nHL for a short data collection period (1024 sweeps = 51.2 sec). Each run was initially collected as 64 subaverages (each of which contained the average of 16 sweeps for the short run, or 64 sweeps for the long run), and coherence was calculated for 40 Hz in the scalp potential. The data were then progressively collapsed into smaller numbers of subaverages (32, 16, 8, 4), and finally into two subaverages, each of which contained 512 (short) or 2048 (long) sweeps. Additional levels were presented as needed to each subject in order to pinpoint (to the nearest 5 dB) two border conditions: (1) the highest level at which significant 40 Hz coherence was found at no value of q, and (2) the lowest level at which significant 40 Hz coherence was found at all values of q. Group coherence-intensity functions were plotted for each level of q. Finally, 40 Hz coherence thresholds (the lowest intensities at which coherence estimates exceeded critical values) were determined for each value of q and compared to mean thresholds judged by visual inspection of the AEPs of each of the 10 subjects.

The pseudorandom signal in experiment 2 was mixed with an analog white noise (S/N approximately -40 dB), then fed back into the computer for A/D conversion and averaging (16 subaverages, 32 sweeps per subaverage). Off-line, the circular cross-correlation function was calculated between the stimulus and the grand average: first, the fast Fourier transform was calculated for both the input and output wave forms; second, the cross-spectral density function was calculated; and finally, an inverse Fourier transform yielded the circular cross-correlation function (for details, see Dobie and Wilson 1984). Coherence functions were calculated both without frequency smoothing and with varying degrees of frequency smoothing (l = 2, 4, 8, 16, 32). A system delay could be simulated by simply moving the stimulus wave form to the left (prior to calculation of cross-correlation and coherence), as if advancing it in time. For each of 3 simulated delays (1.28, 2.56, and 5.12 msec), the cross-correlation and coherence functions (smoothed and unsmoothed) were re-calculated using the same raw data.

In experiment 3, we collected scalp potentials from a single subject who listened to clicks (20/sec, 20 dB SL). Sixteen subaverages were collected (256 sweeps per subaverage), and coherence functions were calculated, with and without frequency smoothing (l = 4). In addition, coherence functions were recalculated after a time shift to compensate for wave V latency, and post hoc smoothing (m = 4) was also carried out.

Experiment 4 required only the collection of scalp data with no stimulus present. True coherence was then 0, and 1016 coherence estimates were calculated at each of several levels of q (4, 8, 16, 32, 64). As in a previous paper (Dobie and Wilson 1989), the 99th percentile in this distribution was measured. The sets of coherence estimates were then averaged in blocks of varying size (m = 2, 4, 8, 16, 32, 64) and the 99th percentiles of the resulting distributions were compared to theoretical predictions.
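The three-step cross-correlation procedure used in experiment 2 (FFT of input and output, cross-spectral density, inverse FFT), together with the simulated delay obtained by advancing the stimulus, can be sketched as below. This is a simplified illustration under stated assumptions: the function name, the 256-point buffer, the random-phase synthesis, and the noise-free zero-delay "system" are hypothetical choices, not details taken from the original software.

```python
import numpy as np

def circular_cross_correlation(stimulus, response):
    """r[lag] = sum_n stimulus[n] * response[(n + lag) mod N], computed via the FFT."""
    X = np.fft.fft(stimulus)          # step 1: transform input and output
    Y = np.fft.fft(response)
    cross_spectrum = np.conj(X) * Y   # step 2: cross-spectral density (unnormalized)
    return np.real(np.fft.ifft(cross_spectrum))  # step 3: back to the lag domain

# Octave-band pseudorandom signal: 32 equal-amplitude sinusoids at harmonics
# 33..64 of the buffer's fundamental, with random phases (assumed here).
rng = np.random.default_rng(1)
n_points = 256
spectrum = np.zeros(n_points // 2 + 1, dtype=complex)
spectrum[33:65] = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, 32))
stimulus = np.fft.irfft(spectrum, n=n_points)
response = stimulus.copy()  # idealized zero-delay system, noise omitted

# Simulate a system delay by moving the stimulus to the left (advancing it).
delay_samples = 16
advanced_stimulus = np.roll(stimulus, -delay_samples)
xcorr = circular_cross_correlation(advanced_stimulus, response)
print("lag of cross-correlation peak (samples):", int(np.argmax(xcorr)))
```

Because the correlation is circular, shifting the stimulus only relocates the peak; the shape of the function is otherwise unchanged.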

Results

Fig. 1 shows replicated time-domain responses of one subject to the AM tone at various intensities, for both the long and short data collection periods. Especially at higher intensities, the 40 Hz response, with peaks separated by about 25 msec, is apparent.

Fig. 1. Intensity series of averaged time-domain responses, with replication, of one subject to an AM stimulus (500 Hz carrier with 40 Hz modulation). The left panel displays responses to the 'short' data collection period, and the right shows responses to the 'long' period. The 40 Hz response, with peaks separated by about 25 msec, is clearly seen at higher intensities. Additional responses (not shown) were obtained at 5 dB steps as needed to establish threshold.

The data collected for each intensity level for each subject were sequentially analyzed to yield 40 Hz coherence for several levels of q. These coherence values are plotted in Fig. 2. Recall that N (= 1024 for short run, = 4096 for long run) is the total number of sweeps, not the number of sweeps per subaverage. Thus, the open triangles in Fig. 2 all represent average coherence values for 51.2 sec data collection periods (1024 × 50 msec/sweep). As expected, coherence rises with intensity and is greater for the long data collection period for any intensity level. Both coherence levels and critical values decline as q (the number of subaverages) increases. For q = 2, the critical value is so high that very few coherence estimates can exceed it, even for high intensities. At q = 8 and higher, average coherence values for 30 dB (short run) and 20 dB (long run) are close to critical value.

Fig. 2. Group mean (10 subjects) 40 Hz response coherence-intensity functions for 2 recording durations. The horizontal dashed lines indicate the critical value (P < 0.01) at each value of q. Error bars indicate 1 S.D. N = total number of sweeps = 1024 or 4096; n = number of sweeps per subaverage = N/q.

Coherence thresholds and visual detection thresholds are plotted in Fig. 3. For both the short and long data collection periods, thresholds decline with increasing q, then plateau for q above 8, with coherence thresholds more sensitive than visual detection thresholds.

Fig. 3. Group mean (10 subjects) coherence thresholds as a function of q for 2 recording durations, and mean threshold judged by visual examination of averaged time-domain AEPs. Error bars indicate 1 S.D.

Fig. 4a shows the cross-correlation function for the pseudorandom signal without any delay. There is a sharp correlation peak at τ = 0. The remainder of Fig. 4 (b, c, d) shows cross-correlation with various delays added. Just as the cross-correlation function estimates the impulse response of a system (assuming a spectrally flat input), the Fourier transform of the cross-correlation function estimates the system's transfer function. The phase plots of the Fourier-transformed functions of Fig. 4 are shown as Fig. 5, with the frequency region of the signal shaded. With zero delay, there is a strong phase clustering within this band (at the upper end, near 3 kHz, several phase values 'wrap around' from -180° to +180°). However, with increasing time shift of the cross-correlation function, representing increasing system delay, a progressively greater phase-frequency slope is seen. If one were averaging cross-spectral density functions for sets of 4 adjacent frequencies, a delay of 5.12 msec would lead to cancellation ('smearing') due to rapid phase change.

Coherence functions for the data of Figs.
4 and 5 are shown as Fig. 6. The unsmoothed function at the top (l = 1) strongly suggests that there is significant coherence between 1.5 and 2.5 kHz, with a single (probably spurious) peak just below 5 kHz. When this function is smoothed (l = 4), the spurious peak disappears and the significant region of the function (now identical to the bandwidth of the signal, 1.6-3.1 kHz) is sharply set off from surrounding spectral regions. Note that the critical value is much lower (0.072 vs. 0.266 for the unsmoothed function). However, the dashed and dotted curves on the same graph show the deleterious effect of frequency smoothing when the system's impulse response contains a delay: the significant part of the coherence function declines until, for a 5.12 msec delay, there are no significant coherence values. When a greater degree of smoothing is chosen (l = 16), even a 1.28 msec delay destroys the coherence function.

Fig. 4. Cross-correlation functions for a narrow-band pseudorandom signal without any delay (a), and with various delays added. Note that the functions are identical except for being 'wrapped' by an amount equal to the delay.

Fig. 5. Phase spectra of the cross-correlation functions in Fig. 4. The shaded area indicates the bandwidth of the pseudorandom stimulus. Note the increased rate of phase change as a function of longer delays.

Fig. 6. Coherence functions of data shown in Figs. 4 and 5. The top trace is unsmoothed. Limited smoothing of the cross-spectra of the subaverages yields coherence functions (middle panel) in which the pattern is enhanced within the bandwidth of the pseudorandom noise stimulus, but increasing time shifts degrade it. Wider frequency band smoothing (bottom panel) is degraded more rapidly by time shifting. Horizontal lines indicate the critical value (P < 0.01) for each ql condition. The number of subaverages (q = 16) and the number of sweeps per subaverage (n = 32) were the same for each panel; the total number of sweeps (N = qn) was 512.

Experiment 3 demonstrated that the problem shown in Figs. 4-6 can be at least partially overcome if one knows the system delay. Fig. 7 shows a subject's scalp response to 20/sec clicks: both an ABR (waves III and V) and MLR (wave Pa) are clearly apparent. The unsmoothed coherence function (Fig. 8, top) shows significant coherence in the 100 Hz and 500 Hz regions; however, conventional frequency smoothing (l = 4, middle panel, solid line) results in a total loss of coherence. If the stimulus wave form is time-shifted relative to the subaverages (or vice versa) by an amount equal to the latency of wave V, the phase-frequency slopes of the transfer function become more shallow, and the resulting coherence function not only retains significant coherence below 500 Hz (dashed line), but shows a clearer separation between coherent and non-coherent spectral regions than the unsmoothed function. Of course, without knowing how much compensatory shift to introduce, we cannot repair 'smeared' coherence functions.

An alternative is to average coherence values after they have been calculated, which we have called post hoc or 'm'-smoothing. When the unsmoothed coherence function at the top of Fig. 8 is smoothed in this fashion (Fig. 8, bottom panel, m = 4), the result seems to be not as good as the time-shift corrected function for l = 4 (dashed line), but it is clearly much better than the 'l'-smoothed version without shift correction, and is also better than the unsmoothed coherence function. The critical value for this function has been set at 0.135 on theoretical grounds, as described (and empirically tested) below.

Carter et al. (1973) proposed that for ql > 32, the bias and variance of coherence estimates in the absence of any signal (true coherence = 0) could be very simply stated:

$$\text{bias} = \frac{1}{ql} \tag{4}$$

$$\text{variance} = \frac{1}{(ql)^{2}} \tag{5}$$

If critical value is equal to bias plus z standard deviations, determined by a desired significance level α, then

$$\text{critical value}_{\alpha} = \frac{1 + z}{ql} \tag{6}$$
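Equations (4)-(6) can be evaluated directly, as in the sketch below. Two assumptions are made that the text does not state explicitly: z is taken as the one-tailed standard normal deviate for significance level α, and the function names are hypothetical. Brillinger's expression from equation (3), with ql in place of q, is included only for side-by-side comparison.

```python
from statistics import NormalDist

def carter_critical_value(alpha, q, l=1):
    """Equation (6): bias plus z standard deviations, with bias = sd = 1/(ql)."""
    ql = q * l
    z = NormalDist().inv_cdf(1.0 - alpha)   # one-tailed z score (assumption)
    bias = 1.0 / ql                         # equation (4)
    std_dev = (1.0 / ql ** 2) ** 0.5        # square root of equation (5)
    return bias + z * std_dev

def brillinger_critical_value(confidence, q, l=1):
    """Equation (3) with ql substituted for q."""
    return 1.0 - (1.0 - confidence) ** (1.0 / (q * l - 1))

for ql in (64, 128):
    carter = carter_critical_value(0.01, ql)
    brillinger = brillinger_critical_value(0.99, ql)
    print(f"ql = {ql}: normal approximation = {carter:.4f}, Brillinger = {brillinger:.4f}")
```

Under these assumptions the normal approximation gives the smaller critical value at each ql, which is in line with the earlier remark that distributions based on a normal assumption were slightly too liberal.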

Fig. 8. Effects of smoothing on 'latency-corrected' AEPs. Top: unsmoothed coherence function from grand average AEP in Fig. 7. Center: frequency smoothing (l) leads to degraded coherence due to rapid rate of phase change, but significant regions are restored by time shifting (equal to latency of wave V). Bottom: post hoc smoothing of the unshifted coherence function shown in top panel; significant response regions are not affected by phase change, so m-smoothing enhances contrast without requiring time shifting prior to calculation of the coherence function (q = 16; n = 256; N = qn = 4096).

If we elect to smooth coherence values after they have been calculated (m > 1), rather than smoothing cross-spectral estimates (l > 1), the bias of the resultant averaged coherence values will be unchanged but the variance should be

$$\text{variance} = \frac{1}{q^{2}m} \tag{7}$$

and

$$\text{critical value}_{\alpha} = \frac{1}{q} + \frac{z}{q\sqrt{m}} \tag{8}$$

The distributions of m-smoothed coherence estimates collected in the absence of any stimulus (experiment 4) agreed relatively well with this prediction (see Fig. 9). Empirical 99th percentiles were lower than predicted for q = 4, higher for q = 8 and 64, and about the same for q = 16 and 32.

Fig. 7. Averaged AEP from one subject to 60 dB nHL clicks at 20/sec. ABR waves III and V are evident, as is wave Pa of the MLR.

Fig. 9. Distributions of post hoc m-smoothed coherence estimates for different values of q. Solid lines are theoretical, using our modification of the formula of Carter et al. (1973). Scatterplots indicate empirical data obtained from 'no-stimulus' recordings.
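The prediction for post hoc smoothing (bias of 1/q plus z standard deviations, with standard deviation 1/(q√m)) can be compared with a small Monte Carlo simulation of the no-stimulus condition. The sketch below is illustrative only and rests on several assumptions: spectral estimates are modeled as independent complex Gaussian noise across subaverages and across the m adjacent frequencies, z is the one-tailed normal deviate, and all function names and the number of simulated trials are hypothetical (experiment 4 itself used recorded scalp data).

```python
import numpy as np
from statistics import NormalDist

def m_smoothed_critical_value(alpha, q, m):
    """Bias 1/q plus z standard deviations of a mean of m estimates."""
    z = NormalDist().inv_cdf(1.0 - alpha)   # one-tailed z score (assumption)
    return 1.0 / q + z / (q * np.sqrt(m))

def simulate_null_m_smoothed(q, m, n_trials, rng):
    """Average m independent null coherence estimates, each from q subaverages."""
    shape = (n_trials, m, q)
    noise = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    # Periodic-stimulus estimator: grand-average power / mean subaverage power.
    coherence = np.abs(noise.mean(axis=2)) ** 2 / (np.abs(noise) ** 2).mean(axis=2)
    return coherence.mean(axis=1)

rng = np.random.default_rng(0)
q, m = 16, 4
estimates = simulate_null_m_smoothed(q, m, n_trials=20000, rng=rng)
predicted = m_smoothed_critical_value(0.01, q, m)
print("predicted critical value:", round(float(predicted), 4))
print("simulated 99th percentile:", round(float(np.percentile(estimates, 99)), 4))
```

For q = 16 and m = 4 the predicted value matches the 0.135 used for the bottom panel of Fig. 8. The simulated percentile should be read only as a rough check, since the idealized noise model need not match real EEG.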

Discussion

Coherence estimates must be smoothed across multiple 'segments' (subaverages), frequencies, or both. For the purpose of identifying significant coherence near threshold, our data suggest that 8-16 subaverages will suffice. Larger numbers of subaverages increase computational time without appreciably improving response threshold.

Determining the optimal degree of frequency smoothing is more complex. Conventional frequency smoothing (which requires complex averaging of cross-spectral estimates for adjacent frequencies) requires a very fine frequency resolution relative to the rate of change of phase of the evoked potential. If one knows the time delay (latency) of the potential being sought, one can correct for this delay and reduce the phase slope. However, if we wish to use coherence estimates to detect evoked potentials at or near threshold, we cannot know precisely how much delay to correct for in a given individual.

Post hoc frequency smoothing is a simpler alternative. Although coherence variance (and critical value) is not reduced as much as for conventional frequency smoothing, no assumptions are required regarding phase rate of change. Furthermore, critical values are easy to estimate based on the number of subaverages (q) and the number of adjacent frequencies (m) for which coherence values are averaged (eq. 8). With post hoc frequency smoothing, we need only ensure that the degree of smoothing does not obscure the spectral details of the coherence function.

The authors wish to express their thanks to Brandon Warren and Chris Prall, scientific programmers, for software development, and to Rita Willits for clerical assistance.

References

American National Standards Institute. American National Specifications for Audiometers, S3.6-1969. ANSI, New York, 1970.

Amos, D.E. and Koopmans, L.H. Tables of the Distribution of the Coefficient of Coherence for Stationary Bivariate Gaussian Processes. Monograph SCR-483. Sandia Corp., Albuquerque, NM, 1963.

Bendat, J.S. and Piersol, A.G. Random Data. Analysis and Measurement Procedures, 2nd Edn. Wiley, New York, 1986.

Brillinger, D.R. A note on the estimation of evoked response. Biol. Cybernet., 1978, 31: 141-144.

Carter, G.C., Knapp, C.H. and Nuttall, A.H. Estimation of the magnitude-squared coherence function via overlapped fast Fourier transform processing. IEEE Trans. Audio Electroacoust., 1973, 21: 337-344.

Dobie, R.A. and Wilson, M.J. Short-latency auditory responses obtained by cross correlation. J. Acoust. Soc. Am., 1984, 76: 1411-1421.

Dobie, R.A. and Wilson, M.J. Analysis of auditory evoked potentials by magnitude-squared coherence. Ear Hear., 1989, 10: 2-13.

Tucci, D.L., Wilson, M.J. and Dobie, R.A. Coherence analysis of scalp responses to amplitude-modulated tones. Acta Otolaryngol. (Stockh.), 1990, 109: 195-201.

Wilson, M.J. and Dobie, R.A. Human short-latency auditory responses obtained by cross-correlation. Electroenceph. clin. Neurophysiol., 1987, 66: 529-538.