A system to detect the onset of epileptic seizures in scalp EEG


Clinical Neurophysiology 116 (2005) 427–442 www.elsevier.com/locate/clinph

M.E. Saab, J. Gotman*
Department of Neurology and Neurosurgery, Montreal Neurological Institute, McGill University, 3801 University Street, Montreal, Que., Canada H3A 2B4
Accepted 4 August 2004; available online 18 September 2004

Abstract

Objective: A new method for automatic seizure detection and onset warning is proposed. The system is based on determining the seizure probability of a section of EEG. Operation features a user-tuneable threshold to exploit the trade-off between sensitivity and detection delay and an acceptable false detection rate.

Methods: The system was designed using 652 h of scalp EEG, including 126 seizures in 28 patients. Wavelet decomposition, feature extraction and data segmentation were employed to compute the a priori probabilities required for the Bayesian formulation used in training, testing and operation.

Results: Results based on the analysis of separate testing data (360 h of scalp EEG, including 69 seizures in 16 patients) initially show a sensitivity of 77.9%, a false detection rate of 0.86/h and a median detection delay of 9.8 s. Results after use of the tuning mechanism show a sensitivity of 76.0%, a false detection rate of 0.34/h and a median detection delay of 10 s. Missed seizures are characterized mainly by subtle or focal activity, mixed frequencies, short duration or some combination of these traits. False detections are mainly caused by short bursts of rhythmic activity, rapid eye blinking and EMG artifact caused by chewing. Evaluation of the traditional seizure detection method of [Gotman J. Electroencephalogr Clin Neurophysiol 1990;76:317–24] using both data sets shows a sensitivity of 50.1%, a false detection rate of 0.5/h and a median detection delay of 14.3 s.

Conclusions: The system performed well enough to be considered for use within a clinical setting. In patients having an unacceptable level of false detection, the tuning mechanism provided an important reduction in false detections with minimal loss of detection sensitivity and detection delay.

Significance: During prolonged EEG monitoring of epileptic patients, the continuous recording may be marked where seizures are likely to have taken place. Several methods of automatic seizure detection exist, but few can operate as an on-line seizure alert system. We propose a seizure detection system that can alert medical staff to the onset of a seizure and hence improve clinical diagnosis.

© 2004 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.

Keywords: EEG; Epilepsy; Seizure detection; Seizure onset

1. Introduction

Since the first general-purpose detection methods were introduced in the 1980s, automatic seizure detection has become increasingly important in the long-term monitoring process. While still not sufficient as the sole indicator of patient seizures, seizure detection technology in combination with the push-button system and video review supplies

* Corresponding author. Tel.: +1 514 398 1953; fax: +1 514 398 8106. E-mail address: [email protected] (J. Gotman).

an efficient means of capturing and recording a large fraction of epileptic seizures. The challenge in designing a clinically useful automatic seizure detector is a significant one. In essence, the design of a method to detect epileptic seizures can be compared to the design of a pattern recognition algorithm for which there is no standard, pre-defined pattern. Seizures are manifested in the EEG by a tremendous array of characteristics and for a given patient there is no a priori certainty which characteristics may or may not appear. To ensure the maximum performance of a clinical seizure detector, Gotman et al. (1997) suggest

1388-2457/$30.00 © 2004 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved. doi:10.1016/j.clinph.2004.08.004


the following criteria for the selection of data used in the design:

• Two data sets should be used, one for training and the other for testing;
• Large sample sizes should be obtained;
• Data should in no way be pre-selected;
• Data from a number of sources should be employed.

It is also advantageous for the method to be designed to seek out very general characteristics as opposed to predefined seizure morphologies. In general, the EEG of most seizures has frequencies between 3 and 29 Hz (Gotman, 1982), and involves rhythmic discharges and/or increased amplitude. The present use of these guidelines is enough to characterize and successfully detect 70–90% of clinical seizures with false detection rates ranging from 0.3 to 3/h (Gabor, 1998; Gotman, 1982; 1990; Khan and Gotman, 2003; Pauri et al., 1992; Salinsky, 1997).

The seizure detection method proposed here is based on the probability that a portion of EEG contains seizure activity. It is designed specifically for use with scalp EEG. This is necessary to account for the many differences between this type of recording, which contains many different types of artifacts, and intracranial EEG, which is free of these artifacts, but contains seizures of greater variety and higher frequency. As far as we know, this is a new approach that we hope will eventually lead to a more versatile clinical seizure detection system.

The method is also designed with an additional application in mind. As well as helping in the long-term monitoring of patients by automatically marking the EEG recording, the system is intended to serve as an on-line warning mechanism to alert medical staff, and possibly the patients, that an epileptic seizure has begun.

1.1. Detection of seizure onset

Medical staff intervention during seizure activity is an integral part of the evaluation of an epileptic patient.
Information that can help localize an epileptic focus can be ascertained by talking and interacting with patients while they are having a seizure. The patient’s awareness and their ability to respond, to retain and repeat information, or to make certain movements, for example, can greatly help in determining which brain functions, and hence which areas of the brain, are affected by the seizure. This intervention often comes when it is too late and the patient is unresponsive. An automatic seizure detection system designed to alert medical staff to the onset of seizures allows the dialogue to take place while the patient is still alert and while the seizure may still be localized. It acts the same way as a bedside push-button used in long-term monitoring (alerting medical staff while at the same time marking the EEG recording), but potentially more effectively. The system relies on EEG information rather

than the patient’s ability to sense a seizure and push the alert button, which many patients cannot do. This is also beneficial when considering purely electrographic seizures which exhibit no apparent behavioral manifestations. According to Ives and Woods (1980), these represent 30% of seizures. Patients with this type of seizure are rarely, if ever, tested during the ictal state simply because patient and observer alike are unaware that a seizure is taking place. Although brain function might be impaired during such a seizure, this is most often unexplored. In such a system, false detection rate is an important statistic. When the detector marks the location of seizure activity in the EEG, inaccurate markings are simply discarded by the reviewer. In an onset detection system designed to alert medical staff, if the false detection rate were too high, the alerting mechanism would likely be switched off by medical staff frustrated by its inaccuracy.

2. Method

The proposed system of automatic seizure onset detection is designed to be used on-line within a long-term monitoring facility. It employs a Bayesian formulation to output a variable based on the probability that a section of EEG contains seizure activity. It also features tuneable operation, such that a user may benefit from the trade-off between sensitivity and false detection rate without altering any of the internal components of the system. Finally, the system is designed to attempt to detect seizures as close to their onset as possible.

2.1. Data selection

All the surface EEG data used in this study were collected from the Epilepsy Telemetry Unit at the Montreal Neurological Institute and Hospital, using the Stellate Harmonie system for EEG monitoring (Stellate, Montreal, Canada). Data were sampled at 200 Hz after filtering between 0.5 and 70 Hz and a bipolar electrode montage of either 24 or 32 channels was used in the analysis.

Patients were chosen based solely on the criterion that they experienced at least two seizures while undergoing long-term monitoring. This ensured that pre-selection of specific seizure patterns was avoided. Clinical and sub-clinical seizures were both included (seizures were not selected on the basis of clinical manifestations). A 4–6 h recording was kept around each seizure and included any additional seizures which happened to have occurred within that time. Separate non-seizure recordings were also kept for each patient. These consisted of a 4 h seizure-free recording made while the patient was asleep, and a separate 4 h seizure-free recording made while the patient was awake. Thus, the total data kept for each patient were at least two


4–6 h recordings containing seizures and two additional 4 h seizure-free recordings.

Training data were collected between December, 2000 and September, 2002. No pre-selection of patient seizures was done except in the case of two patients whose seizures were not accompanied by any visible EEG alteration. These seizures were rejected because they would serve no purpose in training the automatic seizure detector. This left 28 patients (of the original 30) having a total of 126 seizures during 652 h of recorded surface EEG.

Testing data were collected between November, 2003 and February, 2004 with no pre-selection of seizure patterns. Data from 16 consecutive patients were obtained, including 69 seizures during 360 h of EEG. An additional patient was excluded from the testing data statistics because the nature of the seizures made it impossible to characterize detections and estimate detection delay in a congruent way. The seizures in this patient were extremely long (some over an hour) and often no clear distinction between seizure and non-seizure sections existed, making it difficult to demarcate seizure onsets. The performance of the system on this patient’s EEG will be reported, but it will be treated as a special case.

2.2. Signal processing and feature extraction

The 5-level wavelet transform, using a Daubechies-4 wavelet, is performed separately on each 2 s epoch of data in each channel (refer to Burrus et al., 1998; Daubechies, 1992 for more information regarding wavelet transforms). Frequency bands of 50–100, 25–50, 12–25, 6–12, and 3–6 Hz are created (corresponding to decomposition scales 1–5) as well as the 0–3 Hz band, which is not used. This band is discarded because occurrences of 0–3 Hz activity in nonictal sleep EEG can be frequent. Also, seizures with activity in the 2–3 Hz range are usually not limited to that range throughout the course of the seizure.
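The band edges quoted above follow from the 200 Hz sampling rate: each decomposition scale nominally halves the band of the previous one. The short sketch below only reproduces that arithmetic for an idealized dyadic decomposition; the function name `dyadic_bands` is hypothetical, and real Daubechies-4 filters overlap at the band edges, so the values are nominal (the text rounds 12.5, 6.25 and 3.125 Hz to 12, 6 and 3 Hz).

```python
def dyadic_bands(fs_hz, levels):
    """Nominal (low, high) band edges in Hz for detail scales 1..levels,
    plus the remaining approximation band, of an ideal dyadic decomposition."""
    nyquist = fs_hz / 2.0
    details = {}
    for scale in range(1, levels + 1):
        high = nyquist / 2 ** (scale - 1)
        low = nyquist / 2 ** scale
        details[scale] = (low, high)
    approximation = (0.0, nyquist / 2 ** levels)
    return details, approximation


# 200 Hz sampling and 5 levels, as stated in the text.
details, approx = dyadic_bands(200.0, 5)
for scale, (low, high) in details.items():
    print(f"scale {scale}: {low:g}-{high:g} Hz")
print(f"approximation: {approx[0]:g}-{approx[1]:g} Hz (not used by the detector)")
```

At 200 Hz, each 2 s epoch holds 400 samples per channel.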
Seizure activity is characterized by scales 3, 4 and 5 since it is most often between 3 and 29 Hz (Gotman, 1982). The information in scales 1 and 2 is used to estimate the amount of EMG artifact present in the EEG and the information in scale 4 is used to characterize alpha activity. This will be explained below. The 3 characterizing measures for the EEG are derived directly from the wavelet coefficients. These features were taken from the detection algorithm designed by Khan and Gotman (2003), namely relative average amplitude, relative scale energy and coefficient of variation of amplitude. The relative average amplitude is the ratio of the mean of peak-to-peak amplitudes in the current epoch to the mean of peak-to-peak amplitudes in what is termed the background. The background is defined as a 30 s block of data ending 1 min earlier than the current epoch. This allows for the amplitudes of the first minute of potential seizure EEG to be compared to what is almost certainly non-seizure EEG. Updating the background for each new epoch rather than using a fixed background, for example, from the beginning


of the recording, ensures that it represents the current state of the EEG at all times. The waveforms are first decomposed using the segment decomposition method of Gotman and Gloor (1976). The resulting segments, defined as single line connections between two local extrema in the waveform, are used to calculate the peak-to-peak amplitudes needed for this feature. Temporal continuity is maintained in the segment decomposition by using the last segment peak of an epoch as the first segment peak of the following epoch.

Relative scale energy is defined as the ratio of the energy in the coefficients of a given scale to the energy of the coefficients in all scales. In other words, the relative energy of a given wavelet scale is essentially the proportion of signal energy contained in that scale. This feature is directly calculated from the wavelet coefficients and not from the amplitudes of the segment decomposition. It serves as a measure of rhythmicity as a sustained elevated value in one scale indicates a somewhat constant frequency in the signal.

The coefficient of variation of amplitude is defined as the square of the ratio of the standard deviation to the mean of the peak-to-peak amplitudes. The waveform decomposition method is used in this calculation as well. This feature serves as a measure of variability of the signal amplitude. A low value indicates little variation.

The use of relative measures for all features serves to remove any dependence on inter-patient and inter-system variability. Absolute EEG amplitudes are influenced by several variables such as the electrodes used, the data acquisition and amplification system, and even the patient himself. By using relative measures, the parameters of the system can be held constant for all patients within all recording systems.

2.3. Bayes’ formula

Bayes’ formula is used to estimate the conditional probability of a certain outcome given an observed event, based on experimental results.
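Before the formula is applied, it may help to see the three features of Section 2.2 in concrete form. The sketch below is a simplified illustration, not the authors' implementation: `segment_amplitudes` is a bare local-extrema stand-in for the Gotman and Gloor (1976) segment decomposition, the population standard deviation is an assumption (the text does not say which estimator is used), and all function names are hypothetical.

```python
from statistics import mean, pstdev


def segment_amplitudes(samples):
    """Peak-to-peak amplitudes of the segments joining successive local extrema."""
    extrema = [samples[0]]
    for prev, cur, nxt in zip(samples, samples[1:], samples[2:]):
        if (cur - prev) * (nxt - cur) < 0:  # slope changes sign at cur
            extrema.append(cur)
    extrema.append(samples[-1])
    return [abs(b - a) for a, b in zip(extrema, extrema[1:])]


def relative_average_amplitude(epoch, background):
    """Mean segment amplitude of the epoch over that of the background block."""
    return mean(segment_amplitudes(epoch)) / mean(segment_amplitudes(background))


def relative_scale_energy(coeffs_by_scale, scale):
    """Share of the total coefficient energy that falls in one scale."""
    energy = {s: sum(c * c for c in cs) for s, cs in coeffs_by_scale.items()}
    return energy[scale] / sum(energy.values())


def coefficient_of_variation(amplitudes):
    """Square of (standard deviation / mean) of the peak-to-peak amplitudes."""
    return (pstdev(amplitudes) / mean(amplitudes)) ** 2


# Toy signals: the epoch oscillates with twice the background amplitude.
epoch = [0, 4, 0, 4, 0]
background = [0, 2, 0, 2, 0]
print(relative_average_amplitude(epoch, background))
print(coefficient_of_variation(segment_amplitudes(epoch)))
```

A perfectly regular oscillation gives a coefficient of variation of zero, which matches the text's reading that a low value indicates little amplitude variation.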
Applied to the problem of seizure detection, ‘outcome’ refers to whether we are seeing seizure activity or not and ‘observed event’ refers to the measured features. Thus Bayes’ formula in this case is given by

$$P(\text{seizure} \mid \text{features}) = \frac{P(\text{features} \mid \text{seizure})\,P(\text{seizure})}{P(\text{features})} \tag{1}$$

and

$$P(\text{non-seizure} \mid \text{features}) = \frac{P(\text{features} \mid \text{non-seizure})\,P(\text{non-seizure})}{P(\text{features})} \tag{2}$$
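As a numerical illustration of Eqs. (1) and (2), the sketch below computes both posteriors from the likelihoods and the seizure prior. The denominator P(features) is expanded by the law of total probability. The function name and the input values are hypothetical and chosen only to show the arithmetic.

```python
def posteriors(p_feat_given_sz, p_feat_given_non, p_sz):
    """Return (P(seizure|features), P(non-seizure|features)) via Bayes' formula."""
    p_non = 1.0 - p_sz
    # Law of total probability gives the shared denominator P(features).
    p_features = p_feat_given_sz * p_sz + p_feat_given_non * p_non
    if p_features == 0.0:
        raise ValueError("feature combination never observed in training data")
    return (p_feat_given_sz * p_sz / p_features,
            p_feat_given_non * p_non / p_features)


# Hypothetical values: a feature combination seen in 20% of seizure epochs and
# 0.1% of non-seizure epochs, with seizure epochs making up 1% of all epochs.
p_seizure, p_non_seizure = posteriors(0.2, 0.001, 0.01)
print(round(p_seizure, 3), round(p_non_seizure, 3))
```

Even with a small prior, a feature combination that is far more common in seizure data than in non-seizure data yields a posterior well above one half.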

The terms in Eq. (1) are described by the following (similarly for the non-seizure case described by Eq. (2)):

1. P(seizure|features): the probability that a particular set of measured features describe seizure activity.
2. P(features|seizure): the probability that seizure activity is described by a particular set of measured features (based on analysis of the training data).
3. P(seizure): the probability that a seizure is found in the EEG (based on analysis of the training data).
4. P(features): the probability that a particular set of measured features arise in the EEG (based on analysis of the training data).

Put simply, Bayes’ formula allows us to describe the behavior of an experiment based on how it has behaved in the past. In our system, the a priori probabilities (right-hand side of Eq. (1)) were derived using the training data and the a posteriori probabilities (left-hand side of Eq. (1)) serve as the indicator of seizure activity in the EEG.

2.4. Training

The a priori probabilities were found by observing the system and noting the outcome–event relationships that arose in the training data. The EEG recordings were first separated into seizure and non-seizure sections. Seizure onset was defined as the first EEG change that led to a clear seizure discharge without returning to background, independently of the form of this change. The data from seizure and non-seizure sections were stored separately and histograms were built for each feature. For each feature in each wavelet scale, the seizure data was divided evenly into 5 bins so that each contained the same number of data points. Every epoch was characterized jointly by 3 feature values such that the epoch could fall within one of 125 bins

(one of 5 for relative amplitude, one of 5 for relative energy and one of 5 for coefficient of variation). The ratio of these bin counts to the total number of epochs in the seizure sections, in other words the observed proportion of occurrences, served as the a priori probabilities required for term 2 of the Bayesian formulation.

The same operation was performed for non-seizure data; however, the distributions were divided into the same 5 sections derived from the seizure data. Seizure data were found to include a broader range of feature values than non-seizure data. Using the bin boundaries derived from the seizure data ensured that the bin counts adequately reflected the differences in the distributions of seizure and non-seizure data.

This procedure is outlined graphically in Fig. 1. Examples of data distributions for all 3 features in both seizure and non-seizure data are shown for a given scale. An example of a calculation of a priori term 2 is shown. The equation states that the probability of finding an epoch whose 3 feature values fall into that particular combination of bins in seizure data is given by the number of epochs with that combination of bins divided by the total number of seizure epochs. A similar calculation was made for all 125 combinations of bins, for both seizure and non-seizure data. This accounted for every epoch that was processed.

It is important to note that certain bins (the high ranges for amplitude and energy and the low range for coefficient of variation) are filled with more seizure data than non-seizure data. This is what allowed higher probability values to be derived for epochs which were characterized by seizure-like activity.

In Eq. (1), term 3 of Bayes’ formula was estimated using the ratio of the number of seizure epochs to the total number of epochs. In Eq. (2), term 3 was given by the ratio of the number of non-seizure epochs to the total number of epochs. Term 4, which is the same in Eqs. (1) and (2), was found

Fig. 1. Graphic representation of the data distribution analysis. Shown is an example of an a priori probability calculation for the features of seizure epochs that fall into bins E, D and A, respectively.
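The calculation that Fig. 1 depicts can be sketched in a few lines. This is an illustration under assumptions, not the authors' code: the edge-picking rule in `equal_count_edges` is one simple way to obtain equal-count bins, the function names are hypothetical, and the toy data stand in for real feature values.

```python
from collections import Counter


def equal_count_edges(values, n_bins=5):
    """Interior bin edges chosen so each bin holds about the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // n_bins] for k in range(1, n_bins)]


def bin_index(value, edges):
    """Index (0 .. len(edges)) of the bin that contains value."""
    return sum(value >= edge for edge in edges)


def joint_probabilities(epochs, edges_per_feature):
    """Observed proportion of epochs in each joint bin (one bin index per feature)."""
    counts = Counter(
        tuple(bin_index(v, e) for v, e in zip(epoch, edges_per_feature))
        for epoch in epochs
    )
    return {key: count / len(epochs) for key, count in counts.items()}


# Toy seizure data for one feature: ten values give edges with two values per bin.
edges = equal_count_edges(list(range(1, 11)))
print(edges)

# Each epoch carries 3 feature values, so it falls in one of 5 * 5 * 5 = 125 bins.
seizure_epochs = [(1, 1, 1), (10, 10, 10), (10, 10, 1), (10, 10, 1)]
p_features_given_seizure = joint_probabilities(seizure_epochs, [edges] * 3)
print(p_features_given_seizure[(4, 4, 0)])
```

Non-seizure epochs would be passed through `joint_probabilities` with the same seizure-derived edges, as the text describes.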


Information from scale 3 (12–25 Hz) is omitted to avoid any overlap between seizure and EMG.

using the following equation:

$$P(\text{features}) = P(\text{features} \mid \text{seizure})\,P(\text{seizure}) + P(\text{features} \mid \text{non-seizure})\,P(\text{non-seizure}) \tag{3}$$

2.5. Artifacts

2.5.1. Alpha activity

Although alpha activity is not an EEG artifact, it is discussed under this heading because it is a frequent cause of false detections, the other major causes of which are artifacts. The onset of alpha activity can produce seizure probabilities very similar to those produced by seizure activity and hence cause false detections. This is due to the overlap in the frequency ranges (3–25 Hz for seizure and 8–13 Hz for alpha) as well as the fact that some seizure morphologies resemble alpha activity.

To remedy this, alpha activity was characterized using the same process of data distributions and Bayesian formulation used in the characterization of seizure data. The features from approximately 56 min of alpha activity (EEG containing alpha activity free of EMG, EOG and any other artifacts) originating in 6 of the 28 patients as well as from a total of 24 h of non-alpha activity (EEG containing anything regularly found in EEG, excluding alpha activity and seizures) from 13 of the 28 patients were stored and analyzed. Only features from scale 4 (6–12 Hz) were used. The result was to derive the probability that measured features described alpha activity. This probability accompanies the seizure probability in the final decision process.

2.5.2. EMG

The EMG signal caused by contractions of the scalp and neck muscles interferes greatly with the EEG. According to O’Donnell et al. (1974), less than a third of the total signal power falls within the 3–25 Hz range, where seizure activity is found, and ‘muscle activity which occurs with contraction is not directly reflected in the EEG activity below 14 Hz’ (O’Donnell et al., 1974).
For these reasons, the presence of EMG is taken into account by scaling the seizure probability by a factor related to the amplitude of the activity present in scales 1 and 2 (25–100 Hz), as explained below.

2.5.3. Electrode failure

Artifacts are introduced when electrodes become loosened or disconnected. During long-term monitoring, patients can be eating, sleeping, frequently moving, etc. and technologists are not always present to rectify electrode problems as they arise. The system is designed to ignore epochs which contain one of 3 possible symptoms of electrode failure. The first is abnormally high signal amplitude (greater than 1000 µV) caused by movement of the electrode or amplifier disconnection and reconnection. The second is 60 Hz activity above 20 µV (measured by spectral analysis). The third is an effect characterized by a phase reversal in channels containing the same loose electrode (Fig. 2). The channels are added and the mean amplitude of their sum is compared to the mean amplitude of the first of the two channels. The epoch is rejected if the mean amplitude of the sum is less than half of the mean amplitude of the first channel, signifying that the two original channels are of similar amplitude and of opposite polarity.

All of the above parameters were optimized during training. They were chosen to maximize electrode artifact rejection while ensuring that they caused no seizures to be missed in the training data.

2.6. Detection variables

In a single channel, the a posteriori probability found for scale i is labeled PSEZ_i. The channel probability PSEZ_CHANNEL is found by summing PSEZ_i for scales 3–5. The 6 channels with the highest probabilities are summed to give PSEZ_EPOCH. Limiting the number of channels that contribute to the epoch probability to 6 serves to provide a large number of seizures with an equal chance of being detected. For example, the sum of all channel probabilities in a generalized seizure would be much greater than that of a partial seizure occurring in only a subset of the channels.
In contrast, the sum of the high 6 channels in each would likely be comparable, as long as at least 6 channels were sufficiently involved in the partial seizure. The probability of alpha activity is calculated in a similar manner. In this case, the only a posteriori probability to

Fig. 2. Example of a technical artifact caused by the malfunction of electrode T6. A sustained phase reversal is seen between channels T4-T6 and T6-O2.
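The three electrode-failure checks of Section 2.5.3, including the phase reversal illustrated in Fig. 2, can be sketched as follows. Several points are assumptions made for illustration: "mean amplitude" is taken as the mean absolute sample value, the 60 Hz amplitude is supplied by the caller rather than computed by spectral analysis, the thresholds are plain numbers in the recording's amplitude unit, and all names are hypothetical.

```python
def mean_abs(samples):
    """Mean absolute value, used here as a stand-in for 'mean amplitude'."""
    return sum(abs(s) for s in samples) / len(samples)


def phase_reversal(ch_a, ch_b):
    """True when the two channels nearly cancel, i.e. the mean amplitude of
    their sum is less than half the mean amplitude of the first channel."""
    summed = [a + b for a, b in zip(ch_a, ch_b)]
    return mean_abs(summed) < 0.5 * mean_abs(ch_a)


def reject_epoch(ch_a, ch_b, mains_amplitude, amp_limit=1000.0, mains_limit=20.0):
    """Apply the three symptoms of electrode failure to a pair of channels
    that share an electrode (lists of samples for one epoch)."""
    too_large = max(abs(s) for s in ch_a + ch_b) > amp_limit
    mains_noise = mains_amplitude > mains_limit
    return too_large or mains_noise or phase_reversal(ch_a, ch_b)


# Two channels of similar amplitude and opposite polarity, as with a loose electrode.
t4_t6 = [10.0, -10.0, 10.0, -10.0]
t6_o2 = [-9.0, 9.0, -9.0, 9.0]
print(phase_reversal(t4_t6, t6_o2))
print(reject_epoch(t4_t6, t4_t6, mains_amplitude=0.0))
```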


Table 1. Comparative results of the two methods in the training data set with ideal threshold settings

                       Onset method (after tuning)                                        | Traditional method
Patient Seizures Hours PTH  TD    %  UFD UFD rate  IFD IFD rate Tot rate Delay (s) |  TD    %   FD FD rate Delay (s)
      1        5    24   4   1   20    4     0.16    0     0.00     0.16       7.0 |   1   20    1    0.04      15.0
      2        6    30   4   1   17    8     0.27    0     0.00     0.27      10.0 |   0    0    0    0.00       n/a
      3        3    26   4   2   67    2     0.08    0     0.00     0.08      20.0 |   1   33    1    0.04      16.0
      4        4    21   4   3   75    1     0.05    0     0.00     0.05      15.3 |   3   75    0    0.00      15.0
      5        5    20   4   3   60    2     0.10   57     2.85     2.95       7.7 |   3   60   14    0.70      12.3
      6        3    20   4   3  100    1     0.05   10     0.50     0.55       9.7 |   3  100    3    0.15      24.3
      7        3    25   4   3  100    2     0.08    0     0.00     0.08       6.7 |   3  100    0    0.00      10.3
      8        3    27   5   3  100    2     0.07    0     0.00     0.07       9.7 |   3  100   14    0.51      11.0
      9        4    23   4   3   75    2     0.09    0     0.00     0.09      13.3 |   0    0    0    0.00       n/a
     10        3    20   5   3  100    4     0.20    0     0.00     0.20      29.3 |   2   67    5    0.25      35.5
     11        3    22   4   3  100    5     0.23    0     0.00     0.23       5.3 |   2   67    9    0.41      12.5
     12        3    20   4   3  100    5     0.25    1     0.05     0.30       9.7 |   3  100    1    0.05      17.0
     13        4    27   5   4  100    7     0.26    0     0.00     0.26       5.0 |   4  100    3    0.11      13.3
     14        5    24   4   2   40    7     0.29    2     0.08     0.38       5.5 |   0    0   12    0.50       n/a
     15        5    30   4   4   80    7     0.24    0     0.00     0.24       7.0 |   3   60    2    0.07      12.0
     16        4    22   4   4  100    5     0.22    0     0.00     0.22       6.0 |   3   75    6    0.27      14.3
     17        3    22   4   3  100    5     0.23    0     0.00     0.23       9.0 |   3  100   12    0.55      14.3
     18       13    10   4  11   85    0     0.00   34     3.58     3.58       5.2 |  13  100   11    1.16      10.4
     19        4    23   5   4  100    5     0.22    0     0.00     0.22       6.3 |   3   75    3    0.13      13.7
     20        4    23   5   4  100    3     0.13    0     0.00     0.13       7.3 |   3   75   22    0.96      14.3
     21        3    23   6   3  100    5     0.22    0     0.00     0.22      19.7 |   3  100    8    0.35      24.7
     22        5    24   4   2   40    4     0.17    0     0.00     0.17      31.0 |   0    0    2    0.08       n/a
     23        3    29   4   3  100    0     0.00    0     0.00     0.00       4.3 |   3  100    1    0.03      14.0
     24        5    23   4   5  100    4     0.18    0     0.00     0.18      13.4 |   2   40    0    0.00      13.0
     25        4    24   5   4  100    6     0.26    9     0.38     0.64       8.5 |   4  100   54    2.30      11.0
     26        4    24   4   4  100    3     0.13    0     0.00     0.13      24.5 |   3   75    9    0.38      11.0
     27        4    28   5   2   50    3     0.11    1     0.04     0.14      15.0 |   2   50   13    0.47      10.5
     28       11    20   4  11  100    5     0.25    0     0.00     0.25       9.2 |   5   45    4    0.20      10.4
  Total      126   652     101       107           114                             |  78       210
   Mean                          82          0.16          0.27     0.43           |       65         0.35
 Median                                                                        9.1 |                            13.5

TD, true detections; %, detection sensitivities; UFD, uninteresting false detections; IFD, interesting false detections.


consider is the one from scale 4 (6–12 Hz). Channels involving occipital and parietal electrodes are designated as ‘potential alpha channels’. The epoch alpha probability PALP_EPOCH is found by summing the 6 of the potential alpha channels with the highest probabilities.

The choice of 6 channels was made based on the spatial distribution of the partial seizures seen in the training data. Using more channels did not help to detect the partial seizures that were missed and using fewer channels reduced the separation between the probabilities of seizure and non-seizure activity.

To reduce the likelihood of false detections caused by EMG activity, the seizure probability is scaled by a factor based on the amplitudes of the waveform segments in scales 1 and 2. The scaling operation is performed as follows

$$P_{\text{SEZ\_SCALED}} = P_{\text{SEZ\_EPOCH}}\,(1 - \text{EMGAmpRatio}) \tag{4}$$

where

$$\text{EMGAmpRatio} = \frac{\text{sum of amplitudes in scales 1 and 2}}{\text{sum of amplitudes in scales 1, 2, 4 and 5}} \tag{5}$$
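Eqs. (4) and (5), together with the six-channel sum that defines PSEZ_EPOCH, can be illustrated as follows. The function names, the dictionary layout of the per-scale amplitude sums and the numbers are hypothetical; only the arithmetic comes from the text.

```python
def epoch_probability(channel_probs, n_top=6):
    """PSEZ_EPOCH: sum of the n_top largest channel probabilities."""
    return sum(sorted(channel_probs, reverse=True)[:n_top])


def emg_amp_ratio(amp_sum_by_scale):
    """Eq. (5): amplitude in scales 1 and 2 over amplitude in scales 1, 2, 4 and 5."""
    high_freq = amp_sum_by_scale[1] + amp_sum_by_scale[2]
    total = high_freq + amp_sum_by_scale[4] + amp_sum_by_scale[5]
    return high_freq / total


def scaled_probability(p_sez_epoch, amp_sum_by_scale):
    """Eq. (4): scale the epoch probability down as EMG content rises."""
    return p_sez_epoch * (1.0 - emg_amp_ratio(amp_sum_by_scale))


# Hypothetical channel probabilities for 8 channels and amplitude sums per scale.
channel_probs = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.1, 0.1]
amp_sums = {1: 1.0, 2: 1.0, 4: 3.0, 5: 3.0}

p_epoch = epoch_probability(channel_probs)
print(round(p_epoch, 3))
print(round(scaled_probability(p_epoch, amp_sums), 3))
```

With a quarter of the amplitude in the two highest-frequency scales, the epoch probability is reduced by 25%; an epoch dominated by EMG would be scaled towards zero.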

Temporal context is introduced by summing PSEZ_SCALED over 3 epochs. This is done to avoid the detection of short bursts of rhythmic activity without eliminating the possibility of detecting a high probability event within 2 s. The result is the final detection variable PSEZ. Similar temporal context was used to create PALP from PALP_EPOCH.

The two detection variables, PSEZ and PALP, are compared to thresholds PTH and ATH, respectively, to determine the detection status of each epoch. The threshold values (PTH = 4 and ATH = 2) were chosen to maximize performance in the training data considering the trade-off between sensitivity and false detections. All epochs with PSEZ < PTH are ignored. An epoch with PSEZ > PTH is assigned the status of detection unless PALP > ATH is true, in which case it is assigned the status of rejection. A seizure is considered to be detected if at least one epoch is assigned the status of detection.

The system rejects a large proportion of alpha bursts and certain seizures that resemble alpha activity can also be rejected. To remedy this, the spatial distribution of the top 6 channels chosen for PSEZ_EPOCH is considered. If all 6 are located in the same hemisphere, seizure detection is given priority over alpha detection. If at least one channel is found from each hemisphere, alpha rejection is given priority. The rationale is that alpha activity is more likely to be bilaterally distributed than are seizures.

2.7. User tuneability

Because the system is not patient-specific, certain patients have higher false detection rates than others. To provide the user with a means of reducing the high false detection rates in these patients, PTH is designed to be


tuneable between values of 4, 5, 6 and 7. This allows the user to make the choice of potentially sacrificing seizure detections (by increasing PTH) to reduce high false detection rates in certain patients. During training, this functionality was verified by increasing PTH for all recordings in patients with false detection rates above 0.3/h. The reason for this was to reproduce the motivation (many false detections) and the effect (a higher threshold for all recordings) of tuning in a clinical setting as opposed to only tuning certain recordings to produce the best overall results.
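The decision rule of Section 2.6 and the tuning of PTH can be summarized in a short sketch. Several details are assumptions not fixed by the text: the 3-epoch sum is taken over the current and two preceding epochs, an epoch with PSEZ exactly equal to PTH is treated as ignored, each top channel is labeled "L" or "R", and `tune_pth` falls back to the highest setting when no setting brings the rate below the limit. All names are hypothetical.

```python
def temporal_sum(values, window=3):
    """Sum each value with the preceding ones, up to `window` epochs in total."""
    return [sum(values[max(0, i - window + 1): i + 1]) for i in range(len(values))]


def epoch_status(p_sez, p_alp, top_channel_sides, pth=4.0, ath=2.0):
    """Classify one epoch as 'ignored', 'detection' or 'rejection'."""
    if p_sez <= pth:
        return "ignored"
    if p_alp > ath:
        # Alpha-like activity: detection keeps priority only when all top
        # channels lie in one hemisphere; bilateral activity is rejected.
        same_hemisphere = len(set(top_channel_sides)) == 1
        return "detection" if same_hemisphere else "rejection"
    return "detection"


def tune_pth(fd_rate_by_pth, limit_per_h=0.3, settings=(4, 5, 6, 7)):
    """Lowest PTH setting whose false detection rate is below the limit."""
    for pth in settings:
        if fd_rate_by_pth[pth] < limit_per_h:
            return pth
    return settings[-1]


print(temporal_sum([1, 2, 3, 4]))
print(epoch_status(5.0, 3.0, ["L"] * 6))
print(epoch_status(5.0, 3.0, ["L", "R", "L", "L", "L", "L"]))
print(tune_pth({4: 0.83, 5: 0.40, 6: 0.26, 7: 0.10}))
```

Raising `pth` in `epoch_status` is the only change tuning requires, which reflects the design goal that the internal components of the system stay untouched.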

3. Results

The results from both the training and the testing of the method are presented in this section. Training results were derived recursively as the method was being designed. Testing results were derived after a single execution of the method using data that were not used in the training, as described above.

Global statistics are presented as averages of statistics from each patient to avoid any biasing by individual patients having many seizures. These are average sensitivity, average false detection rate and median detection delay. Detections separated by less than 30 s were grouped and counted as a single detection. Detection delay was defined as the time elapsed between the onset of a seizure and the end of the first detected epoch. As mentioned above, seizure onsets were defined using electrographic criteria.

False detections were defined as any non-seizure events that were detected by the system. They were labeled interesting and uninteresting. Uninteresting false detections included alpha activity, EMG, eye movement, etc. Interesting false detections were epileptic events, such as spike and wave complexes or other events related to a patient’s epileptic disorder that might be of interest to a physician. Although at times these detections may be useful, they were categorized as false because the system was not specifically designed to recognize them and there was no sensitivity measure associated with their detection.

The operation of the tuning mechanism was applied in the training and the testing. The guidelines described above were followed to emulate a realistic scenario in a clinical setting. Tables 1 and 2 show the individual patient results of the complete training and testing data sets, using the ideal settings for PTH for each patient (this is the lowest PTH setting for which the individual false detection rate was below 0.3/h).
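The scoring conventions just described can be made concrete with a small sketch. The grouping rule chains detections whose successive gaps are under 30 s, which is one reading of the text; the function names are hypothetical. The sensitivity example uses the true-detection and seizure counts of the first four testing patients from Table 2 to show why per-patient averaging differs from pooling.

```python
from statistics import mean


def group_detections(times_s, max_gap_s=30.0):
    """Merge detections separated by less than max_gap_s into single events."""
    groups = []
    for t in sorted(times_s):
        if groups and t - groups[-1][-1] < max_gap_s:
            groups[-1].append(t)
        else:
            groups.append([t])
    return groups


def detection_delay(onset_s, detected_epoch_ends_s):
    """Time from seizure onset to the end of the first detected epoch, or None."""
    later = [t for t in detected_epoch_ends_s if t >= onset_s]
    return min(later) - onset_s if later else None


def average_sensitivity(results):
    """Mean of per-patient sensitivities (%), given (true detections, seizures)."""
    return mean(100.0 * td / n for td, n in results)


print(len(group_detections([10.0, 12.0, 35.0, 100.0])))
print(detection_delay(100.0, [90.0, 108.0, 110.0]))

# (true detections, seizures) for testing patients 1 to 4 in Table 2.
first_four = [(3, 3), (2, 2), (0, 5), (1, 6)]
pooled = 100.0 * sum(td for td, _ in first_four) / sum(n for _, n in first_four)
print(round(average_sensitivity(first_four), 1), pooled)
```

Per-patient averaging keeps a patient with many missed seizures from dominating the global figure, which is the bias the text aims to avoid.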
Tables 3 and 4 show the initial results for the patients in the training and testing sets, for whom tuning was required (i.e. patients for whom the initial false detection rate with PTH = 4 was greater than 0.3/h). A summary of these results is presented in Table 5. In the training data, the initial average sensitivity with PTH = 4 was 84.2% with an uninteresting false detection rate of 0.27/h, an interesting false detection rate of 0.28/h


Table 2. Comparative results of the two methods in the testing data set with ideal threshold settings

                       Onset method (after tuning)                                        | Traditional method
Patient Seizures Hours PTH  TD    %  UFD UFD rate  IFD IFD rate Tot rate Delay (s) |  TD    %   FD FD rate Delay (s)
      1        3    25   4   3  100    4     0.16    0     0.00     0.16      11.3 |   3  100   16    0.64      15.0
      2        2    22   6   2  100    2     0.09    7     0.33     0.42       6.5 |   0    0   18    0.84       n/a
      3        5    19   4   0    0    2     0.10    0     0.00     0.10       n/a |   0    0    8    0.41       n/a
      4        6    29   4   1   17    8     0.28   62     2.14     2.41       7.0 |   1   17   31    1.07      24.0
      5        4    27   4   4  100    6     0.23    0     0.00     0.23       9.8 |   1   25    0    0.00      19.0
      6        8    22   7   2   25    2     0.09    0     0.00     0.09      35.0 |   2   25    1    0.04      16.0
      7        2    16   4   2  100    3     0.19    1     0.06     0.26      31.5 |   2  100   11    0.71      24.5
      8        3    24   4   3  100    4     0.17    0     0.00     0.17      10.0 |   0    0    6    0.26       n/a
      9        2    17   7   2  100    4     0.24    4     0.24     0.47      20.0 |   0    0   23    1.35       n/a
     10       17    18   6   4   24    4     0.22    0     0.00     0.22      18.0 |   0    0   53    2.94       n/a
     11        2    19   6   2  100    3     0.16    0     0.00     0.16      44.0 |   1   50   41    2.16      11.0
     12        2    20   6   2  100    2     0.10    0     0.00     0.10       6.0 |   1   50    4    0.21      10.0
     13        3    25   6   3  100    1     0.04    0     0.00     0.04       7.3 |   3  100   19    0.76      17.3
     14        4    28   5   4  100    2     0.07    0     0.00     0.07       7.0 |   4  100    5    0.18        13
     15        2    18   4   2  100    5     0.29    0     0.00     0.29       2.0 |   2  100   14    0.80      14.5
     16        4    34   5   2   50   10     0.30    0     0.00     0.30      13.5 |   1   25    4    0.12      18.0
  Total       69   360      38        62            74                             |  21       254
   Mean                          76          0.17          0.17     0.34           |       43         0.78
 Median                                                                       10.0 |                            16.0


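Table 3
Initial results before tuning in patients from the training set with uninteresting false detection rates above 0.3/h

| Patient | PTH | TD | % | UFD | UFD rate | IFD | IFD rate | Tot rate | Delay (s) |
|---|---|---|---|---|---|---|---|---|---|
| 8 | 4 | 3 | 100 | 12 | 0.44 | 0 | 0.00 | 0.44 | 9.7 |
| 10 | 4 | 3 | 100 | 10 | 0.49 | 0 | 0.00 | 0.49 | 24.7 |
| 13 | 4 | 4 | 100 | 22 | 0.83 | 0 | 0.00 | 0.83 | 4.5 |
| 19 | 4 | 4 | 100 | 16 | 0.69 | 0 | 0.00 | 0.69 | 6.3 |
| 20 | 4 | 4 | 100 | 8 | 0.35 | 0 | 0.00 | 0.35 | 6.8 |
| 21 | 4 | 3 | 100 | 21 | 0.92 | 1 | 0.04 | 0.97 | 17.0 |
| 25 | 4 | 4 | 100 | 11 | 0.47 | 14 | 0.60 | 1.06 | 6.5 |
| 27 | 4 | 4 | 100 | 12 | 0.43 | 1 | 0.04 | 0.47 | 15.8 |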

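Table 4
Initial results before tuning in patients from the testing set with uninteresting false detection rates above 0.3/h

| Patient | PTH | TD | % | UFD | UFD rate | IFD | IFD rate | Tot rate | Delay (s) |
|---|---|---|---|---|---|---|---|---|---|
| 2 | 4 | 2 | 100 | 26 | 1.21 | 31 | 1.44 | 2.65 | 5.5 |
| 6 | 4 | 2 | 25 | 17 | 0.76 | 0 | 0.00 | 0.76 | 15.0 |
| 9 | 4 | 2 | 100 | 28 | 1.65 | 21 | 1.24 | 2.89 | 15.0 |
| 10 | 4 | 5 | 29 | 9 | 0.50 | 0 | 0.00 | 0.50 | 17.0 |
| 11 | 4 | 2 | 100 | 9 | 0.47 | 0 | 0.00 | 0.47 | 40.0 |
| 12 | 4 | 2 | 100 | 21 | 1.08 | 0 | 0.00 | 1.08 | 4.0 |
| 13 | 4 | 3 | 100 | 19 | 0.76 | 0 | 0.00 | 0.76 | 6.0 |
| 14 | 4 | 4 | 100 | 9 | 0.33 | 0 | 0.00 | 0.33 | 5.0 |
| 16 | 4 | 3 | 75 | 22 | 0.65 | 0 | 0.00 | 0.65 | 9.0 |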

(combined rate of 0.55/h) and a median detection delay of 9.0 s. The detection threshold PTH was increased to 5 for 7 patients and to 6 for one patient such that the individual uninteresting false detection rates in these patients were lowered to below 0.3/h. This had the effect of lowering the overall average sensitivity to 82.4% and increasing the median delay time very slightly to 9.1 s. The average uninteresting false detection rate was lowered to 0.16/h for a combined rate of 0.43/h.

In the testing data, the initial average sensitivity with PTH = 4 was 77.9% with an uninteresting false detection rate of 0.55/h, an interesting false detection rate of 0.31/h (combined rate of 0.86/h) and a median detection delay of 9.8 s. The detection threshold PTH was increased to 5 for two patients, to 6 for 5 patients and to 7 for two patients such that the individual uninteresting false detection rates in these patients were lowered to below 0.3/h. This had the effect of lowering the overall average sensitivity to 76.0% and increasing the median delay time to 10.0 s. The average uninteresting false detection rate was lowered to 0.17/h for a combined rate of 0.34/h.
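The threshold selection described here (the lowest PTH setting for which a patient's uninteresting false detection rate falls below 0.3/h) might be sketched as follows. This is an illustrative reading only: `tune_threshold` and its arguments are hypothetical names, the selection is done retrospectively from counts at each setting rather than during monitoring, and integer threshold steps are assumed. The example counts for testing patient 16 are the uninteresting false detections listed in Tables 4 and 2 (22 at PTH = 4 and 10 at PTH = 5, over 34 h).

```python
def tune_threshold(ufd_counts_by_pth, hours, max_rate=0.3, start=4):
    """Return the lowest threshold >= start whose uninteresting false
    detection rate (count / hours) is below max_rate, or None if none is.

    ufd_counts_by_pth maps a candidate PTH setting to the number of
    uninteresting false detections observed with that setting.
    """
    for pth in sorted(p for p in ufd_counts_by_pth if p >= start):
        if ufd_counts_by_pth[pth] / hours < max_rate:
            return pth
    return None


# Testing patient 16: 22/34 = 0.65/h at PTH = 4, 10/34 = 0.29/h at PTH = 5.
pth_patient_16 = tune_threshold({4: 22, 5: 10}, hours=34)

# Testing patient 1: 4/25 = 0.16/h at PTH = 4, so no increase is needed.
pth_patient_1 = tune_threshold({4: 4}, hours=25)
```

Under these assumptions the sketch selects PTH = 5 for patient 16 and leaves patient 1 at PTH = 4, which agrees with the settings reported in Table 2.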

The traditional method of Gotman (1982, 1990) was also evaluated using both data sets and the overall performance was an average sensitivity of 50.1%, a combined false detection rate of 0.5/h and a median detection delay time of 14.3 s (combined results from Tables 1 and 2).

The majority of seizures that were missed with PTH = 4 in the training data were short seizures with subtle activity. Initially, 23 seizures were missed, 9 of which came from two patients. Of these 9, 4 were from patient 1, occurring in very few channels and lasting less than 5 s (Fig. 3); 5 were from patient 2, characterized by low amplitude activity occurring in few channels and lasting less than 8 s. Ten other missed seizures, from 6 patients, were all characterized by low amplitude, subtle changes. Four of these were less than 10 s in duration (Fig. 4). Two other missed seizures had clear rhythmic activity, but were not detected as the associated PSEZ values were slightly under 4 and the final two missed seizures were rejected based on the spatial distribution criterion (i.e. PSEZ exceeded PTH, while at the same time PALP exceeded ATH and the activity was bilaterally distributed). Eighteen of 28 patients in

Table 5
Summary of training and testing results using ideal threshold settings

| Data set | Average sensitivity (%) | Patients with 100% sensitivity | Average uninteresting FD rate (/h) | Average interesting FD rate (/h) | % of FD labeled interesting | Median delay (s) |
|---|---|---|---|---|---|---|
| Training | 82.4 | 17/28 | 0.16 | 0.27 | 51.6 | 9.1 |
| Testing | 76.0 | 11/16 | 0.17 | 0.17 | 54.4 | 10.0 |


Fig. 3. Missed seizure from patient 1 of the training data. Seizure activity is clear and rhythmic, but the duration of the seizure is less than 5 s and it only appears in a few channels.

the training set exhibited 100% detection sensitivity with PTH = 4.

In the testing data, 29 seizures were missed with PTH = 4, 28 of which came from 4 patients. Of these 28, 12 were from patient 10 and were characterized by intermittent activity of mixed frequencies (Fig. 5). Five were from patient 3, all of which were focal temporal seizures with significant EMG activity in the background. This patient was the only one in the combined data set for whom no seizures were detected. Five more of the 28 were from patient 4, 3 of which were characterized by bursts of spike and wave activity, which did not exhibit sustained rhythmicity and another was subtle with low amplitude changes. The final 6 were from patient 6, all characterized by clear, sustained rhythmic activity, but only occurring in two or three channels (Fig. 6). The only other missed seizure, from patient 16, exhibited very subtle changes in few channels. The remaining 11 testing patients exhibited 100% detection sensitivity with PTH = 4.

Some examples of detected seizures are shown in Figs. 7–10. The seizures shown in Figs. 7 and 8 are clear generalized seizures with sustained, large amplitude, rhythmic activity, which are in general fairly easy to detect. The system did not miss any of these types of seizures in either data set. The seizures shown in Figs. 9 and 10 are examples of detected seizures with subtle changes in few channels. Seizures like these pose a more difficult challenge as discussed above. The system detected 15 out of 47 of these types of seizures in 9 patients in the combined data set (average sensitivity of 32.6%).

When increasing PTH due to uninteresting false detection rates above 0.3/h, only 4 seizures were missed, two in the training set and two in the testing set. The two in the training

set were from the same patient, whose recordings were plagued by an unusual technical artifact responsible for the high false detection rate, likely caused by a neighboring piece of equipment acting as a source of electromagnetic interference. One of the seizures missed in the testing set came from a patient whose recording included many artifacts caused by chewing, while the other came from a different patient whose recording included many bursts of rapid eye blinks. Interesting false detections were caused mostly by epileptic events such as bursts of spike and wave or sharp wave activity (Fig. 11). These were the most frequent false detections in 3 of the 28 training patients (100, 97 and 56% of the patients’ individual false detections) and two of 9 testing patients (54 and 89%). One other patient in the testing set had a significant proportion of false detections (43%) that were interesting. These were due to short bursts of high-amplitude slow waves. The interesting false detections in these 6 patients accounted for 88% of their total false detections. Interesting false detections in all patients accounted for 52.7% of the total number of all false detections. In the patients with uninteresting false detection rates above 0.3/h, detections were caused mostly by short bursts of rhythmic activity (Fig. 12), rapid eye blinking (Fig. 13) and chewing (Fig. 14). These were the most frequent uninteresting false detections in 6 of 7 training patients and 8 of 9 testing patients whose detection threshold PTH was increased. The most frequent uninteresting false detections occurring in the remaining two patients (one from each data set) were due to a partial disconnection of the electrode set from the amplifier. This type of disconnection did not fully disable signal capture and what was recorded was low

Fig. 4. Missed seizure from patient 14 of the training data. A subtle change occurs in few channels (mostly central channels as well as in P4-O2) and the duration of the seizure is less than 8 s.


Fig. 5. Missed seizure from patient 10 of the testing data. Activity is intermittent and of mixed frequencies.

amplitude, full-spectrum noise with intermittent bursts of higher amplitude noise most likely due to movement of the amplifier connector. In the patient that was excluded from the overall results for reasons outlined above, the system detected 5 of 7 seizures with a combined false detection rate of 0.4/h.

4. Discussion

The first method of seizure detection to be widely used in a clinical setting was that of Gotman (1982), later modified and subject to an extensive evaluation (Gotman, 1990). It was also subject to independent evaluations by Pauri et al. (1992) and Salinsky (1997). The combined results of these evaluations were a sensitivity of 70–80% and a false detection rate of 1–3/h. The above evaluations were performed exclusively on scalp recordings except for that of Gotman (1990), which included 5 patients with intracranial electrodes.

Other methods have been designed for scalp recordings using artificial neural networks to differentiate between seizure and non-seizure activity. The most notable of these is the system of Gabor et al. (1996). An extensive evaluation was performed by Gabor’s group using 4500 h of scalp EEG recorded from 65 patients (Gabor, 1998). They reported a sensitivity of 92.8% with an average false detection rate of 1.35/h. Only seizures occurring in the frontal and temporal lobes accompanied by clinical manifestations were included in the study.

Other methods have been designed specifically for use with intracranial recordings. The method of Harding (1993)

was designed to detect both the spiking phase and the suppression of seizures in depth EEG. Testing of 40 patients having a total of 416 seizures over 1578 h produced a sensitivity of 96.9% to clinical seizures and 92.6% to purely electrographic seizures with an overall false detection rate of 1.94/h. Only temporal lobe seizures were considered and parameters were adjusted after the first seizure was detected. Osorio et al. (1998) tested their system using 125 10 min sections of depth EEG containing only mesial temporal seizures as well as 205 10 min sections of non-seizure depth EEG. Testing on the same short seizure sections used to design the method produced, not surprisingly, a sensitivity of 100% and no false detections. The method was reassessed in a later study by Osorio et al. (2002) in which longer sections of new depth EEG data were analyzed and results were also quite strong. It is difficult to judge the potential clinical use of the method because seizures were limited to mesial temporal and prefrontal regions, and poor quality EEG was excluded. The wavelet decomposition and feature extraction used in our method was taken directly from the system of Khan and Gotman (2003). In their method, false detections are avoided by rejecting epochs with a mean frequency which is significantly similar to that of a large proportion of background EEG. Evaluation using 66 seizures from 11 patients during 229 h of depth EEG produced a sensitivity of 85.6% and a false detection rate of 0.3/h. The monitoring of background rhythmic activity works well in limiting false detections, but the system would be difficult to use in real time because of the requirement of a large amount of background to be analyzed prior to operation. Also, this

Fig. 6. Missed seizure from patient 6 of the testing data. Activity is clear, sustained and rhythmic, but only appears in 3 channels (the 3 topmost channels shown).


Fig. 7. Detected seizure from patient 14 of the testing data. Very rhythmic, high-amplitude generalized activity. The 6 channels involved in the detection are marked by bullets and the first detection is marked by the arrow.

system does not deal with the many causes of false detection specific to scalp recordings.

The detection system of Qu and Gotman (1995) was designed to ‘provide a warning early in the development of a seizure’ to ‘improve the close observation of seizures and interaction between observers and patients early in the seizure’ (Qu and Gotman, 1995). The system was tested using both scalp and intracranial recordings and performed well enough to detect 100% of seizures with a mean detection delay of 9.6 s and a false detection rate of 0.21/h. The system, however, was not recommended by the authors as a stand-alone clinical detection system because it only detected seizures for which an example, or template, had been provided by an existing clinical detection system or by an electroencephalographer. It also required samples from a broad variety of the patient’s EEG background patterns.

The motivation behind our proposed detection system was to provide the same benefit of early detection without the use of patient-specific methodology. It was designed to run on-line and to provide the user with a probability of seizure rather than employ an all-or-nothing approach to detection as do most existing methods.

4.1. Seizures

The versatility of our system was shown by the fact that at least one seizure was detected in 44 of the 45 patients in

the combined data set (including the special case). This success was likely due to the use of the high 6 channels in the calculation of PSEZ, which was introduced so that seizures occurring in few channels would be given a similar chance to be detected as seizures involving many channels. This result also suggests that the features chosen are characterizing seizure activity adequately, although, perhaps, some work remains in this area to detect more of the challenging seizures. The separation between seizure and non-seizure epochs seen in the feature data distributions was deemed to be sufficient to create effective a priori probabilities. The large majority of missed seizures were characterized by subtle or focal activity, mixed frequencies, short duration or some combination of these traits. Short seizures with subtle activity are the challenge of any automatic seizure detection system and it is not surprising that this type of seizure was missed, although approximately a third of these were successfully detected. For the system to function adequately in a clinical setting, it was crucial that seizures used in both training and testing were not selected based on their characteristics. Had the data sets been restricted only to clear generalized or focal seizures (with long duration and very rhythmic, large-amplitude changes), for example, we would have been able to report near-perfect sensitivity (only two such seizures were missed in the training set and one in the testing set). Consequently, we feel that the system was successful enough in detecting seizures to be used in

Fig. 8. Detected seizure from patient 15 of the testing data. Sustained, high-amplitude activity occurring in all channels. The 6 channels involved in the detection are marked by bullets and the first detection is marked by the arrow.


Fig. 9. Detected seizure from patient 1 of the training data. Clear change of mixed frequencies occurring in few channels and lasting less than 8 s. The 6 channels involved in the detection are marked by bullets and the first detection is marked by the arrow.

a clinical setting. An important future step prior to clinical implementation would be to test the system using data from several sources.

4.2. False detections

Approximately half of all uninteresting false detections were due to EMG artifact caused by chewing and to rapid eye blinking. Although the system is designed to avoid detecting EMG activity (by scaling the detection variable by a factor inversely related to the presence of EMG, as in Eqs. (4) and (5)) and did so quite successfully, chewing is characterized by intermittent EMG activity and it is likely not enough to lower the detection variable below threshold. Further work has to be done to attempt to reject chewing artifacts, perhaps by rejecting epochs with a large variation in EMG activity. Detections due to eye blinking can perhaps be avoided using methods such as independent component analysis to remove the eye blink artifact from the EEG (Iriarte et al., 2003; Joyce et al., 2004). Manually, this is a fairly easy task; however, to design a system that will automatically remove a certain component, when it is unclear a priori which component that might be, is a more difficult challenge.

Short bursts of rhythmic activity caused the other half of uninteresting false detections (theta activity and unusual bursts of mixed frequencies) as well as the great majority of interesting ones (spike and wave complexes and sharp waves). This accounted for a large proportion of overall false detections. The original automatic detection method of

Gotman (1982) had the same weakness. It was improved upon by increasing the use of temporal context by considering information received after the current epoch. If the current activity was not sustained to a certain degree in an 8 s epoch directly following the current epoch, a detection was not made (Gotman, 1990). This was based on the fact that most seizure activity does not disappear so quickly while the opposite is true for bursts of rhythmic non-seizure activity. In the case of the onset detector, these false detections are the price to pay for early detection. As mentioned above, increasing the temporal context would contradict the inherent design of the system, which attempts to make detections with minimal delay. The solution was to provide the tuning mechanism for patients with elevated false detection rates. This is further discussed below.

4.3. Detection delay

Early detection was possible due in part to the limited consideration of temporal context, but more significantly to the nature of the probability-based system. Creating the detection variable by simply summing the epoch seizure probabilities over 3 epochs allowed a detection to potentially take place within the first 2 s of seizure activity, if that activity were associated with a high enough probability of seizure. This would be impossible in a method that employed temporal context with conditional detection criteria (e.g. if a certain criterion is met in the current epoch, a second is verified; if both are met, detection occurs only if a third criterion in the following epoch is met, etc.).
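The effect of summing over 3 epochs can be illustrated with a short sketch. It is a simplified picture under stated assumptions: `first_detection` is a hypothetical helper, the per-epoch values are invented for illustration, and the EMG scaling and spatial distribution criterion of the actual system are left out.

```python
def first_detection(epoch_values, pth=4.0, window=3):
    """Index of the first epoch at which the sum of the last `window`
    per-epoch seizure values exceeds pth, or None if it never does."""
    for i in range(len(epoch_values)):
        start = max(0, i - window + 1)
        if sum(epoch_values[start:i + 1]) > pth:
            return i
    return None


# Hypothetical per-epoch values; the seizure begins at index 2 in both cases.
strong_onset = [0.1, 0.2, 4.5, 4.8]      # one seizure epoch is already enough
weak_onset = [0.1, 0.2, 1.5, 1.5, 1.5]   # evidence must accumulate over 3 epochs

strong_index = first_detection(strong_onset)
weak_index = first_detection(weak_onset)
```

With these invented values the strong onset is flagged in the first seizure epoch (index 2), while the weak onset is flagged only once three seizure epochs have accumulated (index 4). A chain of conditional criteria spread over successive epochs could not respond within a single epoch in the first case.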

Fig. 10. Detected seizure from patient 16 of the testing data. Subtle, not very rhythmic change occurring in few channels. The 6 channels involved in the detection are marked by bullets and the first detection is marked by the arrow.


Fig. 11. Interesting false detection from patient 18 in the training data caused by a short burst of polyspike activity. The 6 channels involved in the detection are marked by bullets and the first detection is marked by the arrow.

The results show that the method was successful in achieving an acceptable detection delay, with more than half the seizure detections occurring within 10 s of onset.

4.4. User tuneability

The many types of seizures in both data sets exhibited a wider range of seizure probabilities than did the majority of false detections. For example, focal seizures with subtle changes had lower PSEZ values than highly rhythmic, generalized seizures. These subtle focal seizures in general were associated with PSEZ values similar to those of most false detections. For this reason, if a global PTH value had been used, the system would either have missed all subtle seizures or included an excessive number of false detections. The tuning mechanism of the system allowed some seizures with low PSEZ values to be detected without introducing all false detections with similar probabilities.

After tuning the system according to the mandatory guidelines, the overall false detection rate was significantly decreased in both data sets. Also, the median detection delay was seen to be insensitive to changes in PTH, increasing only a tenth of a second in the training data and less than half of a second in the testing data, from lowest setting to highest. This shows that the tuning mechanism successfully offsets the cost of early detection (i.e. increased false detection rate) without significantly sacrificing detection delay.

The criterion for increasing PTH was based on the rate of uninteresting false detections. This was firstly because patients with elevated rates of interesting false detections

were few in both data sets (4 in the training set and 3 in the testing set accounting for 98% of all interesting false detections) and secondly because we wanted the system to alert medical personnel to these events since they are likely related to the patient’s epileptic disorder. In a clinical situation in which it was no longer desirable to be signalled for short epileptic non-seizure events, the tuning could still be applied and the system would most likely perform similarly to what we have shown here. We believe this to be a fair assumption because in general the PSEZ values associated with these false detections were similar to those of uninteresting false detections.

It is important to note that the benefits of tuning would only be appreciated in a long-term monitoring scenario in which patients often stay under observation for several days. At least one day would be required to judge whether the false detection rate was too elevated and to adjust the PTH level accordingly. This initial period would not serve only to tune the system, but rather the tuning would occur simultaneously with monitoring to achieve the best possible performance for the remainder of the patient’s stay.

4.5. Electrode artifact rejection

The rejection of epochs containing any of the 3 electrode artifacts outlined above was very successful. No false detections occurred due to these types of technical artifacts, even though several recordings contained a significant amount of these phenomena. Meanwhile, no seizures were missed due to the rejection of these epochs.

Fig. 12. Uninteresting false detection from patient 6 in the testing data caused by a short burst of rhythmic activity of unknown origin. The 6 channels involved in the detection are marked by bullets and the first detection is marked by the arrow.


Fig. 13. Uninteresting false detection from patient 10 in the testing data caused by rapid eye blinking. The 6 channels involved in the detection are marked by bullets and the first detection is marked by the arrow.

4.6. Performance

Running on an AMD Athlon 800 MHz processor (approximately equivalent to an Intel Pentium III of the same speed), the algorithm processed 1 h of 32-channel data in 1 min, 10 s. This performance confirms that the algorithm can successfully be used on-line even with recordings of 128 channels or more.

4.7. Seizure probability

The use of Bayesian formulation was considered a success. Since, to the best of our knowledge, this is the first seizure detection algorithm that employs Bayesian formulation, it can be viewed as a feasibility study for which the results were strong enough to warrant potential clinical application.

Although the system is based on the probabilities of seizure and non-seizure activity, the detection variable PSEZ is itself not a probability. Future improvements in the method could include an attempt to report on the actual probability of seizure rather than creating a detection variable by summing probabilities. Instead of summing the individual scale probabilities to form PSEZ_CHANNEL, perhaps this variable can be directly assigned the maximum scale probability. Also, instead of summing the PSEZ_CHANNEL values, PSEZ_EPOCH can be redefined as the probability that a certain number of

channels in a seizure epoch, e.g. 6, have probabilities above a certain value. This could be estimated the same way the a priori probabilities were estimated to arrive at PSEZ_i for each scale and would lead to the derivation of an actual probability of seizure for each epoch of EEG. Incorporating temporal context would be a separate challenge.

Furthermore, if term 3 of Bayes’ formula (Eq. (1)) is indeed meant to describe the probability of a seizure occurring during monitoring, then it is safe to assume that our assessment of this parameter is inadequate, and more importantly, that sensitivity and specificity are affected by the frequency of occurrence of seizures. Realistically, describing this probability would require a complex study on its own, the parameters of which would be very difficult to assess. The question ‘What is the probability of a seizure occurring?’ can be interpreted in many ways. In this case, this term acts simply as a scaling factor derived inherently from the number of seizure epochs in the training data and it is held constant in the final application. Thresholds for seizure and alpha were chosen according to the difference between final Bayesian probability values that we found during seizure and non-seizure; therefore, changes in term 3 would bring directly proportional changes in PTH and ATH. Essentially, by holding term 3 fixed (and thus the thresholds), we ensure that the probabilities affecting the outcome are the conditional probabilities in term 2. We believe we have derived these in a general way

Fig. 14. Uninteresting false detection from patient 2 in the training data caused by an EMG artifact during chewing. The 6 channels involved in the detection are marked by bullets and the first detection is marked by the arrow.


that is representative of typical seizure activity and therefore believe the thresholds to be mostly independent of outside factors. Put differently, while the analysis of more training data might significantly change term 3, we believe term 2 would remain relatively constant. The testing results confirm that term 2 has been fairly well estimated by the training data. As mentioned above, further testing using data from different centers would solidify this claim.

4.8. Other recommendations

The performance of our onset detection method was compared to the traditional method of Gotman (1982, 1990), because the latter is currently used clinically and this was our aim as well. By comparison, our system performed significantly better than the traditional method in every area. The results of the traditional method using the training data were similar to the previous evaluations by Gotman (1990), Pauri et al. (1992) and Salinsky (1997), with a sensitivity of 64.9%; however, the performance on the testing data was far lower than expected (43.2%). This indicates that the testing data set may have included a high proportion of seizures that were particularly difficult to detect and may not be as representative as we would have hoped. If this is indeed the case, then we could expect a better performance of this method with a more representative testing data set.

As mentioned above, the system was designed and tested using scalp recordings. In the future, an onset detection system based on a similar approach could be designed specifically for use with intracranial recordings.

Acknowledgements

This work was supported by grant MT-10189 of the Canadian Institutes of Health Research. We are grateful to Ms Lorraine Allard and Ms Nicole Drouin from the Montreal Neurological Hospital for their help in collecting EEGs.

References

Burrus CS, Gopinath RA, Guo H. Introduction to wavelets and wavelet transforms: a primer. Englewood Cliffs, NJ: Prentice-Hall; 1998.

Daubechies I. Ten lectures on wavelets. Montpelier, VT: Capital City Press; 1992.

Gabor AJ. Seizure detection using a self-organizing neural network: validation and comparison with other detection strategies. Electroencephalogr Clin Neurophysiol 1998;107:27–32.

Gabor AJ, Leach RR, Dowla FU. Automatic seizure detection using a self-organizing neural network. Electroencephalogr Clin Neurophysiol 1996;99:257–66.

Gotman J. Automatic recognition of epileptic seizures in the EEG. Electroencephalogr Clin Neurophysiol 1982;54:530–40.

Gotman J. Automatic seizure detection: improvements and evaluation. Electroencephalogr Clin Neurophysiol 1990;76:317–24.

Gotman J, Gloor P. Automatic recognition and quantification of interictal epileptic activity in the human scalp EEG. Electroencephalogr Clin Neurophysiol 1976;49:513–29.

Gotman J, Flanagan D, Zhang J, Rosenblatt B. Evaluation of an automatic seizure detection method for the newborn EEG. Electroencephalogr Clin Neurophysiol 1997;103:363–9.

Harding GW. An automated seizure monitoring system for patients with indwelling recording electrodes. Electroencephalogr Clin Neurophysiol 1993;86:428–37.

Iriarte J, Urrestarazu E, Valencia M, Alegre M, Malanda A, Viteri C, Artieda J. Independent component analysis as a tool to eliminate artefacts in EEG: a quantitative study. J Clin Neurophysiol 2003;20(4):249–57.

Ives JR, Woods JF. A study of 100 patients with focal epilepsy using a 4-channel ambulatory cassette recorder. In: Scott FD, Raftery EB, Goulding L, editors. Proceedings of the III international symposium on ambulatory monitoring. London: Academic Press; 1980. p. 383–92.

Joyce CA, Gorodnitsky IF, Kutas M. Automatic removal of eye movement artefacts from EEG data using blind source component separation. Psychophysiology 2004;41(2):313–25.

Khan YU, Gotman J. Wavelet-based automatic seizure detection in intracerebral electroencephalogram. Clin Neurophysiol 2003;114(5):898–908.

O’Donnell RD, Berkhout J, Adey WR. Contamination of scalp EEG spectrum during contraction of cranio-facial muscles. Electroencephalogr Clin Neurophysiol 1974;37(2):145–51.

Osorio I, Frei MG, Wilkinson SB. Real-time automated detection and quantitative analysis of seizures and short-term prediction of clinical onset. Epilepsia 1998;39:615–27.

Osorio I, Frei MG, Giftakis J, Peters T, Ingram J, Turnbull M, Herzog M, Rise MT, Schaffner S, Wennberg RA, Walczak TS, Risinger MW, Ajmone-Marsan C. Performance reassessment of a real-time seizure-detection algorithm on long ECoG series. Epilepsia 2002;43(12):1522–35.

Pauri F, Pierelli F, Chatrian GE, Erdly WW. Long term EEG-video-audio monitoring: computer detection of focal EEG seizure patterns. Electroencephalogr Clin Neurophysiol 1992;82:1–9.

Qu H, Gotman J. A seizure warning system for long-term epilepsy monitoring. Neurology 1995;45(12):2250–4.

Salinsky MS. A practical analysis of computer based seizure detection during continuous video-EEG monitoring. Electroencephalogr Clin Neurophysiol 1997;103:445–9.