Computer Methods and Programs in Biomedicine (2004) 76, 261—263
SignalML from an EDF+ perspective

Bob Kemp a,b,*

a Department Neurology, Leiden University Medical Centre, Leiden, The Netherlands
b Sleep Centre, MCH-Westeinde Hospital, Den Haag, The Netherlands

Received 26 May 2004; accepted 28 May 2004

KEYWORDS EDF; EDF+; SignalML; Data format
Summary Both SignalML and EDF+ offer a solution for the incompatibility between different data storage formats in biomedicine. This article discusses the SignalML approach from an EDF+ point of view. © 2004 Elsevier Ireland Ltd. All rights reserved.
1. Introduction

Elsewhere in this issue, SignalML is proposed as a solution for the incompatibility between different data storage formats in biomedicine. Another solution is the European Data Format (EDF, and its recent extension, EDF+), to which the authors refer extensively. Here, I will briefly summarize the main characteristics of EDF and EDF+ and discuss SignalML from an EDF+ perspective.
* Tel.: +31 71 5262188/3302205; fax: +31 70 3882636. E-mail address: [email protected] (B. Kemp).
0169-2607/$ - see front matter. doi:10.1016/j.cmpb.2004.05.008

2. The European Data Format (EDF and EDF+)

EDF is a simple, freely available standard format [1] for storage and exchange of biological signals (time series). An EDF file first identifies the patient and specifies the time period and some general characteristics of the recording. Then it specifies the technical characteristics of each signal, such as calibration and sampling frequency. Finally, the actual signals follow.

A simple extension, EDF+, was added recently [2] in order to accommodate annotations, interrupted recordings, standard electrode names and more. EDF+ defines the EDF fields more strictly and specifies the technical characteristics of one of the signals in a special ‘EDF Annotations’ way.

About 50 companies currently support EDF. Many more researchers and programmers do so, and several have made their EDF viewers, analysers, tools and files freely available. Some of this software is open source. All existing EDF software correctly handles all signals in uninterrupted EDF+ files. EDF+ files, as well as independent programs that accommodate both EDF and EDF+, are already available and more are on their way. Details are in the publications mentioned above and on http://www.hsr.nl/edf.

In the typical situation, commercial equipment records the data into EDF files. Medical staff then use any EDF-compatible software (commercial or freely available) to view and analyse that data. It is expected that most EDF applications will gradually begin to support EDF+ as well. EDF and EDF+ can live together because they are largely compatible with each other.
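The three-part layout described in this section (recording identification, then per-signal technical characteristics, then the signals) can be illustrated with a minimal Python sketch of a header reader. The field names and byte widths below are my reading of the published EDF specification [1], not something stated in this article, and the function names (`parse_edf_header`, `build_demo_header`) and all demo values are hypothetical. Treat it as an illustration under those assumptions rather than a reference implementation.

```python
# Fixed-width ASCII fields of the 256-byte main header (widths assumed from [1]).
MAIN_FIELDS = [
    ("version", 8), ("patient", 80), ("recording", 80),
    ("startdate", 8), ("starttime", 8), ("header_bytes", 8),
    ("reserved", 44), ("n_records", 8), ("record_duration", 8),
    ("n_signals", 4),
]
# Per-signal fields; each field is stored for all signals before the next field.
SIGNAL_FIELDS = [
    ("label", 16), ("transducer", 80), ("dimension", 8),
    ("phys_min", 8), ("phys_max", 8), ("dig_min", 8), ("dig_max", 8),
    ("prefilter", 80), ("samples_per_record", 8), ("reserved", 32),
]


def parse_edf_header(raw: bytes) -> dict:
    """Split a raw EDF header into main fields and per-signal fields."""
    pos = 0
    main = {}
    for name, width in MAIN_FIELDS:
        main[name] = raw[pos:pos + width].decode("ascii").strip()
        pos += width
    signals = [{} for _ in range(int(main["n_signals"]))]
    for name, width in SIGNAL_FIELDS:
        for sig in signals:
            sig[name] = raw[pos:pos + width].decode("ascii").strip()
            pos += width
    duration = float(main["record_duration"])
    for sig in signals:
        # Calibration: physical = gain * digital + offset
        dmin, dmax = int(sig["dig_min"]), int(sig["dig_max"])
        pmin, pmax = float(sig["phys_min"]), float(sig["phys_max"])
        sig["gain"] = (pmax - pmin) / (dmax - dmin)
        sig["offset"] = pmin - sig["gain"] * dmin
        # Sampling frequency follows from samples per data record.
        sig["sampling_frequency"] = int(sig["samples_per_record"]) / duration
    return {"main": main, "signals": signals}


def build_demo_header() -> bytes:
    """Build a synthetic two-signal header; every value here is made up."""
    signals = [
        {"label": "EEG Fpz-Cz", "transducer": "AgAgCl electrode",
         "dimension": "uV", "phys_min": -3276.8, "phys_max": 3276.7,
         "dig_min": -32768, "dig_max": 32767,
         "prefilter": "HP:0.1Hz LP:75Hz", "samples_per_record": 100,
         "reserved": ""},
        {"label": "Temp rectal", "transducer": "thermistor",
         "dimension": "degC", "phys_min": 34.4, "phys_max": 40.2,
         "dig_min": -2048, "dig_max": 2047,
         "prefilter": "", "samples_per_record": 1, "reserved": ""},
    ]
    main = {
        "version": "0", "patient": "X X X X",
        "recording": "Startdate X X X X",
        "startdate": "28.05.04", "starttime": "12.00.00",
        "header_bytes": 256 * (1 + len(signals)), "reserved": "",
        "n_records": 10, "record_duration": 1, "n_signals": len(signals),
    }
    raw = b"".join(str(main[n]).ljust(w).encode("ascii") for n, w in MAIN_FIELDS)
    for name, width in SIGNAL_FIELDS:
        raw += b"".join(str(s[name]).ljust(width).encode("ascii") for s in signals)
    return raw


if __name__ == "__main__":
    header = parse_edf_header(build_demo_header())
    for sig in header["signals"]:
        print(sig["label"], sig["dimension"], sig["gain"], sig["sampling_frequency"])
```

Because every field has a fixed width and position, a reader of this kind needs no format-specific decoder beyond the table of field widths.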
3. Discussion

SignalML decodes signals from various formats into its own standard, the “minimum parameter set”. This set is quite similar to EDF and EDF+, but more flexible. In particular, it can handle signals that are stored with more than 16 bits of accuracy. This advantage is hardly relevant for biological signals because these have a signal-to-noise ratio of less than 14 bits. Visual presentation of floating-point analysis results (in plots or numbers) requires even less accuracy, and can therefore be done by EDF+ using the patch [3] on http://www.hsr.nl/edf/edffloat.htm. Nevertheless, SignalML offers higher accuracy for intermediate storage of floating-point time series.

The greater flexibility applies particularly to the SignalML Annotations file. The XML approach in SignalML can put almost any annotation scheme in this file. This is perfect when producing annotations. But for every new scheme, all software that handles SignalML annotations must be updated in order to process that scheme meaningfully. In other words, flexibility in the generation of annotations is paid for by complexity in their handling and by frequent software updates. The ‘EDF Annotations’ signal also accommodates artefacts, triggers, markings, free text, sleep stages, and so on. But its flexibility is limited in such a way that only a few handling schemes need to be programmed. In fact, several EDF+ annotation texts implement external standards in such a way that annotations are already standardized at production time. New standard texts can be defined, a feature that is currently being exploited to link annotations to specific signals. The flexibility of SignalML can be attractive in biomedical research involving complex and variable patient protocols. The relative rigidity of EDF+ is appreciated in commercial, medical research and health care environments.

EDF+ and SignalML have different approaches to linking annotations to signals.
SignalML annotation files know the signal and the file to which they belong (but not yet vice versa). EDF+ can store both annotations and signals in the same file, more firmly linked to the same clock and the same patient. Any separate EDF+ files for signals and annotations must obey a filename standard so that they can find each other.

In SignalML, data producers (often companies) and data readers communicate through the format decoders, rather than through a standard file. This, indeed, has the advantage that data need not be duplicated into an EDF+ file. However, the details of SignalML are on the eeg.pl pages and may change as more formats and more applications are realized. Therefore, old and new SignalML-based software needs collaboration with the companies and/or the availability (and maintenance) of the eeg.pl webpages in order to get the correct decoder information. This dependency is repeated with each update of the SignalML specification, each SignalML application, each company, and each new version of the company’s software. Whether SignalML will support many more formats than just EDF and EASYS, and whether enough independent applications will be realized, remains a challenge to this collaboration. The format decoders must actually be developed and all SignalML software must implement the decoder plug-in. This extra work may be a problem on the side of the data readers, as illustrated by the fact that SignalML does not yet decode annotations from an EDF+ file. In contrast, the structure of EDF+ is fixed in a paper publication. The company needs to invest only once, in order to produce EDF+ files. And EDF+ software does not require any decoder plug-ins.

Finally, hospitals need to keep their recordings accessible for a long time. They cannot keep the corresponding old software because this usually implies maintaining old hardware and operating systems, and makes it impossible to use new techniques. Therefore, their new software must be able to read old files as well. Based on EDF experience, about 50 companies invent a data format (or a new version) every 5 years. For a memory of 20 years, the SignalML metainformation files must therefore support about 1000 formats. And the SignalML application software must be able to implement the corresponding decoder plug-ins and discriminate the files to which they must be applied. Who will keep track of all this?
In contrast, a 20-year memory of EDF/EDF+ software (and maybe, 10 years from now, EDF++) relies on only its own two (or maybe three) formats, and these are largely compatible with each other.
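The earlier remark that only a few handling schemes need to be programmed for the ‘EDF Annotations’ signal can be made concrete with a short Python sketch. It assumes the annotation layout as I understand it from the EDF+ specification [2]: each annotation list is an onset in seconds, an optional duration introduced by byte 21, then one or more texts each terminated by byte 20, with byte 0 closing the list. None of these byte values appear in this article, the function name `parse_annotation_lists` is hypothetical, and the sample bytes are invented for the demonstration.

```python
from typing import List, Optional, Tuple

# Separator bytes assumed from the EDF+ specification [2].
TEXT_END = b"\x14"       # byte 20: ends the timing part and each annotation text
DURATION_MARK = b"\x15"  # byte 21: precedes the optional duration
LIST_END = b"\x00"       # byte 0: ends one annotation list (also used as padding)

Annotation = Tuple[float, Optional[float], List[str]]


def parse_annotation_lists(raw: bytes) -> List[Annotation]:
    """Decode the bytes of an 'EDF Annotations' signal into
    (onset_seconds, duration_seconds_or_None, texts) tuples."""
    events: List[Annotation] = []
    for chunk in raw.split(LIST_END):
        if not chunk:
            continue  # skip zero padding between or after lists
        parts = chunk.split(TEXT_END)
        onset, _, duration = parts[0].partition(DURATION_MARK)
        texts = [p.decode("utf-8") for p in parts[1:] if p]
        events.append((
            float(onset.decode("ascii")),
            float(duration.decode("ascii")) if duration else None,
            texts,
        ))
    return events


if __name__ == "__main__":
    # Invented sample: a time-keeping entry with no text, an instantaneous
    # event with two texts, and an event with a duration, followed by padding.
    sample = (
        b"+0\x14\x14\x00"
        b"+180\x14Lights off\x14Close door\x14\x00"
        b"+1800.2\x1525.5\x14Apnea\x14\x00"
        + b"\x00" * 8
    )
    for onset, duration, texts in parse_annotation_lists(sample):
        print(onset, duration, texts)
```

Under these assumptions a single routine covers sleep stages, artefacts, triggers and free text alike, since they differ only in the text carried, not in the structure to be decoded.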
4. Conclusion

The SignalML approach is more flexible than EDF+, but depends heavily on an ongoing collaboration between companies, programmers, and the eeg.pl managers. EDF has demonstrated its usefulness and has become widely accepted over the past 12 years. EDF+ has even more potential because it adds important functionality, while largely maintaining EDF compatibility. The practical usefulness of SignalML has yet to be demonstrated. The two solutions are currently incompatible.
References

[1] B. Kemp, A. Värri, A.C. Rosa, K.D. Nielsen, J. Gade, A simple format for exchange of digitized polygraphic recordings, Electroenceph. Clin. Neurophysiol. 82 (1992) 391–393.
[2] B. Kemp, J. Olivan, European data format ‘plus’ (EDF+), an EDF alike standard format for the exchange of physiological data, Clin. Neurophysiol. 114 (2003) 1755–1761.
[3] B. Kemp, T. Penzel, A.O. Värri, P. Sykacek, S.J. Roberts, K.D. Nielsen, EDF: a simple format for graphical analysis results from polygraphic SIESTA recordings, J. Sleep Res. 7 (2) (1998) 132.