Smart alarms from medical devices in the OR and ICU

Smart alarms from medical devices in the OR and ICU

Best Practice & Research Clinical Anaesthesiology 23 (2009) 39–50 Contents lists available at ScienceDirect Best Practice & Research Clinical Anaest...

229KB Sizes 4 Downloads 190 Views

Best Practice & Research Clinical Anaesthesiology 23 (2009) 39–50

Contents lists available at ScienceDirect

Best Practice & Research Clinical Anaesthesiology journal homepage: www.elsevier.com/locate/bean

4

Smart alarms from medical devices in the OR and ICU Michael Imhoff, Priv.-Doz. Dr. med., Associate Professor of Medical Informatics and Statistics a, *, Silvia Kuhls, Dr. rer. nat., Senior Research Scientist b, Ursula Gather, Prof. Dr. rer. nat., Professor of Statistics b, Roland Fried, Prof. Dr. rer. nat., Professor of Statistics b a b

¨t Bochum, 44780 Bochum, Germany ¨ r Medizinische Informatik, Biometrie und Epidemiologie, Ruhr-Universita Abteilung fu ¨t Statistik, Technische Universita ¨t Dortmund, 44221 Dortmund, Germany Fakulta

Keywords: alarm algorithms alarm systems device alarms false alarms statistical time-series analysis

Alarms in medical devices are a matter of concern in critical and perioperative care. The high rate of false alarms is not only a nuisance for patients and caregivers, but can also compromise patient safety and effectiveness of care. The development of alarm systems has lagged behind the technological advances of medical devices over the last 20 years. From a clinical perspective, major improvements in alarm algorithms are urgently needed. This review gives an overview of the current clinical situation and the underlying problems, and discusses different methods from statistics and computational science and their potential for clinical application. Some examples of the application of new alarm algorithms to clinical data are presented. Ó 2008 Elsevier Ltd. All rights reserved.

Introduction Patients receiving critical and perioperative care are surrounded by a multitude of medical devices. Monitoring devices (patient monitors) measure and monitor vital signs; therapeutic devices support or replace impaired or failing organs and administer medications and/or fluids (e.g. ventilators, renal replacement therapy). All these devices have alarm capabilities and generate visual and acoustic alarms to alert healthcare staff to either a change in the patient’s condition or a malfunction of the equipment. From a clinical point of view, an alarm is an automatic warning that results from a measurement, or any other acquisition of descriptors of a state, to indicate a relevant deviation from a normal state. In

* Corresponding author. Am Pastorenwa¨ldchen 2, D-44229 Dortmund, Germany, Tel.: þ49 231 973022 0; Fax: þ49 231 973022 31. E-mail address: [email protected] (M. Imhoff). 1521-6896/$ – see front matter Ó 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.bpa.2008.07.008

40

M. Imhoff et al. / Best Practice & Research Clinical Anaesthesiology 23 (2009) 39–50

this context, alarms are triggered early when any relevant abnormality of vital-organ function or essential-device function occurs. Thus, alarms improve patient safety and quality of care.1 Of course, the final goal of alarms could be to detect changes early and suggest appropriate action. Types of alarm Following these general requirements, there are different goals for device alarms; these follow a certain hierarchy:  Detection of life-threatening situations: Detecting and warning of life-threatening situations (e.g. asystole, apnoea) was the original purpose of monitoring alarms. False-negative alarms are not acceptable in such situations because of the imminent danger of severe patient harm or death.  Detection of life-threatening device malfunction: This capability is essential for all organ-support devices, which must recognize malfunctions, such as disconnection from the patient; occlusion of the connection to the patient; disconnection from power, gas or water supply; and internal malfunction.  Detection of imminent danger: The early detection of gradual change that might be indicative of imminent danger.  Detection of imminent device malfunction: The early detection of device-related problems that could result in malfunction is an integral part of many therapeutic devices. These warning mechanisms range from simple aspects (e.g. low-battery warnings) to complex algorithms and sensors that track the wear of respiratory valves.  Diagnostic alarms: These alarms indicate a pathophysiological condition (e.g. septic shock) rather than warning about out-of-range variables. Alarm-related problems The rate of false alarms The medical literature contains many studies into the problems of alarms in critical care monitoring, and a few studies into alarms from therapeutic devices. These studies show that up to 90% of all alarms in critical-care monitoring are false positives. In many cases, they result from measurement and movement artefacts. The vast majority of all threshold alarms in the intensive care unit (ICU) do not have a real clinical impact on the care of the critically ill.2–5 In one study, 72% of all alarms did not result in any medical action.1 The study showed that the positive predictive value was only 27%, and the specificity only 58%. The negative predictive value and the sensitivity were 99% and 97%, respectively. These results are supported by earlier studies that report a rate of relevant alarms as low as 10%4 or 5%.3 Whereas 27% were induced alarms (i.e. caused by nursing procedures) 68% were truly false alarms.3 The positive predictive value ranged from 5 to 16%, depending on the monitored variable. Our own data show that more than 40% of all alarms result from manipulation.6 Among therapeutic devices, only alarms from ventilators have been investigated closely. One study showed that, in comparison to monitoring alarms, the sensitivity and specificity of alarms from ventilators were not significantly different from ECG monitors in paediatric patients.3 However, this study combined technically false and clinically false alarms. In another study, the rate of interventions following ventilator alarms was not different from that for monitoring alarms. However, in the case of ventilator alarms, the interventions were typically modifications of therapy (e.g. suctioning, readjustment of ventilator settings), whereas interventions triggered by monitoring alarms mostly resulted in sensor readjustments.1 Clinical perception of false alarms The huge incidence of false alarms results in a dangerous desensitization of the intensive care staff toward true alarms.7 Moreover, alarm limits might be set dangerously wide or are even completely disabled to reduce the nuisance from false alarms.8

M. Imhoff et al. / Best Practice & Research Clinical Anaesthesiology 23 (2009) 39–50

41

The clinical situation is further aggravated by the fact that there are also different alarms–including different acoustic or optic annunciations–for the same clinical alarm situation. Clinical studies show that critical-care nurses can identify only the minority of all audible alarms correctly.9,10 Types of false alarm False alarms are caused not only by technical problems and artefacts but also by the clinical inappropriateness of the alarm itself, the alarm limit or the alarm algorithm, although the alarm itself might be technically correct. From a medical perspective, false alarms fall into three categories. First, technically false alarms correspond to situations in which alarms respond to a specific variable, although this variable in reality does not surpass any preset threshold. Examples are an asystole call because of a low voltage lead, motion artefacts, falsely low SpO2 readings in hypothermic patients or a wet flow sensor of a ventilator. Second, clinically false alarms are situations that are not clinically relevant although the variable is actually beyond the preset alarm threshold. Examples are intermittently increased heart rate in patients with absolute arrhythmia, where the increased heart rate might last only a few seconds, or a patient during weaning from mechanical ventilation who takes a deep spontaneous breath exceeding the tidal volume alarm limit. Third, false alarms through interventions are actually both technically and clinically false alarms. They are caused by medical or nursing interventions, such as moving or positioning the patient, drawing blood and flushing the a-line, or disconnecting the patient from the ventilator for endotracheal suctioning. In one study, about one-third of all alarms originated from the ventilator, another third from the cardiovascular monitor, and some 15% from pulse oximetry alone.1 Another study found pulse oximetry to cause over 40% of all false alarms.3 In patients with invasive blood-pressure monitoring, arterial blood pressure alarms are often the leading cause of false alarms.11 New alarm algorithms Clinical experience and scientific studies show that the quality of medical device alarms is unsatisfactory to the degree that the poor quality not only annoys caregivers and hampers their work, but also affects quality of care and patient safety. Although these problems are multifaceted, one root cause is the poor quality of alarm-generating algorithms. Requirements for alarms and alarm algorithms The basic requirements for alarms from medical devices include:    

All life-threatening situations shall be detected and alarmed. Patient-related alarms shall be differentiated from equipment-related alarms. Devices shall warn before life-threatening situations occur. Devices shall give physiological/diagnostic information with reference to an alarm situation.

Sensitivity and negative predictive value for life-threatening situations should be very close to 100%, while positive predictive values and specificity should be as high as possible. The latter is actually the problem for most alarm algorithms used today in commercially available medical devices. Methodological approaches to alarm detection The four areas in which alarms can be improved are interdependent and build on each other: (1) signal acquisition, i.e. the interface between patient and medical devices (e.g. sensor and physiological front-end); (2) alarm generation, i.e. the algorithms that determine an alarm situation for each measured variable; (3) alarm validation, i.e. determining whether the alarm is actually valid (clinically/ technically); (4) integration of multiple alarms, e.g. from different devices, into one/fewer alarms.

42

M. Imhoff et al. / Best Practice & Research Clinical Anaesthesiology 23 (2009) 39–50

Every method that is used to enhance the clinical and technical quality of monitoring alarms must fulfil certain methodological criteria. These include: robustness against artefacts and missing values, real-time and online application, predictable behaviour and methodological rigor. The basic problem is that most alarms are based on simple numeric thresholds. Three general approaches can be distinguished: 1. Improvement of the univariate alarm algorithm, e.g. by improving the detection of patterns of change in a specific variable. In this case, both the clinical problem and the approach to solving this problem are univariate. 2. Incorporation of information derived from variables other than the one under observation. In this case, the clinical problem is still univariate, but the approach to solving the problem is multivariate. 3. Multivariate alarms, in this context, denote alarms resulting from the simultaneous analysis of more than one monitored variable. Multivariate can mean that the results of univariate analyses are combined, for instance by logical combination, or that truly multivariate analyses (e.g. principal component or factor analyses) are performed. Whereas the latter approach still presents both methodological and interpretation challenges (see the overview in ref.12), the logical combination of univariate analyses has been tried in clinical studies.13

Methods of alarm detection Different methods have been proposed and investigated for use in the alarm systems of medical devices, mostly from the fields of statistics and artificial intelligence. This section shall give a brief overview of different methods (see the detailed review in ref.14). Statistical approaches Improved data preprocessing As many superfluous alarms are due to artefacts or short fluctuations in the physiological time series, one idea is to generate alarms using a filtered signal instead of the raw data. Robust signal extraction. The underlying signal can be extracted from a noisy physiological time series by employing a robust regression technique, e.g. the repeated median15, in a moving time window.16 This approach can be further improved by integrating online outlier replacement rules17 and by reducing the time delay of the estimation.18 Fast algorithms make these approaches suitable for online, real-time monitoring applications.19 Clinical example. The extracted signal can be used for generating threshold alarms. Figure 1 shows two online-monitoring time series for systolic arterial blood pressure together with the simple threshold alarms that were given by a conventional monitoring system. A simple threshold alarm occurs when the measurements lie outside the specified alarm limits for four consecutive seconds. Online repeated median signal extraction with a window width of 91 seconds was used for generating signal-based threshold alarms. The first example shows the potential of this approach for avoiding alarms due to measurement artefacts. In the second example, one long signal-based threshold alarm is given instead of several (in this case, eight) short alarms for the same alarm situation. Segmentation. An online-segmentation procedure is proposed, which splits the data into line segments of different lengths by combining linear least squares regression for data approximation, the cumulative sum (CUSUM) technique for determining whether the current approximation is still acceptable, and artefact rejection based on fixed thresholds.20 The line segments are used for generating threshold alarms. Additionally, the segments are classified into shapes (e.g. decreasing, increasing, steady), which are employed for detecting specific clinical situations according to certain rules.21 Median filter. Ma¨kivirta et al.22 monitored intensive care time series by two median filters: one with a short window width in combination with wide alarm limits and a one with a large window width and

M. Imhoff et al. / Best Practice & Research Clinical Anaesthesiology 23 (2009) 39–50

|

systolic arterial pressure simple threshold alarm extracted signal

| |

200

43

systolic arterial pressure simple threshold alarm extracted signal signal-based threshold alarm

180

200

160

180

mmHg

mmHg

220

140 120

160 140 120 100

100 10:10

10:15

10:20

time

10:25

10:30

14:11:00 14:13:15 14:15:30 14:17:45 14:20:00

time

Fig. 1. Two examples with actual clinical monitoring data (patient with invasive blood pressure monitoring) for threshold alarms based on the extracted signal in comparison to conventionally used threshold alarms.

narrower limits. They found they could reduce the frequency of false-positive alarms from 12 to 4.5 per hour without significantly compromising clinical sensitivity. Statistical process control Control charts like the Shewhart chart23 can be employed to detect ‘out-of-control’ states in a process. Kennedy24 used a modified version of Trigg’s Tracking Variable25 to detect the onset of changes in systolic blood pressure. An overview of other early process control procedures for patient monitoring is given in ref.26 Conventional control charts are based on a fixed target value. This is problematic for medical applications because there is not a unique target value for different individuals or health states. Moreover, conventional process control methods do not account for autocorrelations and do not differentiate between different patterns of change, such as outliers, level shifts and trends. Time series analysis for pattern detection Since the late 1980s, time-series analysis has been increasingly applied in ICUs. Time-series techniques have been confirmed to be suitable for retrospective analysis of physiological variables.27 However, the online analysis of monitoring data poses further challenges. Dynamic linear models and Kalman filters. Smith et al.28 proposed multiprocess dynamic linear models for monitoring laboratory values after kidney transplantation. This approach was validated by Trimble et al.29 and modifications for biomedical time series are presented in refs.30 and31 More recently, Yang et al.32 used adaptive Kalman filtering in combination with EWMA (Exponentially Weighted Moving Average) prediction and CUSUM testing to identify change-points in heart-rate time series. So far, none of these approaches has been implemented in commercial products. One of the reasons is their strong sensitivity to misspecification of the model parameters. Autoregressive models and self-adjusting thresholds. Autoregressive (AR) models have also been applied to the analysis of intensive care data. Low-order AR models are sufficient for the modelling of physiological variables in a steady state. For pattern detection (i.e. the identification of deviations from a steady state), the incoming observations can be compared to confidence intervals for the predictions. This approach showed promising performance in detecting clinically relevant patterns in monitoring time series.27,33,34 Phase space embedding. Gather et al.35 transformed the dynamic information of medical time series into geometric information by regarding several consecutive observations as a point in the Euclidean space.

44

M. Imhoff et al. / Best Practice & Research Clinical Anaesthesiology 23 (2009) 39–50

This so-called phase-space (PS) representation was integrated into a procedure that can be regarded as a Shewhart control chart for dependent data. Clinical example. For a comparison between AR, PS and dynamic linear models (DLMs), a dataset consisting of 550,000 single observations of seven variables (heart rate and invasive blood pressure) was acquired from critically ill patients. Time-series segments (150 to 500 observations) for each variable were visually classified by a senior intensivist into the patterns ‘no change’, ‘outlier’, ‘level change’, and ‘trend’. The same segments were analysed with second-order AR models, PS models and DLMs.36,37 This comparison gave the following results: AR models identified all series with outliers, level changes and without a change correctly, i.e. the classification was identical to that of the intensivist. The PS approach always identified series without any change and also with outliers. The identification of level changes failed if the decrease or increase of the observed values was rather slow (Table 1). The possibility of online trend detection is the main advantage of DLMs as this is not possible with either AR or with PS models. From a clinical perspective, these results show that a single method alone cannot adequately meet all requirements for univariate pattern detection. It appears advantageous to combine different methods and to fine-tune each for a specific set of patterns. Trend detection and curve fitting. Schoenberg et al.38 applied a trend detection algorithm, whereby a trend was defined by a score calculated from scoring values. These scoring values were derived from criteria of the time series, which were evaluated at predefined time intervals. A trend was detected when the sum of all scores exceeds a trigger value. In an earlier study, trends were detected by least squares regression applied to median filtered values in a moving window. The reliability of a trend was determined from the estimated error variance.39 Fried and Imhoff40 adapted Brillinger’s41 approach for retrospective detection of a monotonic trend to the online context by analysing the data in a moving time window. Time-varying autocorrelations were estimated online. Haimowitz et al.42 developed the program TrenDx for diagnosing paediatric growth disorders and identifying clinically relevant trends in haemodynamics and blood gases in intensive care. Dimension reduction Graphical models. Reducing the dimension of the data on which decisions are based can have several advantages. One way to achieve a dimension reduction is to select a subset of the most important variables. A statistical approach is to use graphical models43, which enable the assessment of partial linear and possibly time-lagged associations between all pairs of variables while controlling for the effects of all other variables. Principal component and factor analysis. Another approach to dimension reduction is to search for combinations of the observed variables that contain as much information as possible. Principal component analysis aims to find a parsimonious description of the variability in the data. Factor analysis assumes that a few latent variables (factors) drive the series and cause the correlations between the observable variables. A case study using dynamic factor analysis for monitoring time series showed that a statistical dimension reduction is feasible, but that the resulting factors might not be intuitively interpretable by the caregiver.44 Table 1 Comparison of autoregressive (AR) models, phase-space (PS) models and dynamic linear models (DLM). Numbers of correct classifications among the total number of patterns given in brackets. Details in text. Clinical pattern

AR

PS

DLM

No change (23) Outlier (35) Level change (66) Trend (10)

23 35 66 0

23 35 57 0

23 35 43 6

M. Imhoff et al. / Best Practice & Research Clinical Anaesthesiology 23 (2009) 39–50

45

Clinical example. Online monitoring data were acquired from 25 consecutive critically ill patients (9 female, 16 male, mean age 66 years) with extended haemodynamic monitoring requiring pulmonary artery catheterization, amounting to a total of 129,943 sets of observations.45 In good agreement with medical knowledge, the fitted graphical models for all patients revealed strong relationships between the arterial pressures, between the pulmonary artery pressures and between heart rate and pulse. The relation between the systolic and the diastolic pressure was always weaker than between each of these and the corresponding mean pressure. Application of a stepwise search strategy revealed that central venous pressure (CVP) is most strongly related to the pulmonary artery pressure, whereas the temperature does not show a strong relationship to any of the other variables (Fig. 2). Such a partitioning of the variables into strongly related subgroups is useful for variable selection. Alternatively, it allows performing separate factor analysis for the identified subgroups resulting in faster estimations and better interpretability of the results. The latter approach means a compromise between variable selection and principal component analysis as the percentage of explained variability will typically be higher than for a variable selection and still lead to meaningful latent variables. Artificial intelligence Several methods from the field of artificial intelligence (AI) have been applied in intensive care and anaesthesiology (see the overview in refs46 and47). In the following, selected AI approaches to patient monitoring are reviewed. Knowledge-based approaches Initially, rule-based systems were developed using expert knowledge in explicit decision trees. Although this approach is cumbersome, several studies reported encouraging results, e.g. a significant reduction in response time of the clinician toward the cause of the alarm.48 Sukuvaara’s group reported on a knowledge-based system for alarm-based diagnoses.13 Although investigations indicated that this approach performs well with respect to the correct detection of

HR APD

PAPD

APM

PAPM

APS

PAPS

Temp CVP strong Puls

moderate weak

Fig. 2. Partial correlation graph, final selection. Different line types depict different strength of partial correlation. Details in text. APD, diastolic arterial pressure; APM, mean arterial pressure; APS, systolic arterial pressure; CVP, central venous pressure; HR, heart rate; PAPD, diastolic pulmonary artery pressure; PAPM, mean pulmonary artery pressure; PAPS, systolic pulmonary artery pressure; Puls, pulse rate from plethysmography; Temp, blood temperature.

46

M. Imhoff et al. / Best Practice & Research Clinical Anaesthesiology 23 (2009) 39–50

pathological states49, this research project did not result in further publications or a commercial product despite significant industrial funding. Knowledge discovery based on machine learning Machine-learning methods seem to be very promising for the acquisition, revision, refinement and extension of knowledge bases. Examples include knowledge-based systems with revision and learning capabilities or support vector machines.50 Imhoff et al.51 and Morik et al.52 developed a system that combines pattern detection using PS models35, support vector machines53 for learning rules for the appropriate intervention for a given physiological state and a first-order logic representation54 for modelling medical knowledge in terms of rules for the expected effect of a given intervention. Another system for monitoring and therapy planning for mechanically ventilated newborn infants also combined numerical data and a knowledge base.55 Other approaches to learning decision rules from simulated or real-world data include clinical work by Mu¨ller et al.56 based on theoretical work by Talmon57 on multiclass non-parametric partitioning algorithms. Neural networks Several studies examined the use of neural networks for the construction of alarm systems. Ulbricht et al.58 employed neural networks in an alarm system that reported suspicious patterns in cardiotocogram data. Farell et al.59 and Orr and Westenskow60 developed a neural-network-based alarm system to identify several specific faults in an anaesthesia breathing circuit. Tsien61 used neural networks to detect events of interest in the vital signs of paediatric ICU patients. This method has drawbacks, however: the learning behaviour is not reproducible, a long training phase is required and training is impossible for primarily instable patients. Random forests Tsien62 and Zhang63 successfully applied decision trees to physiological data from ICU monitors. However, artefacts and missing values are often encountered in monitoring time series and can dramatically affect the tree structure. Moreover, overfitting might occur on large variable sets. Sieben and Gather64 used random forests65 to overcome these problems and proposed a procedure for the discrimination of true and false alarms that allows for the specification of a target sensitivity. A random forest is an ensemble of decision trees. Every tree is grown on an independently and randomly chosen subset of the learning sample. The classifications of the objects, in our case the alarms, by the individual trees are understood as votes for its population membership. The voting for class membership is not strictly democratic but the number of votes in favour of ‘false alarm’ that leads to classification as such by the whole forest (the so-called critical value) is chosen such that the sensitivity for detecting true alarms is under control. Clinical example. Complete high-resolution monitoring data were acquired from 55 critically ill patients. All alarms were annotated as ‘true’ or ‘false’ by an intensivist. The database, which comprised 2749 annotated alarm situations, was divided randomly into learning, estimation and test sets 1050 times. In total, 1050 forests, each consisting of 1000 trees, were grown on the learning samples. The critical value of each forest was calculated from the estimation set. The performance of each forest was evaluated by applying it, together with its critical value, to the test set. With a target sensitivity of 98%, false alarms were reduced by 30.1% on the average.66 Fuzzy logic The concept of fuzzy logic67 has proven to be useful in situations where it is problematic to describe the process of interest with a precise mathematical model. Therefore, this approach might be suitable for intensive and perioperative care, where experience and intuition play an important role in decision making.68 An example from intensive care is the difficulty of determining a precise alarm threshold for a monitored variable. After determining fuzzy sets for the variables of interest, rules can be defined that classify the different combinations of the sets according to expert knowledge and decide about an alarm. Fuzzy logic has been employed for the development of alarm systems for anaesthesia69,70, for the monitoring of preterm infants71 and for alarm validation.72,73

M. Imhoff et al. / Best Practice & Research Clinical Anaesthesiology 23 (2009) 39–50

47

Bayesian networks Bayesian networks, also called causal probabilistic networks, can be used in ICU monitoring for calculating and updating probabilities for the occurrence of interesting events. Laursen74 designed a network that uses as its input the observations of a set of physiological variables (e.g. mean arterial pressure, central venous pressure). Each time new observations of these variables are obtained, the probabilities for the events ‘cardiovascular patient event’ and ‘measurement error’ are updated using Bayes’ theorem; these form the output of the system. A disadvantage of Bayesian networks is that a large amount of prior information is needed, because the dependencies between the different variables have to be defined and quantified in terms of conditional probabilities. Discussion Alarms and diagnostic features of medical devices have benefited from technological advances in medical device technology over the last 25 years, e.g. in the digital detection of arrhythmia events or in signal processing for pulse oximetry. However, the majority of alarms are still simple threshold alarms, which generate a lot of false positives without real clinical meaning.1,3 Today, the majority of monitoring alarms are still false positive.5,75 It even appears that the development of medical device alarms has been decoupled from the general development of medical technology. One important reason for this is that, in the current legal and regulatory environment, manufacturers have a vital interest in the most sensitive alarm algorithms, such that no critical event goes undetected. This leaves the responsibility with the caregivers, who often disable alarms because of their excessive false-positive rates. However, the root cause of patient harm in such a scenario would be the poor quality of alarm algorithms. Regulatory restrictions might make it difficult to get approval for alarm algorithms that have a reduced sensitivity to non-life-threatening events. It is also to be expected that manufacturers will try to avoid any perceived exposure to liability issues. Another reason for the slow progress may be that the market will not honour better alarm algorithms to the extent that it makes commercial sense for the manufacturers. In any case, improved methods for online alarm detection seem to be needed from the perspective of the clinical user. Statistical methods offer possibilities for signal extraction, artefact filtering, process control, pattern detection and dimension reduction. Signal extraction and artefact filtering can be seen as methods for preparing the physiological time series for further analysis. Statistical time-series analysis is better suited for the recognition of clinically relevant patterns like outliers, level shifts and trends. Many AI methods, especially the widely used neural networks, are not fully deterministic, and therefore are not completely predictable in their behaviour, for instance when they are trained on the currently monitored patient. This can not only cause clinical problems but can be a significant obstacle to regulatory approval. Therefore, in most applications of neural or Bayesian networks, the learning capabilities are only used in the development of the decision functions. These functions are then ‘frozen’ for the final clinical applications. But this process raises further questions about the validity of the historic learning approach:    

Which patient populations should be used for learning? What are the correct definitions of, for example, a truly positive alarm situation? How large a dataset is needed? Does the training sample represent the wide range of patients for whom the final device/system will be used?

These issues actually arise for most methods that require some period of ‘learning’. Summary The alarms of medical devices, especially the high rate of false alarms, are a matter of concern in critical and perioperative care. Clinical experience and scientific studies show that the quality of medical device alarms is unsatisfactory, to the degree that the poor quality not only annoys caregivers

48

M. Imhoff et al. / Best Practice & Research Clinical Anaesthesiology 23 (2009) 39–50

and hampers their work, but also affects quality of care and patient safety. Although these problems are multifaceted, one root cause is the poor quality of alarm-generating algorithms. New alarm methods need to fulfil a number of methodological criteria, such as robustness, real-time and online capabilities, methodological rigor and applicability to large patient populations. Univariate and multivariate methods have been proposed and investigated, mostly from the fields of statistics and artificial intelligence. Some have even shown encouraging results in clinical studies. An array of methods from different fields appears to bear promise for a new comprehensive alarm detection concept. But there are still methodological, computational, clinical and regulatory challenges with some of these methods in a general patient population. Further research into new alarm algorithms is therefore needed to improve alarm handling in medical devices. This requires not only methodological rigor and a good understanding of the clinical problems to be solved, but also adequate matching of statistical and computer science methods to clinical applications. It is important to see the broad picture of requirements for monitoring and therapy devices in general patient populations. Many researchers have chosen one or two method(s) without prior analysis of alternative methods and applied them to one highly specific clinical problem. Moreover, without further improvements in signal acquisition and the robustness of signal extraction, many promising alarm-detection and alarm-validation methods cannot live up to their true potential. It is important to keep the sequence of signal acquisition/extraction, alarm detection and alarm validation in mind when designing new studies of medical device alarms. As well as methodological research, extensive validation and verification against ‘real-world’ data is needed to assure the safety and efficacy of new algorithms. This requires the acquisition of clinically annotated monitoring and device data in adequately large clinical studies. It also requires close and trustful cooperation between researchers and industry.

Practice points  Medical devices in the OR and on the ICU, especially physiological monitors and life-support equipment, have a high rate of false alarms. This is not only a nuisance for patients and caregivers, but can also compromise patient safety and effectiveness of care.  From a clinical perspective, major improvements of alarm algorithms are urgently needed.  An array of methods from different fields appears to bear promise for a new comprehensive alarm detection concept. But significant research and development efforts are still needed.  New alarm algorithms are only one approach in the quest of clinically more useful and less burdensome alarm systems for medical devices in the OR and ICU.

Research agenda  Although some new alarm algorithms are ready for integration into clinical alarms systems, further methodological research is needed, especially in the area of multivariate algorithms.  Methodological, computational, clinical and regulatory challenges will have to be addressed in all major patient populations.  New alarm algorithms have to be tested against clinical data from sufficiently diverse patient populations.  Beside improved alarm algorithms also issues of alarm annunciation, user interfaces and device integration have to be addressed in order to significantly improve alarm systems in medical devices in OR and ICU.

Acknowledgements Supported, in part, by the German Research Foundation (DFG, SFB 475).

M. Imhoff et al. / Best Practice & Research Clinical Anaesthesiology 23 (2009) 39–50

49

References *1. Chambrin MC, Ravaux P, Calvelo-Aros D et al. Multicentric study of monitoring alarms in the adult intensive care unit (ICU): a descriptive analysis. Intensive Care Medicine 1999; 25: 1360–1366. 2. O’Carroll TM. Survey of alarms in an intensive therapy unit. Anaesthesia 1986; 41: 742–744. 3. Lawless ST. Crying wolf: false alarms in a pediatric intensive care unit. Critical Care Medicine 1994; 22: 981–985. 4. Koski EM, Ma¨kivirta A, Sukuvaara T & Kari A. Frequency and reliability of alarms in the monitoring of cardiac postoperative patients. International Journal of Clinical Monitoring and Computing 1990; 7: 129–133. *5. Tsien CL & Fackler JC. Poor prognosis for existing monitors in the intensive care unit. Critical Care Medicine 1997; 25: 614–619. 6. Imhoff M, Kuhls S, Gather U et al. Clinical relevance of alarms from patient monitors. Critical Care Medicine 2007; 34(suppl): A62. 7. Meredith C & Edworthy J. Are there too many alarms in the intensive care unit? An overview of the problems. Journal of Advanced Nursing 1995; 21: 15–20. *8. Koski EM, Ma¨kivirta A, Sukuvaara T & Kari A. Clinicians’ opinions on alarm limits and urgency of therapeutic responses. International Journal of Clinical Monitoring and Computing 1995; 12: 85–88. 9. Cropp AJ, Woods LA, Raney D & Bredle DL. Name that tone: the proliferation of alarms in the intensive care unit. Chest 1994; 105: 1217–1220. 10. Momtahan K, Retu R & Tansley B. Audibility and identification of auditory alarms in the operating room and intensive care unit. Ergonomics 1993; 36: 1159–1176. 11. Siebig S, Scho¨lmerich J, Wrede C et al. Clinical relevance of alarms from patient monitors. Critical Care Medicine 2007; 35(suppl): A178. 12. Fried R, Gather U, Imhoff M & Bauer M. Statistical methods in intensive care online monitoring. Technical Report 33/00, SFB 475. University of Dortmund, 2000. Available from: http://www.sfb475.uni-dortmund.de/berichte/tr33-00.ps. 13. Sukuvaara T, Koski EM, Ma¨kivirta A & Kari A. A knowledge-based alarm system for monitoring cardiac operated patientstechnical construction and evaluation. International Journal of Clinical Monitoring and Computing 1993; 10: 117–126. *14. Imhoff M & Kuhls S. Alarm algorithms in critical care monitoring. Anesthesia & Analgesia 2006; 102: 1525–1537. 15. Siegel AF. Robust regression using repeated medians. Biometrika 1982; 68: 242–244. 16. Davies PL, Fried R & Gather U. Robust signal extraction for on-line monitoring data. Journal of Statistical Planning and Inference 2003; 122: 65–78. 17. Fried R. Robust filtering of time series with trends. Journal of Nonparametric Statistics 2004; 16: 313–328 [Special Issue]. *18. Gather U, Schettlinger K & Fried R. Online signal extraction by robust linear regression. Computational Statistics 2006; 21: 33–51. 19. Bernholt T & Fried R. Computing the update of the repeated median regression line in linear time. Information Processing Letters 2003; 88: 111–117. 20. Charbonnier S, Becq G & Biot L. On-line segmentation algorithm for continuously monitored data in intensive care units. IEEE Transactions on Bio-medical Engineering 2004; 51: 484–492. *21. Charbonnier S & Gentil S. A trend-based alarm system to improve patient monitoring in intensive care units. Control Engineering Practice 2007; 15: 1039–1050. 22. Ma¨kivirta A, Koski E, Kari A & Sukuvaara T. The median filter as a preprocessor for a patient monitor limit alarm system in intensive care. Computer Methods and Programs in Biomedicine 1991; 34: 139–144. 23. Shewhart WA. Economic control of quality manufactured product. Princeton, NJ: D. Van Nostrand Reinhold, 1931. 24. Kennedy RR. A modified Trigg’s tracking variable as an ‘advisory’ alarm during anaesthesia. International Journal of Clinical Monitoring and Computing 1995; 12: 197–204. 25. Trigg DW. Monitoring a forecasting system. Operational Research Quarterly 1964; 15: 271–274. 26. Hill DW & Endresen J. Trend recording and forecasting in intensive care therapy. British Journal of Clinical Equipment 1978: 5–14. *27. Imhoff M, Bauer M, Gather U & Lo¨hlein D. Time series analysis in intensive care medicine. Applied Cardiopulmonary Pathophysiology 1997; 6: 263–281. 28. Smith AFM, West M, Gordon K et al. Monitoring kidney transplant patients. Statistician 1983; 32: 46–54. 29. Trimble IM, West M, Knapp MS et al. Detection of renal allograft rejection by computer. BMJ 1983; 286: 1695–1699. 30. Daumer M & Falk M. Online change point detection (for state space models) using multi-process Kalman filters. Linear Algebra and its Applications 1998; 284: 125–135. 31. Gordon K & Smith AFM. Modeling and monitoring biomedical time series. Journal of the American Statistical Association 1990; 85: 328–337. 32. Yang P, Dumont G & Ansermino JM. Adaptive change detection in heart rate trend monitoring in anesthetized children. IEEE Transactions on Bio-medical Engineering 2006; 53: 2211–2219. 33. Imhoff M, Bauer M, Gather U & Lo¨hlein D. Statistical pattern detection in univariate time series of intensive care on-line monitoring data. Intensive Care Medicine 1998; 24: 1305–1314. 34. Imhoff M, Bauer M, Gather U & Fried R. Pattern detection in intensive care monitoring time series with autoregressive models: influence of the model order. Biometrical Journal 2002; 44: 746–761. 35. Gather U, Bauer M & Fried R. The identification of multiple outliers in online monitoring data. Estadistica 2002; 54: 289–338. 36. Imhoff M & Gather U. Time series analysis for the alarm detection in intensive care monitoring systems. Biomedizinische Technik. Biomedical Engineering 2004; 49(Suppl. 1): 336–337. *37. Gather U, Fried R & Imhoff M. Online classification of states in intensive care. In Gaul W, Opitz O & Schader M (eds.). Data analysis. Berlin Heidelberg: Springer, 2000, pp. 413–428. 38. Schoenberg R, Sands DZ & Safran C. Making ICU alarms meaningful: a comparison of traditional vs. trend-based algorithms. Proceedings/.AMIA Annual Symposium AMIA Symposium 1999: 379–383.

50

M. Imhoff et al. / Best Practice & Research Clinical Anaesthesiology 23 (2009) 39–50

*39. Koski EM, Ma¨kivirta A, Sukuvaara T & Kari A. Development of an expert system for haemodynamic monitoring: computerized symbolization of on-line monitoring data. International Journal of Clinical Monitoring and Computing 1991–92; 8: 289–293. 40. Fried R & Imhoff M. On the online detection of monotonic trends in time series. Biometrical Journal 2004; 46: 90–102. 41. Brillinger DR. Consistent detection of a monotonic trend superposed by a stationary time series. Biometrika 1989; 76: 23–30. 42. Haimowitz I, Le PP & Kohane IS. Clinical monitoring using regression-based trend templates. Artificial Intelligence in Medicine 1995; 7: 473–496. 43. Dahlhaus R. Graphical interaction models for multivariate time series. Metrika 2000; 51: 157–172. 44. Gather U, Fried R, Lanius V & Imhoff M. Online monitoring of high dimensional physiological time series-a case study. Estadistica 2001; 53: 259–298. 45. Imhoff M, Fried R, Gather U & Lanius V. Dimension reduction for physiological variables using graphical modeling. Proceedings of the American Medical Informatics Association Symposium 2003: 313–317. 46. Hanson CW & Marshall BE. Artificial intelligence applications in the intensive care unit. Critical Care Medicine 2001; 29: 427–435. 47. Krol M & Reich DL. Expert systems in anaesthesiology. Drugs of Today 1998; 34: 593–601. 48. Westenskow DR, Orr JA, Simon FH et al. Intelligent alarms reduce anesthesiologist’s response time to critical faults. Anesthesiology 1992; 77: 1074–1079. 49. Koski EM, Sukuvaara T, Ma¨kivirta A & Kari A. A knowledge-based alarm system for monitoring cardiac operated patientsassessment of clinical performance. International Journal of Clinical Monitoring and Computing 1994; 11: 79–83. 50. Morris AH. Computerized protocols and beside decision support. Critical Care Clinics 1999; 15: 523–545. 51. Imhoff M, Gather U & Morik K. Development of decision support algorithms for intensive care medicine: a new approach combining time series analysis and a knowledge base system with learning and revision capabilities. In Burgard, Christaller & Cremers (eds.). KI-99, advances in artificial intelligence, 23rd annual German conference on artificial intelligence. Springer, 1999, pp. 219–230. *52. Morik K, Imhoff M, Brockhausen P et al. Knowledge discovery and knowledge validation in intensive care. Artificial Intelligence in Medicine 2000; 19: 225–249. 53. Vapnik V. Statistical learning theory. New York: Wiley, 1998. 54. Morik K, Wrobel S, Kietz LU & Emde W. Knowledge acquisition and machine learning-theory, methods and applications. London: Academic Press, 1993. 55. Miksch S, Horn W, Popow C & Paky F. Utilizing temporal abstraction for data validation and therapy planning for artificially ventilated newborn infants. Artificial Intelligence in Medicine 1996; 8: 543–576. 56. Mu¨ller B, Hasman A & Blom JA. Evaluation of automatically learned intelligent alarm systems. Computer Methods and Programs in Biomedicine 1997; 54: 209–226. 57. Talmon JL. A multiclass nonparametric partitioning algorithm. Pattern Recognition Letters 1986; 4: 31–38. 58. Ulbricht C, Dorffner G & Lee A. Neural networks for recognizing patterns in cardiotocograms. Artificial Intelligence in Medicine 1998; 12: 271–284. 59. Farrel RM, Orr JA, Kuck K & Westenskow DR. Differential features for a neural network based anesthesia alarm system. Biomedical Sciences Instrumentation 1992; 28: 99–104. 60. Orr JA & Westenskow DR. A breathing circuit alarm system based on neural networks. Journal of Clinical Monitoring 1994; 10: 101–109. 61. Tsien CL. Event discovery in medical time-series data. Proceedings of the American Medical Informatics Association Annual Fall Symposium. Journal of the American Medical Informatics Association 2000: 858–862. 62. Tsien CL. TrendFinder: Automated detection of alarmable trends. MIT Ph.D. dissertation, Massachusetts Institute of Technology, 2000. 63. Zhang Y. Real-time analysis of physiological data and development of alarm algorithms for patient monitoring in the intensive care unit, MIT EECS Master of Engineering Thesis. Massachusetts Institute of Technology 2003. *64. Sieben W & Gather U. Classifying alarms in intensive care–analogy to hypothesis testing. In Bellazzi R, Abu-Hanna A & Hunter J (eds.). Proceedings of the 11th Conference on Artificial Intelligence in Medicine (AIME 07), Lecture Notes in Computer Science (LNCS). Berlin/Heidelberg 2007; vol. 4594/2007. Springer, 2007, pp. 130–138. 65. Breiman L. Random Forests. Machine Learning 2001; 45: 5–32. 66. Sieben W, Siebig S, Wrede C et al. A Random Forrest approach to reduce false alarms in patient monitoring. In: Proceedings ¨ r Biomedizinische Technik (DGBMT) im VDE, DGBMT im der BMT 2007 ¼ 41. Jahrestagung der Deutschen Gesellschaft fu VDE.-52. Berlin: de Gruyter.-Erga¨nzungsband (CD), 2007. 1569047199.pdf. 67. Zadeh LA. Fuzzy sets. Information and Control 1965; 8: 338–353. 68. Bates JHT & Young MP. Applying fuzzy logic to medical decision making in the intensive care unit. American Journal of Respiratory and Critical Care Medicine 2003; 167: 948–952. 69. Becker K, Thull B, Kasmacher-Leidinger H et al. Design and validation of an intelligent patient monitoring and alarm system based on a fuzzy logic process model. Artificial Intelligence in Medicine 1997; 11: 33–53. 70. Lowe A, Jones RW & Harrison MJ. A graphical representation of decision support information in an intelligent anaesthesia monitor. Artificial Intelligence in Medicine 2001; 22: 173–191. 71. Wolf M, Keel M, von Siebenthal K et al. Improved monitoring of preterm infants by Fuzzy Logic. Technology and Health Care 1996; 4: 193–201. 72. Oberli C, Urzua J, Saez C et al. An expert system for monitor alarm integration. Journal of Clinical Monitoring and Computing 1999; 15: 29–35. 73. Zong W, Moody GB & Mark RG. Reduction of false arterial blood pressure alarms using signal quality assessment and relationships between the electrocardiogram and arterial blood pressure. Medical & Biological Engineering & Computing 2004; 42: 698–706. 74. Laursen P. Event detection on patient monitoring data using causal probabilistic networks. Methods of Information in Medicine 1994; 33: 111–115. 75. McIntosh N. Intensive care monitoring: past, present and future. Clinical Medicine 2002; 2: 349–355.