Cardiac arrhythmia classification: A heart-beat interval-Markov chain approach


COMPUTERS AND BIOMEDICAL RESEARCH 4, 385-392 (1970)

WILL GERSCH,† DAVID M. EDDY,‡ AND EUGENE DONG, JR.§

Division of Cardiovascular Surgery, Department of Surgery, Stanford University Medical Center, Stanford, California 94305

Received March 2, 1970

A sequence of heart-beat intervals (R-R wave intervals) is automatically transformed into a three-symbol Markov chain sequence. For convenience the symbols used may be thought of as S-R-L for short, regular, and long heart-beat intervals, respectively. The probability that the observed sequence was generated by each of a set of prototype models characteristic of different cardiac disorders is computed. That prototype corresponding to the largest probability of observed sequence generation is designated as the disorder. This procedure is the equivalent of Kullback's classification by the minimization of directed divergence procedure. In a preliminary experiment primarily using data sequences of 100 heart-beat intervals, 35 different known cases were automatically classified into six cardiac disorders without error. The disorders considered were atrial fibrillation, APC and VPC, bigeminy, sinus tachycardia with occasional bigeminy, sinus tachycardia, and ventricular tachycardia.

An automatic procedure to classify cardiac arrhythmias using a Markov chain interpretation of heart-beat interval data is reported. A sequence of heart-beat intervals (R-R wave intervals) is automatically transformed into a three-symbol Markov chain sequence.¹ For convenience the symbols used may be thought of as S-R-L for short, regular, and long heart-beat intervals, respectively. A measure of the probability that the observed sequence was generated by each of a set of prototypic models characteristic of different cardiac disorders is computed. That prototype corresponding to the largest probability of observed sequence generation is designated as the disorder. This procedure is demonstrated to be the equivalent of classification by the minimization of directed divergences.² In a preliminary experiment primarily using data sequences of 100 intervals, 35 different known cases were automatically classified into six cardiac disorders without error. The disorders considered were atrial fibrillation, atrial premature contractions (APC) and ventricular premature contractions (VPC), bigeminy, sinus tachycardia with occasional bigeminy, sinus tachycardia, and ventricular tachycardia.

Examples of heart-beat interval sequences versus interval number for each of the six different disorders considered are illustrated in Fig. 1. The distinctive patterns of heart-beat interval sequences suggest the possible utility of these data for use in classification of cardiac arrhythmias. In Fig. 2a-f, histograms of the number of interval occurrences in each of a set of 50 × 10^-3 sec heart-beat interval bins are plotted against bin number for each of the cases illustrated in Fig. 1. This evidence suggests that it might be possible to distinguish between cardiac disorders using interval data by comparing the empirical distribution function of an observed heart interval sequence with prototypic distribution functions. This approach is well known in conventional pattern recognition procedures. In our problem, however, such an approach potentially suffers from the fact that it takes no account of the highly patterned structure of the heart interval sequence time series, as illustrated in Fig. 1, in accomplishing the pattern discrimination.

Our own approach was to transform the heart-beat sequence data into a sequence of symbols (of a three-character alphabet S-R-L) in such a way as to preserve some information about the time sequence pattern of the heart-beat intervals, and to do the pattern classification on the transformed data.

* The work was supported in part by grant 5 R01 HE 11022-03 SGYA, "Arrhythmia Recognition After Cardiac Surgery," National Heart Institute, National Institutes of Health. Computations were performed at the ACME facility of the Stanford University Medical Center.
† Will Gersch is on leave from Purdue University, Center of Applied Stochastics, School of Aero, Astro, and Engineering Science, Lafayette, Indiana.
‡ David M. Eddy, M.D. is a Postdoctoral Research Fellow, Bay Area Heart Research Committee.
§ Eugene Dong, Jr., M.D. is an Established Investigator of the American Heart Association.
¹ Accomplished by a computing algorithm that operates on the derivative of the EKG data to select the onset of successive QRS complexes.
Prototypic models of each disorder were chosen so that the "histogram" of the collection of prototypic models was reasonably compatible with the histogram of the individual disorders.³ This quality is illustrated in Fig. 2a'-f'. Each observed symbol is identified with a corresponding state, i.e., the occurrence of symbol S is interpreted as an indication that the process has just been in state S. Formally this identifies the heart-beat symbol sequence as a Markov information source.⁴

² S. KULLBACK, "Information Theory and Statistics," Dover, New York, 1968, is a study of logarithmic measures of information and their application to the testing of statistical hypotheses. Directed divergence is a measure of the distance between a sample sequence and a population or population mechanism.
³ The data were scanned to determine the median interval M. Intervals shorter than M - 10 × 10^-2 sec were classified as S intervals, and intervals longer than M + 15 × 10^-2 sec were classified as L intervals. The remainder were classified as regular, or R, intervals.
⁴ R. ASH, "Information Theory," Interscience Publ. Inc., New York, 1965. A Markov information source is an object that can be characterized by a finite set of states, a set (matrix) of transition probabilities that describe the probability of a transition between states, and a mapping which associates an "output" symbol with each state.
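The S-R-L transformation described in footnote 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `symbolize`, its parameter names, and the sample intervals are hypothetical, and the default margins (0.10 sec and 0.15 sec) follow one reading of the footnote, so they are exposed as parameters rather than fixed.

```python
from statistics import median


def symbolize(intervals_sec, short_margin=0.10, long_margin=0.15):
    """Map R-R intervals (in seconds) to the symbols S, R, L around the median.

    The margins are assumptions taken from a reading of footnote 3;
    pass other values (e.g. 0.02 and 0.03) to tighten the regular band.
    """
    m = median(intervals_sec)
    symbols = []
    for x in intervals_sec:
        if x < m - short_margin:
            symbols.append("S")   # short interval
        elif x > m + long_margin:
            symbols.append("L")   # long interval
        else:
            symbols.append("R")   # regular interval
    return "".join(symbols)


# Hypothetical intervals: a premature beat followed by a compensatory pause.
example = [0.80, 0.82, 0.50, 1.10, 0.81, 0.79, 0.80]
print(symbolize(example))  # prints RRSLRRR
```

Because the thresholds are tied to the median of the scanned record, the same code applies unchanged to slow and fast underlying rhythms; only the margins need to be chosen.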


FIG. 1. Sequences of heart-beat intervals, 6 arrhythmias.

In the following paragraphs the problem is formally restated, and the method of classification is explained and illustrated.

THEORETICAL DESCRIPTION

FIG. 2. Histograms and state probabilities. (Horizontal axes: interval, in 50 msec bins; states S, R, L.)

A state sequence $X_1, X_2, \ldots, X_n$ of a stationary Markov chain is observed. It is desired to classify the data into a set of mutually exclusive and exhaustive hypotheses $H_1, \ldots, H_p$ that describe the observed sequence. Corresponding to each hypothesis $H_i$, $i = 1, \ldots, p$, there is a prototypic stationary transition matrix $P^i$.⁵ (The transition matrix is an ordered array of numbers which are the probabilities of the transition from state $j$ to state $k$ for $j, k = 1, \ldots, s$, where $s$ is the number of states.)

⁵ The prototypic model $P^i$ for each disorder was computed by averaging the empirical transition probability matrices computed for each available sample of the known disorder.

The problem is: Given the observed sequence $X_1, X_2, \ldots, X_n$ of symbols (or states) corresponding to a Markov chain information source, choose the hypothesis $H_i$ which most closely (in some sense) accounts for the observed sequence. Our solution is to compute the probability that the observed sequence $X_1, X_2, \ldots, X_n$ is produced by the mechanism $P^i$ for $i = 1, \ldots, p$ and choose that classification or hypothesis for which this probability is a maximum. Let $V_n^r$ equal $-(1/n)$ times the logarithm of the probability that the observed sequence $X_1, X_2, \ldots, X_n$ was produced by mechanism $P^r$. (Selecting the minimum of the negative of the logarithm of a set of probabilities is equivalent to selecting the maximum of the set.) By a straightforward analysis

$$V_n^r = -\frac{1}{n}\left[\log P^r(X_1) + \sum_{j=1}^{s}\sum_{k=1}^{s} N_{jk}\,\log P_{jk}^r\right], \tag{1}$$

where $N_{jk}$ is the number of times that the transition from state $j$ to state $k$ occurred in the observed data sequence, $P_{jk}^r$ is the probability of that state transition under hypothesis $H_r$, and $P^r(X_1)$ is the probability of the occurrence of the first state observed in the data sequence, $X_1$, under $H_r$. $N_{jk}$ may be written as $N_{jk} = \pi_j^o P_{jk}^o (n - 1)$, where $\pi_j^o$ is the probability (relative frequency) of state $j$ in the observed sequence, $P_{jk}^o$ is the probability of the transition from state $j$ to state $k$ in the observed sequence, and $(n - 1)$ is the total number of observed state transitions. This notation has two interpretations. First, the observed heart-beat interval sequence $(X_1, X_2, \ldots, X_n)$ may be interpreted as having been generated by a mechanism with transition probability $P^o$. Substituting this value of $N_{jk}$ into the expression for $V_n^r$ and taking the limit as $n$ becomes increasingly large,

$$\lim_{n\to\infty} V_n^r = V^r = -\sum_{j=1}^{s}\pi_j^o\sum_{k=1}^{s} P_{jk}^o\,\log P_{jk}^r. \tag{2}$$
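The quantity $V_n^r$ depends on the data only through the transition counts $N_{jk}$ and the first symbol, which the following Python sketch makes concrete. It is an illustration, not the authors' program: the names `transition_counts` and `v_n`, the two transition matrices, the uniform initial-state probabilities, and the symbol string are all hypothetical, and every matrix entry is assumed to be strictly positive so that the logarithm is defined.

```python
import math
from collections import Counter


def transition_counts(symbols):
    """Return N_jk: how many times each j -> k transition occurs."""
    return Counter(zip(symbols, symbols[1:]))


def v_n(symbols, P, initial):
    """-(1/n) times the log-probability of `symbols` under matrix P.

    P[j][k] is the probability of the transition j -> k and initial[j]
    is the probability of starting in state j (both assumed positive).
    """
    n = len(symbols)
    log_prob = math.log(initial[symbols[0]])
    for (j, k), n_jk in transition_counts(symbols).items():
        log_prob += n_jk * math.log(P[j][k])
    return -log_prob / n


# Hypothetical mechanisms: one mostly regular, one mostly alternating S and L.
P_REGULAR = {
    "S": {"S": 0.2, "R": 0.6, "L": 0.2},
    "R": {"S": 0.1, "R": 0.8, "L": 0.1},
    "L": {"S": 0.2, "R": 0.6, "L": 0.2},
}
P_ALTERNATING = {
    "S": {"S": 0.1, "R": 0.1, "L": 0.8},
    "R": {"S": 0.3, "R": 0.4, "L": 0.3},
    "L": {"S": 0.8, "R": 0.1, "L": 0.1},
}
UNIFORM = {"S": 1 / 3, "R": 1 / 3, "L": 1 / 3}

sequence = "RRRRSRRRLR"
for name, P in [("regular", P_REGULAR), ("alternating", P_ALTERNATING)]:
    print(name, round(v_n(sequence, P, UNIFORM), 3))
```

The mechanism with the smaller value of `v_n` is the one under which the observed sequence is more probable, so choosing the minimum of $V_n^r$ over $r$ is the maximum-probability choice described in the text.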

In Eq. (2) with $r = o$ we have the known result

$$V^o = -\sum_{j=1}^{s}\pi_j^o\sum_{k=1}^{s} P_{jk}^o\,\log P_{jk}^o = H(X); \tag{3}$$

that is, the negative of the average logarithm of the probability of a Markov chain is $H(X)$, the incremental entropy (uncertainty or amount of information per observation) of the Markov process.⁴ The well-known information theory result, $-\sum_i p_i \log p_i \le -\sum_i p_i \log q_i$, applied to Eq. (2) yields $H(X) = V^o \le V^r$, $r = 1, 2, \ldots, p$. Therefore, the minimum achievable value of $V^r$ is $H(X)$.

Secondly, $N_{jk} = \pi_j^o P_{jk}^o (n - 1)$ can also be interpreted as the average (expectation) of $N_{jk}$ under hypothesis $H_o$, the hypothesis that the sequence was generated with transition probability mechanism $P^o$, the matrix of transition probabilities in the observed sequence. Under that interpretation, the quantity $I(o:r) = V^r - V^o$ is formally the directed divergence between the "mechanism" $P^o$ that generated the observed sequence and $P^r$. In reference 2 Kullback discusses classification by selection of hypothesis $H_r$, where $r$ minimizes $I(o:j)$, $j = 1, 2, \ldots, p$. $I(o:j)$ can be interpreted as the average information (per observation) for discrimination against hypothesis $j$. This classification procedure assigns the observed sequence to that generating mechanism which it most closely resembles.

APPLICATION

The potential capabilities of the classification procedure are suggested in Tables 1 and 2. Table 1 lists $K(j, k)$, a "modified" directed divergence, $K(j, k) = I(j:k) + H_j(X)$ for $j, k = 1, \ldots, 6$. This quantity is the sum of the directed divergence between hypotheses $j$ and $k$ and the entropy of the $j$-th prototype, computed from the prototypic models $P^i$, $i = 1, \ldots, 6$. Since $I(j:j) = 0$, the minimum value of the table entries in each row is $H_j(X)$, and this appears on the table matrix diagonal. For each row $j$, the entries are a measure of the amount of information per observation (the natural logarithm is used, so the information is measured in "nits") against each hypothesis $H_k$, $k = 1, \ldots, 6$, being true. Note that $I(j:k) = V^k - V^j$ and that, because $H_j(X) = V^j$, the modified directed divergence $K(j, k)$ is $V^k$, a measure of the negative of the logarithm of the probability that the $j$-th prototype was generated by the $k$-th model.

Pairwise comparison of the scatter diagrams and histograms of Figs. 1 and 2 reveals that the patterns which most closely resemble each other are also relatively close in Table 1. Note that the relatively low values of K(5:5) and K(6:6), indicating low uncertainty or little information per observation, are compatible with the concentrated histograms of Fig. 2e and f. The relative closeness of K(5:6) to K(5:5) and of K(6:5) to K(6:6) suggests that hypotheses $H_5$ and $H_6$ may be more readily confused than any other hypotheses. Since $H_6$, corresponding to ventricular tachycardia, is an indication of an imminent fatality, we have adopted a two-stage classification procedure to enhance the distinction between $H_5$ and $H_6$. If an observed sequence is first classified in either $H_5$ or $H_6$, we recompute on the observed sequence under new prototypic models which sharpen the difference between these disorders.⁶ Table 2 indicates the values of the directed divergence for this second stage of discrimination.
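A table of modified directed divergences between prototypes can be sketched as follows. The text does not say how the state probabilities of a prototype were obtained, so this sketch assumes they are the stationary distribution of the prototype's transition matrix, approximated here by power iteration. The function names and the two 3 × 3 matrices are hypothetical, rows and columns are ordered S, R, L, and all entries are assumed positive.

```python
import math


def stationary(P, iters=1000):
    """Approximate the stationary distribution of a row-stochastic matrix."""
    s = len(P)
    pi = [1.0 / s] * s
    for _ in range(iters):
        pi = [sum(pi[j] * P[j][k] for j in range(s)) for k in range(s)]
    return pi


def modified_divergence(P_j, P_k):
    """K(j, k) = -sum_a pi_a sum_b P_j[a][b] log P_k[a][b], in nits."""
    pi = stationary(P_j)   # assumption: state probabilities of prototype j
    s = len(P_j)
    return -sum(
        pi[a] * P_j[a][b] * math.log(P_k[a][b])
        for a in range(s)
        for b in range(s)
        if P_j[a][b] > 0
    )


def divergence_table(prototypes):
    """Rows index the generating prototype j, columns the tested model k."""
    return [[modified_divergence(Pj, Pk) for Pk in prototypes] for Pj in prototypes]


# Hypothetical prototypes (not the values used in the paper).
P_A = [[0.2, 0.6, 0.2], [0.1, 0.8, 0.1], [0.2, 0.6, 0.2]]
P_B = [[0.1, 0.1, 0.8], [0.3, 0.4, 0.3], [0.8, 0.1, 0.1]]

for row in divergence_table([P_A, P_B]):
    print([round(value, 3) for value in row])
```

In each printed row the diagonal entry is the entropy of that prototype and is the smallest entry of the row, which is the property of Table 1 noted in the text.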
⁶ The reclassification into the states S, R, and L mirrored the scheme used initially. Again with the median interval designated by the letter M, intervals shorter than M - 2 × 10^-2 sec were classified as S intervals, intervals longer than M + 3 × 10^-2 sec were classified as L intervals, and the remainder were classified as R intervals.

TABLE 1
MODIFIED DIRECTED DIVERGENCES COMPUTED ON PROTOTYPIC MODELS

          K( :1)   K( :2)   K( :3)   K( :4)   K( :5)   K( :6)
K(1: )    1.052    1.717    2.378    4.498    5.532    4.141
K(2: )    1.095    0.707    1.445    2.927    5.371    4.384
K(3: )    1.255    0.753    0.389    0.483    6.792    6.762
K(4: )    1.313    0.715    0.215    0.183    6.907    6.903
K(5: )    0.201    0.150    0.557    1.878    0.004    0.006
K(6: )    0.494    0.380    1.386    4.600    0.078    0.036

Finally, to illustrate the specific workings of the classification procedure, the formula

$$K(*, r) = -\sum_{j=1}^{s}\pi_j^{*}\sum_{k=1}^{s} P_{jk}^{*}\,\log P_{jk}^{r}, \qquad * = \mathrm{i}, \mathrm{ii}, \ldots, \mathrm{vi}, \quad r = 1, \ldots, 6, \tag{4}$$

is used to compute the data of Table 3. The value of $*$ denotes the particular sequence to be classified; $* = \mathrm{i}, \mathrm{ii}, \ldots, \mathrm{vi}$ corresponds in order to the six examples illustrated in Figs. 1 and 2. The value of $K(*, r)$ is the modified directed divergence computed using the $*$-th set of empirical data and the $r$-th prototypic model (note that $K(*, r) = V^r$). The data sequence in $*$ is classified as an example of prototype $P^u$, or equivalently as an indication that hypothesis $H_u$ is true, if the minimum of $K(*, r)$, $r = 1, \ldots, 6$, is $K(*, u)$. The minimum value of the modified directed divergence appears along the matrix diagonal, indicating that the samples are each correctly classified. An indication of how typical each sample was of the prototypic disorder is available by comparing the matrix diagonal entry with $H_*(X)$.

Transforming the sequence of heart-beat intervals into a sequence of Markov chain symbols discards some of the information contained in the sequence of numerical values of the heart-beat intervals for the advantage of simplifying the representation of that sequence and allowing for a simple pattern recognition procedure. Each prototypic cardiac disorder is represented by the nine numbers of the prototypic transition matrix. The classification of some data set $*$ is accomplished by computing Eq. (4) for each different prototypic matrix $P^r$, $r = 1, 2, \ldots$, and selecting that value of $r$ for which $K(*, r)$ is a minimum. The computations are so simple that they can be performed in a few milliseconds on a small digital computer. This fact can permit on-line (between the heart-beat intervals) automatic arrhythmia classification suitable for cardiac postoperative attention.

An explanatory note is in order. A total of 35 cases were available for this test. The prototypic model of a disorder (a transition matrix) was computed as the average of the empirical transition matrices computed for each available sample of that disorder.
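The prototype construction and the minimum-$K(*, r)$ rule described above can be sketched end to end in Python. Everything named here is hypothetical (the function names, the class labels, the training strings); rows with no observed transitions are filled uniformly and prototype probabilities are floored at 0.001 before taking logarithms, and both choices are assumptions of this sketch, not details confirmed by the text.

```python
import math
from collections import Counter

STATES = "SRL"


def empirical_matrix(symbols):
    """Empirical transition matrix of one S-R-L symbol sequence."""
    counts = Counter(zip(symbols, symbols[1:]))
    matrix = {}
    for j in STATES:
        total = sum(counts[(j, k)] for k in STATES)
        # Assumption: an unobserved state gets a uniform row.
        matrix[j] = {
            k: counts[(j, k)] / total if total else 1.0 / len(STATES)
            for k in STATES
        }
    return matrix


def average_matrices(matrices):
    """Prototype = entry-wise average of the empirical matrices."""
    return {
        j: {k: sum(m[j][k] for m in matrices) / len(matrices) for k in STATES}
        for j in STATES
    }


def k_star(symbols, prototype, floor=1e-3):
    """Modified directed divergence of a sequence against one prototype.

    Uses pi_j* P_jk* = N_jk / (n - 1); `floor` (an assumption) avoids log(0).
    """
    counts = Counter(zip(symbols, symbols[1:]))
    total = len(symbols) - 1
    return -sum(
        n_jk / total * math.log(max(prototype[j][k], floor))
        for (j, k), n_jk in counts.items()
    )


def classify(symbols, prototypes):
    """Return the label with the smallest divergence, plus all scores."""
    scores = {label: k_star(symbols, P) for label, P in prototypes.items()}
    return min(scores, key=scores.get), scores


# Hypothetical training sequences for two illustrative classes.
training = {
    "regular": ["RRRRRRRRSRRRRRRRLRRR", "RRRRRRRRRRRRSRRRRRRR"],
    "bigeminy": ["SLSLSLSLSLSLSLSLSLSL", "LSLSLSLSLSRLSLSLSLSL"],
}
prototypes = {
    label: average_matrices([empirical_matrix(s) for s in seqs])
    for label, seqs in training.items()
}

label, scores = classify("RRRRRRSRRRRRRRRR", prototypes)
print(label, {name: round(value, 3) for name, value in scores.items()})
```

Each prototype is stored as nine numbers and each score is a short weighted sum over at most nine transition counts, which is why the computation is light enough to run between heart beats.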
The same cases were subsequently used to test the classification, and error-free classification performance was observed. Therefore, we demonstrated that, with respect to our classification procedure, each of the available examples of a particular disorder was sufficiently like the other available examples of that disorder and sufficiently unlike any other disorder to preserve the original partitioning of the 35 original cases into six disjoint categories. The results obtained, while highly encouraging, must be considered as only a preliminary suggestion of the capabilities of our classification procedure.

TABLE 2
MODIFIED DIRECTED DIVERGENCES COMPUTED ON PROTOTYPIC MODELS P5 AND P6

          K( :5)   K( :6)
K(5: )     .016     .148
K(6: )    1.170     .502

TABLE 3

           K( :1)   K( :2)   K( :3)   K( :4)   K( :5)   K( :6)   H(X)
K(i: )     1.140    7.040    2.248    3.669    6.434    5.597    1.081
K(ii: )    1.184    0.781    1.120    2.134    5.941    5.300    0.740
K(iii: )   1.164    1.055    0.526    6.275    6.???    6.702    0.430
K(iv: )    1.180    1.349    0.623    0.567    6.909    6.908    0.315
K(v: )     0.737    0.548    2.048    6.908    0.003    0.011    0.002
K(vi: )    0.743    0.575    2.090    6.908    0.142    0.057    0.054

The authors express their appreciation to Dr. Alfred Spivack for providing the electrocardiographic data. In particular, Will Gersch thanks Professor T. Cover of Stanford University, Departments of Statistics and Electrical Engineering, for his enlightening discussions pertinent to the material presented in this article.