Medical Image Analysis 14 (2010) 582–593
Contents lists available at ScienceDirect
Medical Image Analysis journal homepage: www.elsevier.com/locate/media
Model driven quantification of left ventricular function from sparse single-beat 3D echocardiography Meng Ma a, Marijn van Stralen b, Johan H.C. Reiber a, Johan G. Bosch b, Boudewijn P.F. Lelieveldt a,c,* a
Div. of Image Processing, Dept. of Radiology, Leiden University Medical Center, Leiden, The Netherlands Thoraxcenter, Erasmus MC, Rotterdam, The Netherlands c Department of Mediamatics, Delft University of Technology, Delft, The Netherlands b
a r t i c l e
i n f o
Article history: Received 18 July 2008 Received in revised form 14 April 2010 Accepted 23 April 2010 Available online 6 May 2010 Keywords: Segmentation 3D Ultrasound Cardiovascular Left ventricle quantification
a b s t r a c t This paper presents a novel model based segmentation technique for quantification of left ventricular (LV) function from sparse single-beat 3D echocardiographic data acquired with a fast rotating ultrasound (FRU) transducer. This transducer captures cardiac anatomy in a sparse set of radially sampled, curved cross-sections within a single cardiac cycle. The method employs a 3D Active Shape Model of the left ventricle (LV) in combination with local appearance models as prior knowledge to steer the segmentation. A set of local appearance patches generate the model update points for fitting the model to the LV in the curved FRU cross-sections. Updates are then propagated over the dense 3D model mesh to overcome correspondence problems due to the data sparsity, whereas the 3D Active Shape Model serves to retain the plausibility of the generated shape. Leave-one-out cross-validation was carried out on single-beat FRU data from 28 patients suffering from various cardiac pathologies. Detection succeeded in 24 cases, and failed in 4 cases due to large dropouts in echo signal. For the successful 24 cases, detection yielded Point to Point errors of 3.1 ± 1.1 mm, Point to Surface errors of 1.7 ± 0.9 mm and an EF error of 7.3 ± 4.9%. Comparison of fitting on single-beat versus denser multi-beat data showed a similar performance for both types of data irrespective of frame angles of the intersections. Robustness tests with respect to different model initializations showed acceptable performance for initial positions within a range of 26 mm for displacement and 12° for orientation. Furthermore, a comparison study between the proposed method and global LV function measured from MR studies of the same patients showed an underestimation of volumes estimated from echocardiographic data compared to MR derived volumes, similar to other results reported in literature. All experiments demonstrate that the proposed method combines robustness with respect to initialization with an acceptable accuracy, while using sparse single-beat FRU data. Ó 2010 Elsevier B.V. All rights reserved.
1. Introduction In the past decades, echocardiography has evolved into an important clinical modality for detection of cardiac pathologies. With the advent of matrix transducers, 3D echocardiography has become available, and has drawn much attention from clinical practice. This is due to its non-invasive, patient friendly acquisition and its technical capability of generating high temporal resolution images with favorable viewing angles at low costs. With current matrix transducers, 3D cardiac anatomy can be imaged with high spatial resolution, and by merging multiple consecutive cardiac cycles a sufficiently high temporal resolution can be accomplished in near real-time. Typically, a minimum of four consecutive cardiac
* Corresponding author. Contact address: Div. of Image Processing, Dept. of Radiology, Leiden University Medical Center, Leiden, The Netherlands. Tel.: +31 715261117; fax: +31 715266801. E-mail address:
[email protected] (B.P.F. Lelieveldt). 1361-8415/$ - see front matter Ó 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.media.2010.04.004
cycles need to be merged to cover the left ventricle entirely over the full cardiac cycle. Apart from matrix transducer imaging, fast rotating ultrasound (FRU) shows some unique characteristics that lead to faster image acquisition with higher in-plane resolution. The FRU transducer (Fig. 1a) was introduced by Djoa et al. (2000) and optimized by Voormolen et al. (2006). It consists of a conventional linear phased array for 2D imaging, which is rotated internally at 240–480 rpm and acquires 2D apical images at about 100 frames per second. Each individual frame contains about 80–100 ultrasound lines within a fan-shaped scan sector. The acquired 1D image lines of 400–500 pixels (resolution 0.3 mm/pixel) are interpolated by a polar scan conversion to 2D image frames of 600 400 pixels, which were then sorted according to cardiac phase based on the simultaneously recorded ECG signals. By combining the high spatial and temporal resolution of 2D echocardiography with the high rotation speed, a dynamic 3D volume can be covered entirely in short time. A distinct feature of FRU
M. Ma et al. / Medical Image Analysis 14 (2010) 582–593
is that it does not require multiple cardiac cycles to image the 3D left ventricular geometry entirely, as most matrix array transducers do. Instead, the FRU transducer is able to generate 4–8 (depending on the heart rate) 2D images for each of 16 cardiac phases within a single cardiac cycle, allowing accurate quantification of LV volume over the full cycle (Voormolen et al., 2007). This yields a higher planar resolution, compared to 3D matrix transducers. Due to the high rotation speed, the FRU acquires curved crosssections as shown in Fig. 1b. Although nowadays 3D echocardiography devices usually provide excellent visualization and navigation tools to help the physicians to fully observe the 3D images, quantitative measurements of the cardiac function (wall thickening/thinning, wall motion pattern, LV blood pool volume and ejection fraction (EF)) are necessary for accurate diagnosis and disease staging. To perform these measurements, accurate segmentation of the LV is required, preferably generated automatically to decrease the manual contouring workload and to reduce observer variability. As one of the driving application areas of medical ultrasound, 2D echocardiography has always been an active research field for automatic segmentation and tracking of the left ventricle and therefore the literature on these methods is quite extensive as listed in Hammoude (1998) and Bosch (2008). Attention has been mostly given to tracking of the motion of endocardium to estimate the left ventricular volumes and henceforth the ejection fraction, and to estimate regional wall motion for the diagnosis and assessment of ischemic heart disease. Early work focused on 2D acquisitions, and implicitly assumed that the principal component of motion is in the plane of acquisition slice. Also, the quality of the segmentation heavily relies on the quality of the data, which varies depending on the view due to the anisotropy of ultrasound image acquisition, and due to artifacts such as shadowing from the ribs and attenuation. Also for 3D echocardiography, much work has been published recently on segmentation and quantification methods for volumetric 3D echocardiographic data as surveyed in Noble and Boukerroui (2006). For densely sampled 3D echocardiographic volume data, early work on 3D endocardium detection by Coppini and Poli (1995) focused on the segmentation and reconstruction of the left ventricle from freehand ultrasound data where data was acquired at four fixed angles with respect to the four-chamber view. In that work, the boundary was detected using a Laplacian of Gaussian edge operator and a trained neural network, followed by the fitting of an elastic surface model to the edges. Angelini et al. (2001) considered segmentation of 3D echocardiography data from a Volumetrics system (Smith and Von Ramm, 1991). They used a 2D balloon deformable model driven by intensity gradient based force to fit to the data which was denoised by a wavelet analysis. Song et al. (2002) developed a segmentation method that fit the 3D model surface to the image data with greatest posterior probability in a Bayesian framework. Corsi et al. (2002) described a level set method which was used on RT3D echocardiography from a Volu-
583
metrics (Smith and Von Ramm, 1991) while Sarti et al. (2002) presented a different level set method that is capable of tackling missing boundaries. This work was later extended by Mikula et al. (2004) and Corsaro et al. (2006). Hansegard et al. (2007a) proposed a real-time solution using Active Shape Models. The method is based on a sequential state estimation method using an extended Kalman filter to predict and update ASM parameters. The tracking algorithm was extended from their prior work by Orderud (2006). As for sparsely sampled data such as acquired with the FRU transducer, the typical solution is to reconstruct the sparse data to volumetric data as Blancher et al. (2004). Montagnat et al. (2003) developed a 3D+T model based segmentation approach for rotational 3D echocardiography. The method employed a deformable model to fit to the filtered data where the external forces are provided by a 3D gradient operator. With respect to sparse FRU data, Bosch et al. (2006) has proposed a voxel interpolation method to reconstruct a dense 3D volume from cross-sections in multiple heart beats. Recently, van Stralen et al. (2005) described a semi-automatic border detection solution performed directly on FRU cross-sections. Hansegard et al. (2007b) proposed the use of coupled, piece-wise 2D constrained Active Appearance Models (AAM) to segment echocardiographic data; a similar approach has been proposed by Üzümcü et al. (2005), who developed coupled-view Active Appearance Models for combined segmentation of cardiac MR short- and long axis views. For this same purpose, van Assen et al. presented a 3D Active Shape Model for use in sparsely sampled cardiac MR data that uses a fuzzy C-means clustering algorithm on the edge profiles to classify intensities in MR as either blood, myocardium or air; the transitions between tissues then serve as update points for fitting an Active Shape Model. In the abovementioned studies on echocardiographic data, it is noticed that current methods designed for sparsely sampled data can be categorized into two types: (1) segmentation on reconstructed dense (volumetric) data; such data is usually compounded over multiple cardiac cycles and (2) segmentation on individual 2D slices gathered from multiple cardiac cycles, and compounded into a 3D contour mesh. However, automatic 3D quantification methods that work on sparsely sampled 3D, single-beat FRU data have not been reported yet. The main goal of this work is the development of a model driven method to segment the 3D LV endocardial boundary from a single heartbeat based on sparse, FRU curved cross-sections. As a result, quantification of cardiac function can be performed without necessarily compounding images from multiple heartbeats to reconstruct a dense 3D volume. The proposed method was adapted from the work of van Assen et al. (2006), who described a 3D Sparse Data Active Shape Model (SPASM) for cardiac MR segmentation. In MR however, clear intensity differences between the tissue classes enable a model update mechanism based on classification of tissue intensities, which as such is not sufficiently
Fig. 1. (a) The fast rotating ultrasound transducer. (b) The acquired curved images.
584
M. Ma et al. / Medical Image Analysis 14 (2010) 582–593
robust for application in echocardiographic data due to its noise characteristics, dropouts and less prominent separation of tissue intensities. To enable robust application in sparse, curved echocardiographic data, we present two methodological innovations: a novel model update generation scheme based on dynamic programming with locally selective appearance models assigned to the 3D model, and adjustment to sparsely sampled curved crosssections. To the best of our knowledge, this also represents the first 3D quantification method of global cardiac function from data from a single heartbeat. A pilot study on some aspects of the methods in this paper was reported in Ma et al. (2009). 2. Methods 2.1. Method overview Statistical shape models have demonstrated to be a powerful tool for segmentation of different medical image analysis tasks, both in 2D and in 3D. In this study, to accomplish single-beat quantification in a rotating probe, the amount of data is limited and the images are acquired from curved cross-sections. Therefore, apart from constraining the shape to statistically realistic variations, the model also serves as a population based shape reconstruction in areas of data sparsity. In this study, we construct shape models and local intensity models from a training set as shape and intensity priors. The matching of the LV model is an iterative process driven by update points detected in the 2D images using pretrained locally selective appearance patches, but regularized by shape constraints imposed from the 3D shape model, as shown in Fig. 2. First, the average shape model is initially positioned in the image data, and the cross-sections between the model and the curved image planes are computed. In each such plane, the cross-section (contour) is used to define a search space for candidate LV endocardial border points. These are detected by maximizing the similarity between the intensity information of the candidate position and the corresponding set of trained intensity patches. In addition, dynamic programming is employed to constrain the spatial variation of individual landmarks along the 2D contour. These updated contour points are then mapped back to 3D space and used as sources to propagate the update throughout
the dense model to fill void regions of the model surface that are not covered by the FRU cross-sections. Subsequently, the 3D Active Shape Model (ASM) is used to constrain the model to statistically plausible shapes. 2.2. Model construction From an expert annotated subset of 2D contours in the FRU cross-sections, a full set of planar contours is generated using the semi-automatic method of van Stralen et al. (2005) with manual adjustments to achieve complete expert acceptance. This contour set is used as training data and considered the ground truth for the evaluations. Based on these planar contours, 3D LV meshes of different cardiac phases are generated as described in van Stralen et al. (2007). Point correspondence is achieved by sampling each training sample in an LV specific coordinate frame shown as Fig. 3. First, the mitral valve plane (MVP) and the left ventricular long axis (LAX) are reconstructed by mapping the contours from the 2D curved planes into the 3D coordinate space. The MVP is defined as a best-fit plane through the mitral valve points; the LAX is defined as the straight line through the center of the MVP and the center of gravity (CoG) of the apical region of the LV. The azimuthal direction is defined by the position of the center of the aortic valve. Then, the mesh points are determined from the interpolated LV surface by a regular spherical sampling for the apical region (top 25% of LAX) and a regular cylindrical sampling of the non-apical LV section. The LV shape model is constructed using Active Shape Modeling, as introduced by Cootes et al. (1992, 1995). With the point correspondence being established from the previous step, shapes are parameterized as shape vectors and pose and scale differences were eliminated using Procrustes alignment. After calculating the mean shape and shape covariance matrix as described in Cootes et al. (1992, 1995), a principal component analysis (PCA) is applied to compute the eigenvectors and eigenvalues of the training set. Depending on the amount of variation in the training set represented by the model, the eigenvectors /i corresponding to the t largest eigenvalues are retained, represented by matrix U ¼ ð/1 j/2 j . . . j/t Þ. From this matrix, a training shape X can be approximated by
Fig. 2. Overview of the iterative model matching. (a) The matching is initialized with the average model. (b) The model mesh is intersected with the curved cross-sections. (c) Updates are generated on the planar images using the pre-trained local intensity patches and dynamic programming. (d) Planar updates are mapped to 3D space and propagated throughout the dense model. (e) The model is constrained to a statistically plausible shape and then used as the initial model for the next iteration.
585
M. Ma et al. / Medical Image Analysis 14 (2010) 582–593
Fig. 3. (a) 3D LV shape construction from planar contours. (b) LV surface sampling to achieve point correspondence. (Adapted from van Stralen et al. (2007).)
þ Ub; XX
ð1Þ
with b a t-dimensional vector containing the model parameters
b ¼ UT ðX XÞ:
ð2Þ
In this study, separate shape models for end-diastolic (ED) and end-systolic (ES) phases are built to keep shape variability due to cardiac contraction separated from inter-individual variations. Apart from the shape model, local intensity models are constructed to generate update points during the model matching. To accomplish this, the 3D model surface is partitioned into a fixed number of regions. Landmarks in each region presumably share similar intensity characteristics and gradient direction. Therefore, 2D image patches are sampled around contours that cross the same 3D surface region. These patches are collected over the whole training set for each sector, and a local average appearance patch is computed along with the covariance for each surface section, as shown in Fig. 4. The LV surface segments division for intensity patch training/matching was configured empirically to capture the heterogeneity in intensities over the surface in a limited number of segments. 2.3. Model matching From the initial model position, the intersection between the 3D model and each individual curved image cross-section is calculated to form a point set for each cross-section. This point set is then re-arranged to a continuous contour with evenly distributed landmarks. For each landmark on this contour, a scanline perpendicular to the contour defines the search range. The trained image patch in the corresponding 3D surface partition is then compared to the 2D FRU patches, which are sampled for all the points on the scanlines. Fig. 5 illustrates the pre-trained local appearance patches for all the model surface regions and the scanline patches to be collected on candidate locations. Given the trained mean patch and the covariance of the training set, the Mahalanobis distances between the intensity model patch and scanline patch are computed and stored in a cost matrix.
Fig. 4. Local average patches are trained for each model surface region.
Using dynamic programming (DP) proposed by Dijkstra (1959), an optimal continuous path (contour) through this cost matrix is calculated, which is subsequently mapped back to the 3D space as a set of candidate points for the 3D ASM. The cost function is defined as follows:
TðpÞ ¼
R X r¼1
M r;kðrÞ þ
R1 X
C r;kðrÞ
ð3Þ
r¼2
where T(p) is the total cost of a path p, r is the scanline number, k the position on the scanline, Mr,k is the Mahalanobis distance in each node between the intensity model and the target image patch and Cr,k is the sidestep cost which is defined as
C r;k ¼ wjkðrÞ kðr1Þ j
M max M min K 1
ð4Þ
where w is the weight factor, Mmax and Mmin are the maximum and minimum Mahalanobis distance in the current scanline and K is the number of candidate positions on each individual scanline.
586
M. Ma et al. / Medical Image Analysis 14 (2010) 582–593
Fig. 5. Small image patches are collected from multiple candidate positions along the scan line perpendicular to the contour. The Mahalanobis distances to the corresponding local average patch are stored in a cost matrix.
The update points on the model surface are propagated over the entire model surface to fill in regions without updates using a Gaussian propagation mechanism, as proposed in van Assen et al. (2006). Each source is convolved with a Gaussian of width rp , where the magnitude of an update from a single source can be defined as
-ðk; xÞ ¼
8 <
kkxk2 2rp2
if kk xk 6 v : 0 if kk xk > v e
ð5Þ
where -ðk; xÞ is the update magnitude for the receiving node k, x is the source node, kk xk is the geodesic distance between these nodes and v is a threshold to stop propagation and was set to 3rp . In case the receiving node falls in the updating range of multiple sources, the update is the weighted average of the source nodes where the weighting factor is dependent of the geodesic distance from the receiving node to the source nodes.
-ðkÞ ¼
n P i¼1
e
kkxi k
-ðk; xi Þ ; if kk xi k 6 v
ð6Þ
where e is a normalization constant that was set so that the sum of all the weighting factors e/(||k xi||) is equal to 1. To overcome the considerable pose and shape differences between the mean model and the target shape, especially in the start of the matching, the update procedure contains two stages in which the mechanism to determine the direction of the update vectors differs. Initially, if the receiving node is only influenced by a single source node, the direction of this receiving node is set to be parallel to the source node. In case of multiple source nodes, the direction of the receiving node is set to be parallel to the direction of the vector sum of the source nodes. This accommodates for large updates in position and orientation in the first few iterations. In the beginning of each iteration step, an evaluation that calculates the average node displacement is performed. Once the value drops below a threshold d, the update procedure enters to a refinement stage (second stage). In this stage, the receiving nodes always update along the direction perpendicular to the model surface. This better accommodates for shape adaption of the model to the target
image data. The actual direction of the source node(s) does not affect the update of the receiving node, while the magnitude of receiving node still follows the same mechanism as Eq. (5). The model update point configuration does not necessarily maintain the shape of the LV wall. Therefore, the parameters of the shape model are constrained to a range to ensure that the generated shapes are statistically plausible. In this study, the shapes are constrained to ±3 standard deviations along each principal axis. In each iterative step, the generated statistically plausible shape serves as input for the next iteration. 3. Experimental setup 3.1. Data Apical FRU sequences were acquired from 28 patients suffering from different degrees of myocardial infarction and ischemic cardiomyopathy. Typically, a patient is scanned by FRU for several cardiac cycles with simultaneously recorded ECG signals. In this study, for each patient, a subset of the FRU sequence was selected which contains only a single cardiac cycle that starts from the ED phase to the next ED phase. To measure the clinical parameters such as LV volume of ED and ES phase and consequently the EF, the segmentation of ED and ES is of importance. The whole cardiac cycle was divided into 16 cardiac phases. Using the simultaneously recorded ECG signals, image frames at ED were selected. The set of ES frames was defined as the group containing the frame before the mitral valve opens, and this frame was selected by visual inspection. Based on the parameter settings of the FRU transducer, the transformation from 2D image slice to 3D coordinates was accomplished. Then the image frames from ED and ES were placed to their original locations in 3D coordinates respectively as shown in Fig. 6. Successive segmentation and validation experiments were performed on these sparsely sampled 3D slices collected from a single cardiac cycle. As a comparison study, cardiac MR sequences were acquired for 10 of the 28 patients. To determine the volumes of the LV in these MR data, a semi-automatic segmentation tool (MRI-MASS (van der Geest et al., 1997)) was used.
M. Ma et al. / Medical Image Analysis 14 (2010) 582–593
587
Fig. 6. Single-beat sparse data on a fast rotating ultrasound sequence. The left part of the figure shows a sequence of images that starts from the ED phase to the ES phase within the same cardiac cycle. The whole cardiac cycle was divided into 16 phases. According to the simultaneously recorded ECG signals, image frames from ED and ES are selected. Based on the rotation angle information of the FRU transducer, the transformation from 2D image slice to 3D coordinates is accomplished. Then the image frames from ED and ES are placed to their original locations in 3D coordinates, respectively, as shown in the right part of the figure.
3.2. Validation criteria To test the reliability and robustness of the proposed segmentation approach, we performed a leave-one-out cross-validation on single-beat FRU data from 28 patients. During validation, images of the ED and ES phases within a single cardiac cycle were selected according to the simultaneously recorded ECG signals. ASM and propagation parameters were adopted from Ordas et al. (2005), who performed a grid-based optimization approach on a set of parameters. Image patches were collected from 27 patients. Per patient, 100 image patches for each sector on the model surface were used to train the local appearance patch models. Patch sizes of 45 45 were used for the training of the intensity templates. The patches were then sub-sampled to reduce noise and were also normalized to average ± standard deviation of 0 ± 1 to minimize global inter-patient intensity difference. Table 1 lists several important parameters involved in the experiments. Segmentation results were compared to semi-manually generated segmentations which serve as ground truth, van Stralen et al. (2005). These semimanual contours were generated by two independent observers
Table 1 Parameters used in the experiments. Shape modeling Number of vertices of the 3D model mesh Number of landmarks on a single 2D contour
901 39
Intensity patch generation Original patch size Sub-sampled patch size Number of levels on LV surface Number of segments on each level Total number of segments
45 45 99 5 (including apex) 6 25
ASM Allowed variation per mode Propagation Gaussian kernel width Refinement stage threshold d
±3SD 8 mm 4 mm
and afterwards manually corrected. With respect to LV volume, the ground truth contours exhibit a good inter-observer correlation of 0.943 as well as a good correlation with MR data of 0.936, as reported in van Stralen et al. (2005). To study the segmentation accuracy, Point to Surface (P2S) and Point to Point (P2P) error were compared to the ground truth delineation while volume and EF errors against ground truth were measured to evaluate the overall performance. The model was first initialized under ideal conditions, using the center of gravity and orientation of the manually drawn contour mesh, whereas the model shape was initialized as the average shape. The resulting segmentation errors were then compared to the errors in the initial model state to reveal the significance of the contour movement and the improvement towards correct locations. To detect a possible non-symmetrical distribution of the validation results due to outliers, the median of each measured index was also provided in addition to the mean and standard deviation. To evaluate the efficiency of single-beat detection, we also performed a comparison of segmentation results from application of the method to the first and the second cardiac cycle to see if the results were consistent when using different cross-sectional frames. Furthermore, the results were also compared to the segmentation results from using image frames from two consecutive cardiac cycles to test whether combining multiple frames from different cycles for the same cardiac phase would increase the accuracy of the segmentation process. To examine the method’s sensitivity with respect to initialization of the model, we carried out convergence tests that compare the final results of the method under different initial conditions with the results from the optimal initialization. To accomplish that, a range of model pose initializations was tested. The initial model was intentionally misplaced along horizontal, vertical and rotational directions to test the tolerance to initialization errors. To test the robustness with respect to shift errors on the 2D image slice, the model was misplaced at multiple initial starting points on each slice and the final performance was monitored. As for the robustness with respect to orientation errors, the initial model was
588
M. Ma et al. / Medical Image Analysis 14 (2010) 582–593
rotated around the long-axis and the final P2P errors were collected. We have limited ourselves to the P2P error because it represents a more strict (worst-case) accuracy than P2S errors for evaluating convergence behavior in the presence of errors in model orientation. To measure the accuracy under less favorable initialization, the monitored final P2P errors were compared to the errors from optimal initialization. Those cases where the monitored errors were less than 150% of the errors from optimal situations were considered successful segmentations. Finally, a subset of the data was compared to cardiac MR studies of the same patients. Ten out of 28 patients also underwent cardiac MR scan and the LV volumes on these patients were determined using a semi-automatic segmentation tool (MRI-MASS) by an independent observer. The results of these MR scans were compared to the FRU ground truth segmentation and the detected segmentation results using the method proposed in this study. Paired t-tests were performed to assess the statistical significance of differences between the MR results and FRU ground truth, between the FRU ground truth and the automated segmentation, and between MR results and the FRU automated segmentation.
4. Results 4.1. Results with favorable initialization First, the validation was performed under ideal initialization conditions: the CoG and the orientation of the initial model were
aligned to the ground truth shape while the model shape was initialized using the average shape. Fig. 7 provides examples of detection results in the curved 2D cross-sections and Fig. 8 shows the 3D segmentation results, with the point-to-point error color-coded on the surface. Fig. 9 shows the P2S (average of ED and ES) error comparison between before the segmentation (initialization errors) and after the segmentation process, where a significant improvement of segmentation results was observed due to the model fitting. Fig. 10 shows the histograms of the segmentation error distribution. Visual inspection of the segmentation results revealed four obvious matching failures due to severe acoustic dropout in a large part of LV wall as shown in two examples in Fig. 11. These four studies were excluded from quantitative analysis of border positioning and volume errors. Table 2 shows the overall segmentation results over the 24 successful matches.
4.2. Single-beat, second beat and two-beat comparisons Table 3 shows the comparison results of average error of ED and ES from first beat, second beat and two-beats. Both in the second beat validation and multi-beat validation, detection failures for the same four cases were observed and were excluded in this comparison. Using a different set of cross-sectional slices from the second cardiac cycle yielded a similar performance as for the first cycle; paired samples t-tests revealed no statistically significant differences (P > 0.05) in comparing the border positioning errors (both P2P and P2S) between matching results on beat 1 and beat 2, and for matching results based on pooled
Fig. 7. Examples of automatically derived 2D contours (intersections of the curved cross-section and the 3D model) from four different patients.
Fig. 8. Examples of 3D segmentations with P2P errors color-coded on the surface (collected from six different patients). The brightness indicates the magnitude of the P2P errors.
M. Ma et al. / Medical Image Analysis 14 (2010) 582–593
589
Fig. 9. Error comparison (before model matching (in red) and after model matching (in blue)). The horizontal axis represents the patients and the vertical axis is the measured error against ground truth segmentation. The upper graph is the P2P error and the lower one is the P2S error. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 10. Error distribution histograms for N = 28 patients. Left: P2P error, right: P2S errors.
image data. The unsigned EF difference between beat 1 and beat 2 is 0.037 ± 0.023, which is much smaller than the error of automatic versus manual contours, indicating robustness with respect to frame selection.
4.3. Convergence tests Fig. 12 shows an example of the initial positions and indicates the convergence envelope that leads to an acceptable range of error.
Fig. 11. Two examples of detection failure. Due to dropout of a large area (almost a complete side of the myocardium is missing), the segmentation algorithm was unable to find the proper position of the wall. (Ground truth contours in red, detected contours in yellow.) (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
590
M. Ma et al. / Medical Image Analysis 14 (2010) 582–593
Table 2 Cross-validation results against ground truth segmentation, after exclusion of four matching failures (N = 24). Avg. ± St. Dev.
P2P (mm) P2S (mm) Volume (%) EF (%)
ED
ES
3.07 ± 1.12 1.67 ± 0.94 6.13 ± 3.22 7.25 ± 4.88
3.14 ± 1.09 1.72 ± 0.95 6.65 ± 3.82
tion failures) by keeping the displacement along one axis constant and varying along the second axis. Here the step size of the displacement is 3 mm for both horizontal and vertical directions. Fig. 12d plots the P2P errors of the ED phase under different initializations with orientation interval of 2°. These figures indicate that within a range (highlighted with dashed arrows) of 26 mm of single horizontal and vertical displacement and 12° around the optimal initial position, the model still converges to similar P2P errors. 4.4. Comparison to cardiac MR segmentation
Table 3 Single-beat, second beat and multi-beat (2 beats) comparison (N = 24). P2P, P2S and volume errors are pooled over ED and ES phases.
First beat Second beat 2 beats
P2P error (mm)
P2S error (mm)
Volume (%)
EF (%)
3.1 ± 1.1 3.0 ± 1.2 2.9 ± 1.0
1.7 ± 1.2 1.7 ± 1.0 1.6 ± 0.9
6.4 ± 3.5 6.5 ± 3.1 6.2 ± 3.2
7.3 ± 4.9 7.4 ± 4.6 7.1 ± 4.4
Numerical results show that, on average, the overall performance is still acceptable (P2P and P2S error increase was less than 50%) when the model is initialized within the distance of 13 mm from the correct location. Fig. 12a–c plot the final P2P error (exclusive of detec-
For 10 patients, we acquired cardiac MR scans and the LV volumes were determined using semi-automatic tools. These volumes were compared to our automatic segmentation results, shown in Table 4. Table 5 gives the results of t-tests comparing differences between MR and FRU (ground truth) volumes, the FRU (ground truth) and FRU (detected) volumes, and the MR and FRU (detected) volumes. 5. Discussion In this paper, a segmentation method that combines Active Shape Models and local appearance models is presented that is
Fig. 12. Robustness tests w.r.t. model initialization for 24 patients. (a) Dots indicate the tested locations of CoGs for the initial model. The high-lighted dots represent the locations that give acceptable performance (P2P and P2S errors are no greater than 150% of the errors with favorable initialization). (b) The P2P errors for varying horizontal displacements. (c) The P2P errors for varying vertical displacements. (d) The P2P errors for varying orientation displacements in degrees.
591
M. Ma et al. / Medical Image Analysis 14 (2010) 582–593 Table 4 Volume and EF comparison to cardiac MR. Patients
1 2 3 4 5 6 7 8 9 10
MR
FRU (ground truth)
FRU (detected)
ED vol. (ml)
ES vol. (ml)
EF (%)
ED vol. (ml)
ES vol. (ml)
EF (%)
ED vol. (ml)
ES vol. (ml)
EF (%)
241 127 160 131 131 199 155 233 204 226
144 67 111 62 62 114 100 140 92 116
40 48 30 53 53 43 35 40 55 49
197 133 145 133 112 183 149 190 179 229
136 70 102 101 85 121 111 168 108 165
31 47 30 24 24 34 25 12 39 28
172 142 152 126 138 176 140 209 162 222
121 79 109 92 101 101 107 150 89 139
29 44 28 26 26 43 24 28 45 37
Table 5 Paired t-test results comparing volume measurements from MR and FRU (ground truth) and from FRU (ground truth) and FRU (detected). t-Tests (p-values)
MR–FRU (ground truth) FRU (ground truth)–FRU (detected) MR–FRU (detected)
Average signed difference
ED vol.
ES vol.
EF
ED vol. (ml)
ES vol. (ml)
EF (%)
0.021 0.832 0.057
0.027 0.115 0.216
0.002 0.101 0.003
16 ± 18 1 ± 16 17 ± 24
16 ± 19 8 ± 14 8 ± 19
15 ± 11 4 ± 6 12 ± 9
able to automatically segment 3D left ventricle volume from sparsely sampled single-beat fast rotating ultrasound images. In contrast to earlier work in MR imaging based on tissue classification using intensity differences, the application to sparse, curved echocardiographic imaging planes is achieved by incorporation of local appearance patches, combined with dynamic programming techniques in the ASM fitting to consistently trace the endocardial border of the left ventricle, and a Gaussian propagation kernel to spread the sparse update throughout the 3D model surface. To enable segmentation of single-beat acquisitions, the proposed method is designed to tolerate high data sparsity and non-planar (curved) cross-sections. In the performed validation experiments, four obvious matching failures were observed, which were excluded from the quantitative evaluation of border positioning and volume errors. The average P2P and P2S errors for the remaining 24 data sets amounted to 3.1 mm and 1.7 mm, while the EF errors were 7.3 ± 4.9% (with a minimum and maximum error of 1.1% and 16.9%, respectively). Also, we have observed that the overall performance was slightly better for the ED phase than for ES, mainly due to the high visibility of the papillary muscles in ES. The median of each error indicates that only the EF error has a skewed error distribution, because the mean and median values are evidently different. The error distribution histograms and initial-final error comparison study have further confirmed this observation. This was mainly caused by the fact that EF is very sensitive to the simultaneous overestimation of ED volume and underestimation of ES in cases where the lateral walls are ill-defined due to acoustic dropout, which then leads to rather high errors. The evaluation indices, especially the P2S errors of approximately 1.7 ± 0.9 mm, are comparable to other volumetric 3D segmentation methods, which mostly achieve P2S error of between 2.2 mm and 3.7 mm as indicated in Noble and Boukerroui (2006), Hansegard et al. (2007a), van Stralen et al. (2007) and Corsaro et al. (2006)); however, it should be noted that these approaches are based on dense volumetric data, and thus require data acquired over multiple heartbeats. In the context of its intended application, a rapid assessment of global function from single-beat data, we conclude that the proposed method is capable of generating accurate segmentation results and acceptable quantification of LV volumes in
ED and ES, and thus provides useful global clinical measures in 24 out of the 28 tested cases. All four failed cases were caused by a low image quality where severe dropout was observed, which cause large areas of reduced image contrast (in most cases, a whole side of the myocardium is visually missing). Detection failures occurred when multiple frames in the same cardiac cycles contain dropout because a single low-quality frame or sometimes even two such frames with considerable in-frame segmentation error can still be corrected by those better-quality frames with the assistance of the successive propagation and ASM regularization. The single-beat LV measurement brings some advantages over volumetric methods not only in speed but also in the scope of applicability for patients with irregular heartbeats. The effectiveness of the single-beat method was demonstrated in Table 3 where only a slight but not significant improvement was observed using image frames from multiple cycles. Also, we have observed from first beat to second beat comparison that this method demonstrates similar performance when segmenting data from different cardiac cycles, which indicates that this method is stable with respect to image plane orientations. The method’s independence of image sparseness and image plane orientation gives an advantage over other segmentation methods for dense 3D ultrasound modalities such as matrix array transducers which compound data from multiple cycles. The proposed single-beat solution can still be effective in patients with irregular cardiac cycles. Sensitivity tests with respect to initial model placement indicate that correct model initialization is important to the overall performance of the method, yet the proposed method is still capable to converge from an acceptable degree of misplacement from the ideal initialization. Due to the sparsity of the data, we choose to apply an initialization validation scheme that combines translation initialization in multiple 2D slices with rotational offsets in 3D space. Experiments show that, to obtain an overall error of less than 150% of the error from optimal initialization, the method can tolerate 2D shift of 13 mm and orientation misalignment of 6° in each direction. This convergence test shows the practical value of this study considering the fact that manual initialization is typically done by indicating the apex and valvular ring, which in normal circumstances, would not exceed the error tolerance of this
592
M. Ma et al. / Medical Image Analysis 14 (2010) 582–593
study. Though the method provides good robustness to overcome positional misalignment, it is noted that its robustness with respect to the initial model orientation is relatively modest. Determined by the data nature in the form of curved 2D slices, it is difficult to correct orientation misalignment around the LV long axis during the process of segmentation, due to the lack of information in sparsely sampled regions. A subset of the data went through a volume comparison to semi-automatically generated LV volume from corresponding MR data as shown in Table 5. A statistically significant difference was observed (Table 5, p < 0.05) for FRU ground truth data compared to MR data. This difference in volume measurements between MR and echocardiography is a well known effect in the literature (van Stralen et al., 2007; Voormolen and Danilouchkine, 2007). However, this experiment also demonstrated that the difference between the FRU segmentation results generated using the proposed method and the ground-truth FRU delineations is not statistically significant (p > 0.05). From this, we conclude that the difference between the automated FRU segmentations and the MR measurement can be fully explained by the inherent modality differences, and is not caused by the proposed segmentation method. A limitation of the current study is the fact that the training and testing data was semi-manually delineated by a single expert, whereas delineation by multiple experts would enable an analysis of inter-observer variability. Second, as with most other ultrasound segmentation approaches, the performance of this method heavily relies on the quality of the image data. Also, the use of statistical shape models in medical image segmentation has inherent limitations, one of them being the large pathological shape deformations that might not be present in the training set. Small training set size may limit the generalization of the model. However, this study shows that when trained on a patient population of 27 patients suffering from ischemic heart disease, good segmentation performance can be achieved. This may well be explained by the fact that the reported 3D echo model only describes the endocardium, and that typical endocardial motions in hypertrophy and infarct patients are still included in the model (Leung and Bosch, 2008). Also, in literature a number of methods have been reported to overcome the rigidity of the shape model by applying a subsequent shape relaxation in the final segmentation step, e.g. Oost et al. (2006) and Kaus et al. (2004). However, due to the data sparsity we have not opted for a solution like this because the shape rigidity also helps in reconstructing the 3D shape in sparsely covered areas. Adding additional model flexibility may compromise shape reconstruction in these areas. Future work involves the extension of the method towards a fully automatic segmentation/quantification solution that includes an automatic initialization approach. A template matching based initialization method, which can effectively detect scale, orientation and translation of a target object is currently being investigated. In addition, the proposed method can be extended to a tracking scheme that uses the previously detected shape as an initialization of the next phase. Such full-beat studies would allow the evaluation of inter-beat variability and would circumvent the stitching artifacts that hamper the current volumetric acquisition approaches. It also has potential for application in patients suffering from atrial fibrillation or other cardiac arrhythmias. 6. Conclusions In this study, we propose a segmentation method for fast rotating echocardiography. From multi-slice single-beat apical sequences, the method is capable of generating the endocardial surface of the left ventricle for different cardiac phases using pre-
trained shape models and local appearance patches. The main advantage of this work is that it provides a possibility of efficiently quantifying global left ventricle function by measuring LV volume and ejection fraction in single-beat, sparsely sampled echocardiographic data. As such no compounding of data over several cycles is required, and the proposed method enables a rapid assessment of global cardiac function parameters. Four out of 28 cases were visually confirmed as detection failures. Error measurements excluding those detection failures yielded in P2P error of 3.1 mm, P2S error of 1.7 mm and EF error of 7.3%. The single versus multiple beat comparison demonstrated that single-beat application of the method does not suffer from performance loss due to data sparsity and is not sensitive to variations in image plane selection caused sampling differences between multiple cardiac cycles. With respect to sensitivity to initialization, the method proved to be robust to initial position and orientation within convergence intervals of about 26 mm (2 13 mm) for position, and 12 degrees (2 6°) for orientation. This indicates that the method combines robustness with respect to initialization with an acceptable accuracy.
Acknowledgements This research is supported by the Dutch Technology Foundation STW (Grant 06666), applied science division of NWO and the Technology Program of the Ministry of Economic Affairs. The authors also thank Boudewijn Krenning and Marco Voormolen for acquiring the patient data.
References Angelini, E.D., Laine, A.F., Takuma, S., Holmes, J.W., Homma, S., 2001. LV volume quantification via spatiotemporal analysis of real-time 3-D echocardiography. IEEE Trans. Med. Imag. 20 (6), 457–469. Blancher, J., Leger, C., Nguyen, L.D., 2004. Time-varying, 3D echocardiography using a fast-rotating probe. IEEE Trans. Ultrason. Ferroelectr. Freq. Control 51 (5), 634–639. Bosch, J.G., 2008. Echocardiographic digital image processing and approaches to automated border detection. In: Otto, C.M. (Ed.), The Practice of Clinical Echocardiography, third ed. W.B. Saunders, pp. 262–282 (Chapter 12). Bosch, J.G., van Stralen, M., Voormolen, M.M., Krenning, B.J., Lancee, C.T., Reiber, J.H.C., van der Steen, A.F.W., de Jong, N., 2006. Novel spatiotemporal voxel interpolation with multibeat fusion for 3D echocardiography with irregular data distribution. Proc. SPIE Med. Imag. 6147, 61470Q. Cootes, T.F., Cooper, D., Taylor, C.J., Graham, J., 1992. A trainable method of parametric shape description. Image Vis. Comput. 10 (5), 289–294. Cootes, T.F., Taylor, C.J., Cooper, D., Graham, J., 1995. Active shape models – their training and application. Comput. Vis. Image Understanding 61 (1), 38–59. Corsi, C., Saracino, G., Sarti, A., Lamberti, C., 2002. Left ventricular volume estimation for real-time three-dimensional echocardiography. IEEE Trans. Med. Imag. 21 (9), 1202–1208. Corsaro, S., Mikula, K., Sarti, A., Sgallari, F., 2006. Semi-implicit co-volume method in 3-D image segmentation. SIAM J. Sci. Comput. 28 (6), 2248–2265. Coppini, G., Poli, R., 1995. Recovery of the 3-D shape of left ventricle from echocardiographyic images. IEEE Trans. Med. Imag. 14 (2), 301–317. Dijkstra, E.W., 1959. A note on two problems in connexion with graphs. Numer. Math. 1, 269–271. Djoa, K.K., de Jong, N., van Egmond, F.C., Kasprzak, J.D., Vletter, W.B., Lancee, C.T., van der Steen, A.F.W., Bom, N., Roelandt, J.R.T.C., 2000. A fast rotating scanning unit for real-time three-dimensional echo data acquisition. Ultrasound Med. Biol. 26 (5), 863–869. Hammoude, A., 1998. Endocardial border identification in two-dimensional echocardiographic images: review of methods. Comput. Med. Imag. Graph. 22 (3), 181–193. Hansegard, J., Orderud, F., Rabben, S.I., 2007a. Real-time active shape models for segmentation of 3D cardiac ultrasound. CAIP 2007, LNCS 4673, 157–164. Hansegard, J., Urheim, S., Lunde, K., Rabben, S.I., 2007b. Constrained active appearance models for segmentation of triplane echocardiograms. IEEE Trans. Med. Imag. 26 (10), 1391–1400. Kaus, M.R., von Berg, J., Weese, J., Niessen, W., Pekar, V., 2004. Automated segmentation of the left ventricle in cardiac MRI. Med. Image Anal. 8 (3), 245–254. Leung, K.Y.E., Bosch, J.G., 2008. Segmental wall motion classification in echocardiograms using compact shape descriptors. Acad. Radiol. 15 (11), 1416–1424.
M. Ma et al. / Medical Image Analysis 14 (2010) 582–593 Ma, M., van Stralen, M., Reiber, J.H.C., Bosch, J.G., Lelieveldt, B.P.F., 2009. Model driven quantification of left ventricular function from sparse single-beat 3D echocardiography. In: Proceedings of the SPIE Medical Imaging, paper # 725907. Mikula, K., Sarti, A., Sgallari, F., 2004. In: Suri, J. et al. (Eds.), Handbook of Medical Image Analysis: Segmentation and Registration Models. Marcel Dekker Inc., New York. Montagnat, J., Sermesant, M., Delingette, H., Malandain, G., Ayache, N., 2003. Anisotropic filtering for model-based segmentation of 4-D cylindrical echocardiographic images. Pattern Recognit. Lett. 24 (4–5), 815–828. Noble, A.J., Boukerroui, D., 2006. Ultrasound image segmentation: a survey. IEEE Trans. Med. Imag. 25 (8), 987–1010. Oost, E., Koning, G., Sonka, M., Oemrawsingh, P.V., Reiber, J.H.C., Lelieveldt, B.P.F., 2006. Automated contour detection in X-ray left ventricular angiograms using multiview active appearance models and dynamic programming. IEEE Trans. Med. Imag. 25 (9), 1158–1171. Ordas, S.C., van Assen, H.C., Puentes, J., Lelieveldt, B.P.F., Frangi, A.F., 2005. Parametric optimization of a model based segmentation algorithm for cardiac MR image analysis: a grid-computing approach. Stud. Health. Technol. Inform. 112, 146–156. Orderud, F., 2006. A framework for real-time left ventricular tracking in 3D+T echocardiography, using nonlinear deformable contours and Kalman filter based tracking. Comput. Cardiol. 17 (20), 125–128. Sarti, A., Malladi, R., Sethian, J.A., 2002. Subjective surfaces: a geometric model for boundary completion. Int. J. Comput. Vis. 46 (3), 201–221. Smith, S.W., Von Ramm, O.T., 1991. High-speed ultrasound volumetric imaging system–part 1: transducer design and beam steering. IEEE. Trans. Ultrason. Ferroelectr. Freq. Control 38 (2), 100–108. Song, M.Z., Haralick, R.M., Sheehan, F.H., Johnson, R.K., 2002. Integrated surface model optimization for freehand three-dimensional echocardiography. IEEE Trans. Med. Imag. 21 (9), 1077–1090.
593
Üzümcü, M., van der Geest, R.J., Sonka, M., Lamb, H.J., Reiber, J.H.C., Lelieveldt, B.P.F., 2005. Multi-view AAMs for simultaneous segmentation of cardiac 2- and 4chamber long axis MR images. Investig. Radiol. 40 (4), 195–203. van Assen, H.C., Danilouchkine, M.G., Frangi, A.F., Ordas, S., Westenberg, J.J.M., Reiber, J.H.C., Lelieveldt, B.P.F., 2006. SPASM: a 3D-ASM for segmentation of sparse and arbitrarily oriented cardiac MRI data. Med. Image Anal. 10 (2), 286–303. van Stralen, M., Bosch, J.G., Voormolen, M.M., van Burken, G., Krenning, B.J., van Geuns, R.J.M., Angelie, E., van der Geest, R.J., Lancee, C.T., de Jong, N., Reiber, J.H.C., 2005. Semi-automatic border detection method for left ventricular volume estimation in 4D ultrasound data. Proc. SPIE Med. Imag. 5747, 1457– 1467. van Stralen, M., Leung, K.Y.E., Voormolen, M.M., de Jong, N., van der Steen, A.F.W., Reiber, J.H.C., Bosch, J.G., 2007. Automatic segmentation of the left ventricle in 3D echocardiography using Active Appearance Models. In: Proceedings of 2007 IEEE Ultrasonics Symposium, pp. 1480–1483. van der Geest, R.J., Buller, V.G.M., Jansen, E., Lamb, H.J., Baur, L.H.B., van der Wall, E.E., de Roos, A., Reiber, J.H.C., 1997. Comparison between manual and automated analysis of left ventricular volume parameters from short axis MR images. J. Comput. Assist. Tomogr. 21 (5), 756–765. Voormolen, M.M., Krenning, B.J., Lancee, C.T., ten Cate, F.J., Roelandt, J.R.T.C., van der Steen, A.F.W., de Jong, N., 2006. Quantitative harmonic 3D echocardiography with a fast-rotating ultrasound transducer. IEEE Trans. Ultrason. Ferroelectr. Freq. Control 53 (10), 1739–1747. Voormolen, M.M., Krenning, B.J., van Geuns, R.J., Borsboom, J., Lancée, C.T., ten Cate, F.J., Roelandt, J.R., van der Steen, A.F., de Jong, N., 2007. Efficient quantification of the left ventricular volume using 3D echocardiography: the minimal number of equiangular long-axis images for accurate quantification of the left ventricular volume. J. Am. Soc. Echocardiogr. 20 (4), 373–380. Voormolen, M.M., Danilouchkine, M.G., 2007. Aspects of left ventricular volume comparison between 3D echocardiography and MRI. J. Am. Soc. Echocardiogr. 20 (12), 1421–1422.