Reconstruction of volumetric ultrasound panorama based on improved 3D SIFT

Reconstruction of volumetric ultrasound panorama based on improved 3D SIFT

Computerized Medical Imaging and Graphics 33 (2009) 559–566 Contents lists available at ScienceDirect Computerized Medical Imaging and Graphics jour...

892KB Sizes 0 Downloads 43 Views

Computerized Medical Imaging and Graphics 33 (2009) 559–566

Contents lists available at ScienceDirect

Computerized Medical Imaging and Graphics journal homepage: www.elsevier.com/locate/compmedimag

Reconstruction of volumetric ultrasound panorama based on improved 3D SIFT Dong Ni a,∗ , Yim Pan Chui a , Yingge Qu a , Xuan Yang a , Jing Qin a , Tien-Tsin Wong a , Simon S.H. Ho b , Pheng Ann Heng a,c a

Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong SAR, China Union Hospital, Hong Kong SAR, China c Shenzhen Institute of Advanced Integration Technology, Chinese Academy of Sciences/The Chinese University of Hong Kong, China b

a r t i c l e

i n f o

Article history: Received 11 March 2009 Accepted 18 May 2009 Keywords: Ultrasound panorama Ultrasound volume registration 3D SIFT

a b s t r a c t Registration of ultrasound volumes is a key issue for the reconstruction of volumetric ultrasound panorama. In this paper, we propose an improved three-dimensional (3D) scale invariant feature transform (SIFT) algorithm to globally register ultrasound volumes acquired from dedicated ultrasound probe, where local deformations are corrected by block-based warping algorithm. Original SIFT algorithm is extended to 3D and improved by combining the SIFT detector with Rohr3D detector to extract complementary features and applying the diffusion distance algorithm for robust feature comparison. Extensive experiments have been performed on both phantom and clinical data sets to demonstrate the effectiveness and robustness of our approach. © 2009 Elsevier Ltd. All rights reserved.

1. Introduction Panoramic imaging is one of the key technologies used in widening the field of view of medical ultrasound images for clinical diagnosis and measurement. Two-dimensional (2D) ultrasound panorama has been prevalent in routine clinical practices [1], including precise measurement and tracing of the extended and tubular structures in abdomen [2], distance measurements in phantom study [3], detection of muscle tear [4], visualizing the musculoligamentous structures of the neck [5] and the anatomical structures in obstetrics [6]. Over the past few years, 3D ultrasound has been popular. Creation of 3D ultrasound panorama becomes possible and anticipant, which may lead to new clinical applications for ultrasound. Some research work on 3D ultrasound panorama has been conducted accordingly. Gee et al. [7] proposed an algorithm for the alignment of multiple freehand 3D ultrasound sweeps. An intensity based registration method was performed on a single dividing plane between two sweeps instead of the entire overlapping region to improve the registration speed. Poon et al. [8] proposed a system to create a mosaic view from a set of ultrasound volumes acquired from a dedicated 3D probe in order to provide higher imaging resolution in the elevation direction than a freehand probe. Two block-based registration methods were developed to correct the misalignment based6/5/2009 on measurements obtained from position trackers. Wachinger et al. [9] focused on

∗ Corresponding author. E-mail address: [email protected] (D. Ni). 0895-6111/$ – see front matter © 2009 Elsevier Ltd. All rights reserved. doi:10.1016/j.compmedimag.2009.05.006

the global registration strategy among multiple ultrasound volumes. Unfortunately, intensity-based algorithms including sum of squared differences, normalized cross-correlation, mutual information, and correlation-ratio may not perform well on ultrasound data due to the low signal to noise ratio and the existence of shadows, speckles and other noises in ultrasound data. On the other hand, position tracker based registration methods may not always be well-suited to routine clinical use, since the trackers may be affected by metals (for magnetic sensors) or sight occlusions (for optical sensors). In addition to the noisiness of ultrasound images, there are other problems which make the registration of ultrasound volumes difficult. Since the ultrasound probe has to be arbitrarily oriented during the acquisition of multiple volumes, such variation of rotation imposes a great challenge to traditional registration methods in feature matching. Besides, the intensity of the same structure may differ in multiple ultrasound volumes obtained under different scan angles, and this affects the registration accuracy as well. Recently, SIFT algorithm in 2D space has been proposed to detect and describe local features in images and successfully applied to object recognition [10], point tracking [11], panorama creation [12], etc. Chen and Tian [13] utilized SIFT for rapid pre-registration of multi-modality medical images. Moradi et al. [14] applied SIFT and a B-spline based interpolation technique for registering elastically deformed MRI images. Scovanner et al. [15]proposed a 3D SIFT descriptor for video analysis, and recognized actions based on the similarity of their feature representations. Compared with previous methods, SIFT features are invariant to rotation and provide robust feature matching across a substantial range of addition of noise and change in illumination (intensity). These characteristics

560

D. Ni et al. / Computerized Medical Imaging and Graphics 33 (2009) 559–566

make it appropriate for noisy ultrasound volumes acquired from arbitrary orientation. Although SIFT can meet many above-mentioned challenges in registration of ultrasound volumes, deformation of anatomical structures existing in ultrasound images may affect the comparison of SIFT descriptors. The common method for comparing SIFT descriptors is to use Euclidean L2 metric, which is one type of bin-tobin distances between histograms. The assumption of the L2 metric is that the SIFT descriptors are already aligned. This method is sensitive to deformation, intensity change and noise, which always exist in medical ultrasound images. Fortunately Ling and Okada proposed to use cross-bin distances, such as diffusion distance [16] and earth mover’s distance algorithms [17], for robust histogram comparison. All the above motivate us to apply SIFT-based registration to ultrasound data. In this paper, we propose a novel method to create 3D ultrasound panorama by adopting an improved 3D SIFT algorithm to globally register multiple ultrasound volumes and using block-based registration algorithms to correct the local deformations in the overlapping regions. To the best of our knowledge, it is the first work to apply the 3D SIFT method to ultrasound volume registration. We further improve the 3D SIFT algorithm according to characteristics of ultrasound. SIFT feature detector is combined with Rohr3D detector [18] in order to extract complementary features in ultrasound volumes. Due to the tissue deformation in ultrasound images, diffusion distance algorithm [16] is applied for robust and fast comparison of SIFT features. Extensive experiments demonstrate the feasibility of the proposed method. The paper is organized as follows. Implementation details of our method are presented in Section 2. Experiment results are discussed in Section 3. Finally, conclusion and future work are given in Section 4.

ference of Gaussian (DoG3D) detector with Rohr3D detector, and described as 3D SIFT features. Thirdly these features are matched using diffusion distance algorithm for globally registering multiple ultrasound volumes. Finally block-based registration algorithms are used to correct the local deformations in the overlapping regions and multiple ultrasound volumes are stitched to form panorama.

2. Method

2.2. 3D Feature detection and descriptor construction

The overview of the entire method is illustrated in Fig. 1. Multiple ultrasound volumes are acquired from 3D dedicated ultrasound probe in order to provide better imaging resolution in the elevation direction and faster imaging rate. Each volume partially overlaps one or more of the other volumes for the purpose of creating a panorama. We first pre-process the ultrasound data to remove the artifacts in the ultrasound volumes. Then 3D feature points are detected from multiple ultrasound volumes by combining 3D dif-

Two major steps, namely, 3D feature points detection and descriptor construction, are involved.

2.1. Ultrasound pre-processing An inherent characteristic of ultrasound imaging is the presence of shadows and speckles. Since the contact between a dedicated 3D probe and skin is not as good as that using a 2D probe, some of the emitted echoes are attenuated by air resulting in some large dark regions in the volume. These artifacts may greatly affect the detection and matching of feature points. Therefore it is essential to remove them before registration. For speckle reduction, a wellknown technique is to low-pass filter the image, e.g. by a Gaussian filter [19,20]. Since a Gaussian filter has been used in the SIFT algorithm, speckles are reduced effectively. The removal of shadows makes use of the fact that shadow is mainly caused by a loss of signal along the direction of the beam, due to strong acoustic reflection. We first determine a threshold value based on experimental ultrasound scans in advance. The ultrasound image is scanned along the echo direction from the bottom to the transducer surface; along the beam, we label all consecutive pixels with intensity value lower than the predefined threshold as shadows. The search terminates once a value higher than the threshold is found. Large dark regions are segmented and removed using the Hidden Markov Random Field Model and the Expectation-Maximization algorithm [21]. The results of these two steps are combined to generate a ultrasound mask (Fig. 2), which will be used in the process of feature detection.

2.2.1. Detection of 3D feature points In original 2D SIFT implementation [10], difference of Gaussian scale-space (DoG) is used to detect the feature points. This method detects mainly blobs which mean the points and/or regions in the image that are either brighter or darker than the surrounding. How-

Fig. 1. Flow diagram of proposed method.

D. Ni et al. / Computerized Medical Imaging and Graphics 33 (2009) 559–566

561

Fig. 2. (a) Original ultrasound image and (b) ultrasound mask generated from (a).

ever, this method performs poorly in the detection of corners, for example, the crossover of the vessels in ultrasound images. In [22], nine 6/5/2009erential operators for detecting corners are investigated. Operators Rohr3D [18] obtains the best results. In this regard, we extend the DoG detector to 3D (DoG3D) and combine it with the Rohr3D detector in order to extract complementary features in ultrasound volumes. In our DoG3D detector, a DoG scale-space is first built based on the original volume data [10]. Specifically, an input volume I(x, y, z) is first convolved with a 3D Gaussian filter G(x, y, z, k) of different scale, k, to obtain the scale space S(x, y, z, k) i.e., S(x, y, z, k) = G(x, y, z, k) × I(x, y, z).

(1)

Then the DoG volumes is taken as: D(x, y, z, kj ) = S(x, y, z, ki ) − S(x, y, z, kj )

(2)

where ki and kj are Gaussian kernels in the neighboring scale space. Fig. 3(a) shows the construction process. The left column shows the volume pyramid, and each level is created by convolving the original volume with a Gaussian of scale , k, k2  and so on. The right column shows the generated DoG from the difference of neighboring filtered volume on multi-scales. Feature candidates are taken as the local maxima/minima of the DoG scale-space D(x, y, z, ). To achieve this, each sample point is compared to its 80 neighbors, where 26 neighbors in the current volume, and each 27 neighbors in the scale above and below, as shown in Fig. 3(b). A feature candidate is selected only if it is larger than all of its neighbors or smaller than all of them. If the position of the feature candidate is outside the ultrasound mask generated in the previous step, this candidate is eliminated. The next step is to eliminate the false candidates with low contrast, or those poorly

localized along an edge. Edge responses in 3D space can be measured by a 3 × 3 Hessian matrix H = (Dij )3×3 , which describes the curvatures at the feature point in each 3D sub-plane,





Dxx

Dyx

Dzx

H = ⎣ Dxy

Dyy

Dzy ⎦

Dxz

Dyz

Dzz





(3)

where Dij is the second derivatives computed at the location and scale of the feature point in the DoG space. A poor candidate usually presents a high principal value across the edge direction but a small value in the perpendicular direction. Given the ordered eigenvalues of H (1 < 2 < 3 ) with corresponding eigenvectors (e1 , e2 , e3 ), the eigenvectors define an orthogonal coordinate system aligned with the direction of minimal (e1 ) and maximal (e3 ) curvature. Then we refer a feature point to be valid if the following constraint is satisfied: r = 3 /1 < rmax . This ratio can be obtained without explicitly computing the eigenvalues. The trace and determinant of the matrix H can be calculated according to the following equation: Tr(H) = 1 + 2 + 3 = Dxx + Dyy + Dzz Det(H) = 1 2 3 =

2 Dxx Dyy Dzz + 2Dxy Dyz Dxz − Dxx Dyz

(4) 2 − Dyy Dxz

2 − Dzz Dxy

If we define ˛ = 2 /1 , then (r + ˛ + 1)3 Tr(H)3 = r˛ Det(H)

(5)

Since the function defined by Eq. (5) increases as r and ˛ increase; while ˛ is not bigger than r, we can replace ˛ by r in order to obtain

Fig. 3. (a) Volume pyramid: each level is created by convolving the original volume with Gaussian of various scale, and to generate DoG volumes from adjacent scales. (b) An extremum (in black) is the maximum or minimum of its 80 neighbors (in grey). (c) Construction of 3D SIFT descriptor. The 3D gradient orientation is shown (right).

562

D. Ni et al. / Computerized Medical Imaging and Graphics 33 (2009) 559–566

the following inequality: Tr(H)3 (2rmax + 1)3 (r + r + 1)3 < < 2 2 Det(H) r rmax

(6)

where rmax is the upper threshold of r. Since the ultrasound image is noisy and not as sharp as the normal image, we lessen this threshold restriction so as to obtain more feature points in 3D space. In our experiments, the threshold rmax is set as 15. As for the Rohr3D detector [18], the cornerness value is defined as the determinant of the 3 × 3 matrix C which represents the averaged dyadic product of the grey-value gradient,

⎡ ⎢

Ixx

C = ⎣ Ix Iy Ix Iz



Ix Iy

Ix Iz

Iyy

Iy Iz ⎦

Iy Iz

Izz



(7)

Rohr3D = Det(C) → max

(8)

where Ii and Iii is the first and second order derivatives of the volume data in the space coordinates and the overline means average in a local neighborhood. For each voxel in the ultrasound volumes, the cornerness value is calculated in its neighboring regions. The voxel with the local maxima of the cornerness value is regarded as the corner feature point candidate. Local maxima in homogeneous regions due to noise, or pure edge points with large cornerness value should also be eliminated from the candidates. The methods are similar to the ones used in the DoG3D detector and details can be found in [18]. 2.2.2. Construction of 3D SIFT descriptor Based on [15], we adopt a modified 3D-SIFT feature descriptor to volumetric ultrasound data. Each feature point detected in the above section is first assigned with a dominant orientation ( ∗ , ∗ ) according to local image properties so that the feature descriptor can be represented relative to this orientation. As shown in Fig. 3(c), the gradient orientation in 3D space is represented by two angles:  denotes the longitude while  denotes the latitude. With the dominant orientation, the neighborhood region surrounding the feature point is then rotated so that the dominant orientation points to the direction of  ∗ = ∗ = 0. The rotation matrix is defined as:

⎡ ⎢

cos  ∗ cos ∗

R = ⎣ − sin  ∗ − cos  ∗ sin ∗

sin  ∗ cos ∗

sin ∗

cos  ∗

0

− sin  ∗ sin ∗

cos ∗

⎤ ⎥ ⎦

(9)

Then 4 × 4 × 4 sub-regions surrounding each feature point are sampled in the rotated neighborhood. The gradient magnitude and orientation of each sample in the subregions are calculated. For each sub-region, the magnitude of the gradient, weighted by a Gaussian window centred at the feature point, is added to the corresponding bin for the gradient orientation, where 8 bins (represent 360◦ ) are used for  and 4 bins (represent 180◦ ) are used for . Finally, the feature vector of each feature point is of 4 × 4 × 4 × 8 × 4 = 2048 dimensions. 2.3. Global registration Once the 3D SIFT features are extracted from each volume, multiple ultrasound volumes then can be matched. Given two ultrasound volumes A and B, a set of features is first obtained for each volume. Usually, there exist thousands of features, and they have to be validated before being used in the calculation of the transformation matrix. For one particular feature in volume A, its distance between each feature in volume B need to be calculated. The common method used before is the Euclidean L2 distance, which is one

kind of bin-to-bin distances. The assumption of the bin-to-bin algorithms is that histograms are already aligned, so that a bin in one histogram is only compared to the corresponding bin in the other histogram. Therefore these methods are sensitive to deformation, intensity change and noise. Medical ultrasound scanner is mainly used to visualize the soft organs so that deformations always exist, although the patients are required to hold their breath during the scanning. Intensity change and noise are also the inherent problems of ultrasound images. Cross-bin distances allow bins at different locations to be (partially) matched and therefore alleviate the deformation effect [16]. In order to resolve the deformation issue, we apply the diffusion distance [16] to efficiently compare the dissimilarity between two general histograms. One reason for this choice is that most other cross-bin distances are only efficient for onedimensional histograms (such as Earth Mover’s Distance [17]), but the 3D SIFT descriptors consist of five dimensions. Given two SIFT descriptors h1 (X) and h2 (X), where X is a five dimensional vector (x, y, z, , ), the diffusion distance (h1 , h2 ) is defined as: (h1 , h2 ) =

P 

(|dp (X)|)

(10)

p=0

where d0 (X) = h1 (X) − h2 (X)

(11)

dp (X) = [dp−1 (X) × ϕ(X, )]↓2 ϕ(X, ) =

1 (2 )

5/2

5

e((−X

p = 1, . . . , P

T X)/(2 2 ))

(12)

are different layers of the pyramid. The notation “↓2 ” denotes half size downsampling. P is the number of pyramid layers and  is the constant standard deviation for the Gaussian filter ϕ.  is a ground distance metric. In this paper we choose  as the L1 norm in order to speed up the distance calculation. A ratio is assigned to each feature by comparing the distance of the closest neighbor to that of the second-closest neighbor. A feature is recognized as a stable feature only if its ratio is larger than a threshold. Since we apply the 3D SIFT algorithm in a single modality (ultrasound only) registration, the difference of the intensity between the two correctly matched features is usually not very large. Therefore, we further eliminate false matching pairs by comparing the intensity difference between the two features by using the histogram distance of greylevels within a 8 × 8 region around the feature. Although most of the false feature pairs have been eliminated through the above step, some false feature pairs still exist. We further use RANSAC [23] to simultaneously solve the correspondence problem and estimate the 3D rigid transformation matrix related to every pair of volumes. The advantage of RANSAC is its ability to perform robust estimation with a high degree of accuracy even when outliers are presented. 2.4. Block-based local registration and stitching Block-based algorithms have been used in non-rigid medical image registration [24–26]. This method can obtain good performance if the global transformation between two volumes is small. However, with large global transformation, the blocks have to be moved in very wide neighborhoods, which may greatly slow down the computation and cause the displacement field to contain some severe outliers. These outliers may strongly affect the estimation of the transformation matrix. In this paper, the 3D SIFT-based algorithm described above serves as an initial step of calculating a global alignment in registering non-rigid motions. Then the block-based algorithms are used to correct the local deformation efficiently.

D. Ni et al. / Computerized Medical Imaging and Graphics 33 (2009) 559–566

Given two volumes to be registered, namely a fixed volume and a moving volume, our method includes five steps:

Table 1 Comparison of 3D SIFT and mutual information Data

(1) Select the region of interest (ROI) from the moving volume, divide the ROI into a grid of subvolumes. (2) Eliminate the subvolumes when values of entropies and variations of the subvolumes do not reach the minimum criteria. These kind of subvolumes generally do not contain useful information for registration. (3) Move each remaining subvolume within a search window and find the 3D translation vector T (Tx , Ty , Tz ) that maximize the mutual information (MI) similarity measure between the moved subvolume in the moving volume and the one in the fixed volume. Then the original and translated centers of the remaining subvolumes define two sets of control points. (4) Build a histogram for each translation direction, find the mean translation value and eliminate the control point pair if the distance between its translation value and the mean value is greater than a predetermined value. (5) Using the two sets of remaining control points, compute the 3D thin-plate splines (TPS) [27] transformation from the original volume to the new moving one. The steps other than step four are similar to the schemes described in [26]. Step four is used to remove the outliers and guarantee the robustness of the block-based algorithms. Once pairwise matches have been established between the ultrasound volumes, we then can obtain the global transformation matrix for each volume and stitch them into a panorama by different interpolation algorithms [1]. 3. Results and discussion 3.1. Data acquisition All ultrasound volumes in our experiments were acquired using a GE Voluson 730 ultrasound scanner with a dedicated 3D ultrasound probe (RAB2-5L Convex Volume Probe). Two sets of data were collected, where one included 5 ultrasound volumes obtained from a phantom (CIRS Model 057, mimicking human liver tissues) with different scan angles, and the other set included 3 ultrasound volumes obtained by scanning the liver of a patient. In our experiments, about 50% overlapping was used to guarantee sufficient number of keypoint pairs appear in neighboring volumes. In order to evaluate the precision of ultrasound panorama, the corresponding CT volume was acquired from a GE Lightspeed 16-slice multi-detector on the same phantom. 3.2. Evaluation of registration based on improved 3D SIFT Registration of ultrasound volumes has been a key issue for the success of generating a panorama. In this section, registration of ultrasound volumes is evaluated elaborately. The success rates of registration using 3D SIFT and mutual information (MI) are compared. The improvement in registration by SIFT and block-based nonrigid registration algorithm is illustrated by calculating the similarity measure between the features in the overlapping regions. Stability of feature matching under different rotations is evaluated and compared respectively using original 3D SIFT, 3D SIFT with combined features (3D CSIFT) and 3D CSIFT with diffusion distance(3D CDDSIFT). 3.2.1. Comparison of 3D SIFT and mutual information Mutual information has been successfully applied to a wide variety of image modalities, including the registration of volumetric

563

Phantom data

Clinical data

Method

MI

3D SIFT

MI

3D SIFT

Success rate

34%

83%

28%

85%

ultrasound scans [28,29]. However, the registration of ultrasound volumes based on MI may be affected by noise, shadows and rotation of scanner existing in ultrasound imaging. We compare the success rate of global rigid registration using proposed improved 3D SIFT and MI. Given two ultrasound volumes A and B, which are first registered and the ground truth transformation is obtained. To assess the registration success rate, the ground truth transformation is perturbed by adding random Gaussian noise to each of the three translation and three rotation parameters (standard deviation 10 mm or 10◦ ) to produce 50 starting positions. Failed registrations are found by visual inspection. Both phantom and clinical data are tested and the results are shown in Table 1. Success rate of registration using 3D SIFT is much better than the one using MI since local maxima of MI can occur when the image is noisy or includes artifacts, or the transformation away from the ground truth is great. 3.2.2. Improvement in registration of overlapping regions In this paper, we propose to rigidly register the ultrasound volume based on 3D SIFT, which serves as the initial step of block-based deformable registration. The goal of such two-step registration is to improve the alignment of the features in the overlapping regions. We evaluate the registration improvement by calculating a similarity measure of overlapping regions before registration, after rigid registration using 3D SIFT and after deformable registration using block-based algorithm, respectively. In this case, the normalized correlation coefficient is used. The mean of the correlation coefficients is calculated and listed in Table 2. For both tests of phantom and clinical data, results are improved greatly after SIFT-based rigid registration and block-based warping registration after SIFT-based rigid registration achieves the best results. For phantom data, the results after block-based registration improves less than the clinical data. The reason is that clinical data has more deformation than the phantom data due to the respiration. 3.2.3. Stability of feature matching with rotation In this experiment, we evaluate and compare the stability of feature matching under rotations respectively using 3D SIFT, 3D CSIFT and 3D CDDSIFT. Given two ultrasound volumes A and B, which are first registered. B is resampled according to the transformation matrix between A and B in order to obtain a resampled volume B . We rotate volume B with various angles around x-axis, y-axis and z-axis respectively. The numbers of matched keypoints between A and the rotated B are then calculated for each of these rotations. The result is normalized with respect to the number of matched keypoints under no rotation, to show the stability under rotation. In Fig. 4, the left sub-figure shows the results from phantom data sets and the right one shows the results from clinical data sets. As shown in this figure, for each method, when the rotation angle is below 10◦ , the percentage of matched feature points is above 50%. Table 2 Improvement in registration Data

Phantom data Clinical data

Similarity of overlapping regions Before registration

3D SIFT

Block-based registration

0.893 0.895

0.982 0.951

0.984 0.962

564

D. Ni et al. / Computerized Medical Imaging and Graphics 33 (2009) 559–566

Fig. 4. Stability of matched features for rotated ultrasound volumes. The volume is rotated by 0◦ , 5◦ , 10◦ , 15◦ , 20◦ around x-, y- and z-axis respectively. (a) Rotation around x-axis, (b) rotation around y-axis and (c) rotation around z-axis.

When the rotation angle is around 15◦ , the ratio can still be maintained at around 40%. It is observed that when the volume is rotated around z-axis, the results outperform others. The reason is that the image resolution along z-axis is not as good as that along x-axis or y-axis. For phantom data, the 3D CSIFT and 3D CDDSIFT methods perform a little better than 3D SIFT. However, for clinical data, the

matches ratio using 3D CDDSIFT is about 20% higher than 3D SIFT, better than the performance for phantom data. The reason is that there exists greater deformation and more noise in clinical data than phantom data, which proves that diffusion distance algorithm is more appropriate for the data with more deformation and noise than Euclidean distance algorithm.

Fig. 5. (a) and (c) 3D feature point matching in phantom and clinical data respectively. (b) and (d) Stitched results of the phantom and clinical data sets.

D. Ni et al. / Computerized Medical Imaging and Graphics 33 (2009) 559–566

565

Fig. 6. Physical distance between pairs of distinguishable feature points (shown by crosses) from both CT (right) and US (left) are measured.

3.3. Stitching of multiple ultrasound volumes

the block-based registration corrects the local small deformation caused by the probe pressure on phantom when scanning.

We have applied the proposed method on data sets of the phantom and clinical liver. Most of the parameters stated in the paper are specified according to the state-of-the-art 2D SIFT algorithms [10]. Fig. 5(a) and (c) shows part of the matched features in phantom and clinical data sets respectively. Note that the data is actually a volume, a multi-planar reconstruction view is used to illustrate the matched feature point pairs. The cross represents a 3D SIFT feature, and each line links a pair of matched feature points. Fig. 5(b) shows the resultant ultrasound panorama (one of the cross-sections is shown) obtained by stitching five phantom volumes. Fig. 5(d) shows the ultrasound panorama obtained from the stitching of three clinical liver volumes. Using a 3GHz Intel Pentium CPU and 3G RAM-based system, it takes about 1 min to register two neighboring volumes (256 × 191 × 248) using SIFT methods and about 5 min to finish the deformable registration. 3.4. Precision of ultrasound panorama An important clinical application of ultrasound panorama is to provide a quantitive measurement of large organs. To verify the precision of the proposed method, we use the CT volume, obtained by scanning the same phantom, as a ground truth for evaluating the ultrasound panorama generated in our system. The CT volume covers the whole phantom. In both the ultrasound panorama and CT volume, distinguishable feature points were selected manually from the same anatomical position. Physical distance between each point pair, within the same modality, is calculated. Fig. 6 shows the measurement of a line in the CT volume and that of the corresponding line in the ultrasound panorama volume. This physical distance provides a quantitive measurement of the distortion resulted from stitching multiple ultrasound volumes. Table 3 lists the measurement results. The length of three lines are measured in our experiments. The mean error of only using SIFT methods is 2%, which is comparable to the 2.5% error in the results based on a position tracker [8]. The mean error after using blockbased deformable registration is 0.7%. This improvement is because Table 3 Precision evaluation. Physical distances of three lines are measured respectively in CT volume and ultrasound panorama Line

1 2 3

Length in CT (mm)

138.84 155.77 130.93

Length in US (mm) (Error) 3D SIFT

Block method

141.34 (1.8%) 152.48 (2.1%) 128.23 (2.0%)

139.81 (0.7%) 154.52 (0.8%) 130.27 (0.5%)

4. Conclusion In this paper, we have proposed an improved 3D SIFT-based method for generating ultrasound panorama from multiple volume data sets. Ultrasound volumes are pre-processed to reduce the false detection and matching of keypoints obtained from 3D SIFT. 3D SIFT detector combined with Rohr3D detector are used to detect different kinds of features. Meanwhile, diffusion distance is applied to improve the robustness of feature matching due to the deformation, noise and intensity change in ultrasound volumes. After the rigid registration based on improved 3D SIFT, we apply block-based algorithm to correct the local deformations. Experiments demonstrate this two-step registration method can improve the precision of the generated ultrasound panorama. Our approach permits the generation of panorama with high matching stability, because the proposed 3D SIFT feature has merit of invariance to rotation and provides robust feature matching across a substantial range of addition of noise and change in illumination. We evaluated the registration of ultrasound volumes based on improved 3D SIFT elaborately and tested the precision of our approach by comparing the stitched ultrasound volume with the CT volume. Results have shown promising feasibility of our method in application to rigid and deformable registration of ultrasound volumes. In the future we will apply the improved 3D SIFT methods to other medical imaging modalities. We will further study the performance of feature matching with soft tissue deformation. Acknowledgement The work described in this paper was fully supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region (Project No. CUHK4461/05M). This work is also affiliated with the Virtual Reality, Visualization and Imaging Research Center at The Chinese University of Hong Kong as well CUHK MoE-Microsoft Key Laboratory of Human-Centric Computing and Interface Technologies. References [1] Wachinger C, Three-dimensional ultrasound mosaicing, Ph.D. thesis, Technical University of Munich (2007). [2] Kim SH, Choi BI, Kim KW, Lee KH, Han JK. Extended field-of-view sonography: advantages in abdominal applications. J Ultrasound Med 2003;22(4):385–94. [3] Ying M, Sin M-H. Comparison of extended field of view and dual image ultrasound techniques: accuracy and reliability of distance measurements in phantom study. Ultrasound Med Biol 2005;31(1):79–83. [4] Peetrons P. Ultrasound of muscles. Eur Radiol 2002;12(1):35–43.

566

D. Ni et al. / Computerized Medical Imaging and Graphics 33 (2009) 559–566

[5] Leung Y, Roshier A, Johnson S, Kerslake R, McNally D. Demonstration of the appearance of the paraspinal musculoligamentous structures of the cervical spine using ultrasound. Clin Anatomy 2005;18(2):96–103. [6] Henrich W, Schmider A, Kjos S, Tutschek B, Dudenhausen JW. Advantages of and applications for extended field-of-view ultrasound in obstetrics. Arch Gynecol Obstetrics 2003;268(2):121–7. [7] Gee A, Treece G, Prager R, Cash C, Berman L. Rapid registration for wide field of view freehand three-dimensional ultrasound. IEEE Trans Med Imag 2003;22(11):1344–57. [8] Poon T, Rohling R. Three-dimensional extended field-of-view ultrasound. Ultrasound Med Biol 2003;32(3):357–69. [9] Wachinger C, Wein W, Navab N. Three-dimensional ultrasound mosaicing. In: Proceedings of the international conference on medical image computing and computer-assisted intervention (MICCAI). 2007. p. 327–35. [10] Lowe DG. Distinctive image features from scale-invariant keypoints. Int J Comput Vision 2004;60(2):91–110. [11] Lazebnik S, Schmid C, Ponce J. Beyond bags of features: spatial pyramid matching for recognizing natural scene categories. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition (CVPR), vol. 2. 2006. p. 2169–78. [12] Brown M, Lowe D. Recognising panoramas. In: Proceedings of the international conference on computer vision (ICCV), vol. 2. 2003. p. 1218–25. [13] Chen J, Tian J. Rapid multi-modality preregistration based on sift descriptor. In: Proceedings of the international conference of the IEEE engineering in medicine and biology society (EMBS). 2006. p. 1437–40. [14] Moradi M, Abolmaesoumi P, Mousavi P. Deformable registration using scale space keypoints. In: Proceedings of SPIE medical imaging: image process, vol. 6144. 2006. p. 791–8. [15] Scovanner P, Ali S, Shah M. A 3-dimensional sift descriptor and its application to action recognition. In: Proceedings of the 15th international conference on multimedia. 2007. p. 357–60. [16] Ling H, Okada K. Diffusion distance for histogram comparison. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition (CVPR). 2006. p. 246–53.

[17] Ling H, Okada K. An efficient earth mover’s distance algorithm for robust histogram comparison. IEEE Trans Pattern Anal Mach Intell 2007;29(5):840–53. [18] Rohr K. On 3D differential operators for detecting point landmarks. Image Vision Computing 1997;15(3):219–33. [19] Penney G, Blackall J, Hamady M, Sabharwal T, Adam A, Hawkes D. Registration of freehand 3D ultrasound and magnetic resonance liver images. Med Image Anal 2004;8:81–91. [20] Huang Q, Zheng Y, Lu M, Wang T, Chen S. A new adaptive interpolation algorithm for 3D ultrasound imaging with speckle reduction and edge preservation. Comput Med Imag Graphics 2009;33(2):100–10. [21] Zhang Y, Brady M, Smith SM. Segmentation of brain mr images through a hidden markov random field model and the expectation maximization algorithm. IEEE Trans Med Imag 2001;20(1):45–57. [22] Hartkens T, Rohr K, Stiehl HS. Evaluation of 3D operators for the detection of anatomical point landmarks in mr and ct images. Computer Vision Image Understanding 2002;86(2):118–36. [23] Hartley R, Zisserman A. Multiple view geometry in computer vision. 2nd edition Cambridge University Press; 2003. [24] Gaens T, Maes F, Vandermeulen D, Suetens P. Non-rigid multimodal image registration using mutual information. In: Proceedings of the international conference on medical image computing and computer-assisted intervention (MICCAI). 1998. p. 1099–106. [25] Maintz JBA, Meijering EHW, Viergever MA. General multimodal elastic registration based on mutual information. In: Proceedings of SPIE medical imaging: image processing. 1998. p. 144–54. [26] Krucker J, LeCarpentier G, Fowlkes J, Carson P. Rapid elastic image registration for 3D ultrasound. IEEE Trans Med Imag 2002;21(11):1384–94. [27] Bookstein FL. Morphometric tools for landmark data: geometry and biology. Cambridge University Press; 1991. [28] Meyer C, Boes J, Kim B, Bland P, Lecarpentier G, Fowlkes J. Semiautomatic registration of volumetric ultrasound scans. Ultrasound Med Biol 1999;25:339–47. [29] Zagrodsky V, Shekhar R, Cornhill JF. Mutual information based registration of cardiac ultrasound volumes. In: Proceedings of SPIE medical imaging: image processing, vol. 3979. 2000. p. 1605–14.