Automatic segmentation of liver tumors from multiphase contrast-enhanced CT images based on FCNs

Changjian Sun a, Shuxu Guo a, Huimao Zhang b, Jing Li b, Meimei Chen c, Shuzhi Ma a, Lanyi Jin a, Xiaoming Liu a, Xueyan Li a,*, Xiaohua Qian d,*

a College of Electronic Science and Engineering, Jilin University, Changchun, China
b Radiology, The First Hospital of Jilin University, Changchun, China
c College of Communication Engineering, Jilin University, Changchun, China
d Radiology, Wake Forest School of Medicine, Winston Salem, USA

* Corresponding authors. E-mail addresses: [email protected] (X. Li), [email protected] (X. Qian).
Article history: Received 30 December 2016; received in revised form 28 February 2017; accepted 10 March 2017.

Keywords: Fully convolutional networks; Multi-channel; Feature fusion
Abstract

This paper presents a novel, fully automatic approach based on a fully convolutional network (FCN) for segmenting liver tumors from CT images. Specifically, we designed a multi-channel fully convolutional network (MC-FCN) to segment liver tumors from multiphase contrast-enhanced CT images. Because each phase of contrast-enhanced data provides distinct information on pathological features, we trained one network for each phase of the CT images and fused their high-layer features together. The proposed approach was validated on CT images taken from two databases: 3Dircadb and JDRD. In the case of 3Dircadb, using the FCN, the mean values of the volumetric overlap error (VOE), relative volume difference (RVD), average symmetric surface distance (ASD), root mean square symmetric surface distance (RMSD), and maximum symmetric surface distance (MSSD) were 15.6 ± 4.3%, 5.8 ± 3.5%, 2.0 ± 0.9 mm, 2.9 ± 1.5 mm, and 7.1 ± 6.2 mm, respectively. For JDRD, using the MC-FCN, the mean values of the VOE, RVD, ASD, RMSD, and MSSD were 8.1 ± 4.5%, 1.7 ± 1.0%, 1.5 ± 0.7 mm, 2.0 ± 1.2 mm, and 5.2 ± 6.4 mm, respectively. The test results demonstrate that the MC-FCN model provides greater accuracy and robustness than previous methods.
1. Introduction

The prevention and treatment of liver diseases is a major focus of current research on clinical diagnosis [1,2]. Automatic, accurate, and robust methods for liver tumor segmentation and volumetry are prerequisites for computer-aided diagnosis, tumor classification, cancer staging, and radiomics analysis. Advances in medical imaging technology have made it possible to obtain high-resolution data [3,4]. The improved signal-to-noise ratio and high spatial resolution of contrast-enhanced computed tomography (CECT) are particularly useful in confirming lesion size and location [5]. According to the literature, CECT is the most commonly used modality for evaluating suspected liver metastases, hepatocellular carcinoma, and cholangiocarcinoma [6–8]. Segmentation of liver tumors from CT images therefore plays an important role in many liver-related clinical applications. However, the size of the liver varies with sex, age, and body shape, and the contrast between malignant
tissue and normal tissue is often low; thus, detection of malignant tissue is difficult. Several state-of-the-art algorithms, including thresholding, region-based methods, active-contour models, graph cut, and machine learning, have been proposed to solve this problem [9]. Thresholding techniques set all pixels whose intensity values are above a threshold to a foreground value and all remaining pixels to a background value. For example, Moltz et al. combined a threshold-based approach with model-based morphological processing adapted to liver metastases and achieved semi-automatic segmentation of liver tumors [10]. Region-based methods include region growing, region splitting and merging, and the watershed transformation. Region growing is a simple and fast segmentation algorithm that generally involves selecting a seed point and then expanding the region from it. More advanced methods, such as iterative relative fuzzy connectedness (IRFC), have also been used for liver lesion segmentation [11]. The watershed transformation has been recognized as a powerful segmentation method due to its simplicity, speed, and complete division of the image; for instance, Yan et al. accurately segmented 3D liver metastases in volumetric CT images based on a marker-controlled watershed transformation [12]. Xu and Prince used a gradient vector flow (GVF)
active-contour technique to perform hepatic segmentation [13], and the liver tumor was then delineated using a statistical model-based approach [14]. Li et al. proposed a likelihood and local constraint level set model for liver tumor detection [15]. Graph cut, developed by Boykov et al. [16], was proposed as an efficient way to minimize a large class of energy functions. Linguraru et al. [17] integrated a generic affine-invariant shape parameterization method into a geodesic active-contour model for detecting the liver, followed by liver tumor segmentation using graph cut.

Currently, machine learning plays an essential role in the medical-image segmentation field, and a variety of machine-learning methods have been developed for automatic or semi-automatic segmentation of liver tumors. Massoptier et al. applied K-means clustering to extract liver tumors [18], and Huang et al. used a random feature subspace ensemble-based extreme learning machine for liver tumor detection and segmentation [19]. Most machine-learning methods are more efficient than traditional methods but are sensitive to noise, and it is difficult to precisely select the representation of the liver tumor [20].

Deep learning has been applied to a wide variety of problems and has surpassed the previous state-of-the-art performance [21–23], which motivates us to apply this approach to fully automatic liver tumor segmentation in CT. Fully convolutional networks (FCNs) trained end to end, pixel to pixel, on semantic segmentation exceeded the previous best results without further machinery [24]; that was the first work to train FCNs end to end (1) for pixel-wise prediction and (2) from supervised pre-training [25]. In addition, new segmentation methods based on FCNs have been developed for medical-image analysis with highly competitive results. Ben-Cohen et al. explored an FCN for the task of liver segmentation and liver-metastasis detection in CT examinations [26]. Christ et al. presented a method to automatically segment the liver and lesions in abdominal CT images using cascaded fully convolutional neural networks (CFCNs) and dense 3D conditional random fields (CRFs) [27].

A CECT image is acquired in three phases: the arterial (ART) phase, the portal venous (PV) phase, and the delayed (DL) phase, as shown in Fig. 1. The study of features from CECT images has been relatively neglected. Previous studies [5,28] used CECT image information to detect hepatocellular carcinomas and segment the liver, taking advantage of the different characteristics of the ART, PV, and DL phases. Inspired by such methods, we propose an automatic liver tumor segmentation method for CECT images with multi-channel fully convolutional networks (MC-FCNs). An MC-FCN has three channels that are trained independently on CT images to extract image features; the model then fuses the features of the CECT phases in the high-level layers during training, improving the accuracy of liver lesion segmentation. To the best of our knowledge, the MC-FCN has not previously been employed for the segmentation of liver tumors in CT images.

The rest of this paper is organized as follows: Section 2 describes the proposed method in detail. Section 3 illustrates the results and provides a comparative discussion of the proposed algorithm with existing methods. We conclude the paper in Section 4.

2. Methods

The first part of this section provides an introduction to the single-channel FCN. The second part explains the improvements made to the structure to train on multiphase contrast-enhanced CT images. The segmentation process can be divided into two parts: training of the model for weight optimization and segmentation based on the trained model; see Fig. 2.

2.1. Single-channel fully convolutional networks

FCNs can efficiently generate a spatial score map for each pixel. The input to an FCN can be of arbitrary size, and its output is of a corresponding size. In this work, we converted AlexNet [21] into a fully convolutional network by transforming its fully connected layers into convolution layers. We chose to adapt the AlexNet structure for the segmentation of liver lesions because it is a successful model that has been proven to achieve good results in image classification. The network structure of the FCN is shown in Fig. 3. The single-channel FCN contains 8 convolution layers (C1-C5, FC1-FC3), 3 subsampling layers (S1-S3), 3 deconvolution layers (D1-D3), and 2 feature-fusion layers (F1-F2).

1) Convolution layers (C1-C5, FC1-FC3): Convolution layers using different convolution-kernel sizes perform the convolution operation on the output of the previous layer. The convolution kernels extract the features of the input image while maintaining the spatial correlation.

2) Subsampling layers (S1-S3): Subsampling layers are non-linear downsampling layers, which increase robustness and reduce the number of network parameters. The subsampling layers reduce the size of the feature map without reducing its resolution. The input of each layer is obtained from the result of the previous layer through an activation function.

3) Activation function: In this work, we used the rectified linear unit (ReLU) as the activation function. Compared with other activation functions, ReLU can reduce the vanishing-gradient effect and speed up the rate of convergence of the model [29].
Fig. 1. From left to right: (a) PV, (b) ART, (c) DL.
Fig. 2. Part A is the training process, and part B is the segmentation process.
Fig. 3. C1-C5 and FC1-FC3 are convolution layers, where FC1-FC3 are converted from fully connected layers of the CNN. The subsampling layers S1, S2 and S3 are added after C1, C2 and C5, respectively. D1, D2 and D3 are deconvolution layers. F1 and F2 are fusion layers. F1 is fused from C4 and D1, and F2 is fused from S1 and D2.
Taking C1 and S1 as an example, an image of size 512*512 is convolved with 96 convolution kernels, each of size 11*11, generating a feature map of size 175*175*96. In the S1 layer, downsampling reduces this to 87*87*96 (see Fig. 4).
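These sizes follow the standard output-size formula out = floor((in + 2*pad - kernel)/stride) + 1. A quick sanity check (a minimal sketch; the kernel, stride, and pad values are those of Table 1, with floor rounding assumed):

```python
def out_size(size, kernel, stride, pad):
    # Standard convolution/pooling output-size formula (floor rounding).
    return (size + 2 * pad - kernel) // stride + 1

# C1: 512*512 input, 11*11 kernels, stride 4, pad 99 (Table 1)
assert out_size(512, kernel=11, stride=4, pad=99) == 175
# S1: 3*3 pooling window, stride 2, no padding
assert out_size(175, kernel=3, stride=2, pad=0) == 87
```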
4) Deconvolution layers (D1-D3): After feature extraction by multiple convolution layers, the feature map is fed into deconvolution layers, which use bilinear interpolation to obtain a coarse layer with semantic information.

5) Fusion layers (F1-F2): The operations of the FCN layers cause the feature map to lose too much location information. To obtain more accurate segmentation results, we combine semantic information from a deep, coarse layer with appearance information from a shallow, fine layer, allowing the model to make local predictions that respect the global structure. This is achieved using the fusion layers.

Taking FC3, D1, and fusion layer F1 as an example (Fig. 5): the output of FC3 is of size 16*16*2, which is smaller than the original image. Deconvolution yields a layer D1 of size 33*33*2, which is fused with C4. The size of C4 (43*43*384) differs from that of D1 (33*33*2), so we first convolve C4 with a 3*3 convolution kernel to obtain a 43*43*2 feature map and then reduce it to 33*33*2 with a crop operation before fusing it with D1. This gives the first fusion layer, F1. The fusion result is deconvolved to obtain the output of the D2 layer, and D2 is fused with the subsampling layer S1 to obtain the final features. The detailed parameters are shown in Table 1; a sketch of this step follows.
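A minimal PyTorch-style sketch of the two operations described above: the fully connected layers recast as convolutions (FC1-FC3, with kernel sizes 6, 1, 1 per Table 1), and the F1 fusion step, in which the deconvolved coarse score map is combined with a convolved, center-cropped C4. The helper names are ours, and the element-wise sum is an assumption: the paper describes the fusion but not the exact operator (summation is the standard FCN skip-fusion choice [25]).

```python
import torch
import torch.nn as nn

def center_crop(t, h, w):
    """Crop a (N, C, H, W) feature map to h*w around its center."""
    dh, dw = (t.size(2) - h) // 2, (t.size(3) - w) // 2
    return t[:, :, dh:dh + h, dw:dw + w]

# FC1-FC3 recast as convolutions: a 6*6 convolution over the
# 21*21*256 Sub5 output gives the 16*16*4096 map of Table 1.
fc1 = nn.Conv2d(256, 4096, kernel_size=6)
fc2 = nn.Conv2d(4096, 4096, kernel_size=1)
fc3 = nn.Conv2d(4096, 2, kernel_size=1)          # 2-class score map

# F1: deconvolve the FC3 scores (D1, 16*16 -> 33*33), project C4 to
# two channels, crop it to the deconvolved size, and fuse element-wise.
d1 = nn.ConvTranspose2d(2, 2, kernel_size=3, stride=2)
c4_proj = nn.Conv2d(384, 2, kernel_size=3, padding=1)

sub5 = torch.randn(1, 256, 21, 21)               # placeholder Sub5 output
c4 = torch.randn(1, 384, 43, 43)                 # placeholder C4 output
scores = fc3(torch.relu(fc2(torch.relu(fc1(sub5)))))   # (1, 2, 16, 16)
f1 = d1(scores) + center_crop(c4_proj(c4), 33, 33)     # (1, 2, 33, 33)
```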
Fig. 4. Structure of C1 and S1.
Fig. 5. Structure of a deconvolution layer and a fusion layer.
Fig. 6. Convolution kernels trained on CT scan images of different contrast-enhancement phases. From left to right: (a) PV, (b) ART, (c) DL.
2.2. Multi-channel fully convolutional networks

When training the FCN model, we found that when the same network structure was trained on the three phases of CT images of the same slice, the trained weights were significantly different. A visualization of the filters is shown in Fig. 6. When single-phase CECT images are used to train the FCN model, only the lesion features of one enhancement phase can be extracted at a time; this cannot make full use of the distinct imaging characteristics of the three-phase CECT images. Inspired by the accuracy gains of an FCN that fuses fine and coarse layers, and by the imaging characteristics of the three enhancement phases, we extended the single-channel FCN into a multi-channel model trained on three-phase CECT images. In the high-level layers, after feature extraction, feature fusion is performed on the multiphase CECT features to improve segmentation accuracy. Fig. 7 illustrates the proposed MC-FCN network structure.
In contrast to the single-channel FCN structure, the MC-FCN has three FCN channels with independent training parameters, which are used for feature extraction and parameter training on three-phase CECT images. For the model to receive the data from the three channels at the same time, the three groups of different-phase training images corresponding to each plain CT are stored in three dimensions of the dataset, yielding training data of size 512*512*3. In the data-input layer of the model, a Slice layer is added, which splits the input image into three 512*512*1 image-data blocks and feeds them into the corresponding FCN channels. In the feature-output layer of each FCN, three sets of feature maps of size 543*543*2 are obtained. Then, a feature-fusion layer and a cropping layer produce a 512*512*2 feature map, which is fed into the softmax classifier to obtain the probability map for this group of images. Training proceeds by backpropagating the error against the labels. The detailed parameters of the MC-FCN are shown in Table 2, and a minimal sketch of this forward pass is given below.
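A PyTorch-style sketch of the pass just described, under stated assumptions: `make_channel` is a hypothetical stand-in for the single-channel FCN of Section 2.1 (the real branches follow Table 2), and element-wise summation is again assumed for the fusion layer, which the paper does not spell out.

```python
import torch
import torch.nn as nn

def make_channel():
    # Stand-in for a single-channel FCN branch; the real branch ends
    # in a (N, 2, 543, 543) score map, as listed in Table 2.
    return nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(8, 2, 3, padding=1),
                         nn.Upsample(size=(543, 543)))

class MCFCN(nn.Module):
    def __init__(self):
        super().__init__()
        # Three FCN channels with independent weights: ART, PV, DL.
        self.branches = nn.ModuleList(make_channel() for _ in range(3))

    def forward(self, x):                   # x: (N, 3, 512, 512), phases stacked
        phases = torch.split(x, 1, dim=1)   # the Slice layer: three (N, 1, 512, 512)
        scores = [b(p) for b, p in zip(self.branches, phases)]
        fused = scores[0] + scores[1] + scores[2]   # feature-fusion layer
        off = (543 - 512) // 2                      # crop back to 512*512
        fused = fused[:, :, off:off + 512, off:off + 512]
        return torch.softmax(fused, dim=1)  # per-pixel class probabilities

probs = MCFCN()(torch.randn(1, 3, 512, 512))   # -> (1, 2, 512, 512)
```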
Fig. 7. Structure of the MC-FCN.
Table 1. Detailed parameters of the FCN model.

| Layer name | Kernel size | Stride | Pad | Output size |
|---|---|---|---|---|
| Data | – | – | – | 512*512*1 |
| Conv1 | 11 | 4 | 99 | 175*175*96 |
| Sub1 | 3 | 2 | 0 | 87*87*96 |
| Conv2 | 5 | 1 | 2 | 87*87*256 |
| Sub2 | 3 | 2 | 0 | 43*43*96 |
| Conv3 | 3 | 1 | 1 | 43*43*384 |
| Conv4 | 3 | 1 | 1 | 43*43*384 |
| Conv5 | 3 | 1 | 1 | 43*43*256 |
| Sub5 | 3 | 2 | 0 | 21*21*256 |
| Fc1 | 6 | 1 | 0 | 16*16*4096 |
| Fc2 | 1 | 1 | 0 | 16*16*4096 |
| Fc3 | 1 | 1 | 0 | 16*16*2 |
| Dconv1 | 3 | 2 | 0 | 33*33*2 |
| Fuse1 | – | – | – | 33*33*2 |
| Dconv2 | 3 | 2 | 0 | 67*67*2 |
| Fuse2 | – | – | – | 67*67*2 |
| Dconv3 | 15 | 8 | 0 | 543*543*2 |
| Crop | – | – | – | 512*512*2 |

Table 2. Detailed parameters of the MC-FCN model.

| Layer name | Kernel size | Stride | Pad | Output size |
|---|---|---|---|---|
| Data | – | – | – | 512*512*3 |
| Slice | – | – | – | 512*512*1*3 |
| Conv1_1/Conv1_2/Conv1_3 | 7 | 4 | 97 | 175*175*96 |
| Sub1_1/Sub1_2/Sub1_3 | 3 | 2 | 0 | 87*87*96 |
| Conv2_1/Conv2_2/Conv2_3 | 5 | 1 | 2 | 87*87*256 |
| Sub2_1/Sub2_2/Sub2_3 | 3 | 2 | 0 | 43*43*96 |
| Conv3_1/Conv3_2/Conv3_3 | 3 | 1 | 1 | 43*43*384 |
| Conv4_1/Conv4_2/Conv4_3 | 3 | 1 | 1 | 43*43*384 |
| Conv5_1/Conv5_2/Conv5_3 | 3 | 1 | 1 | 43*43*256 |
| Sub5_1/Sub5_2/Sub5_3 | 3 | 2 | 0 | 21*21*256 |
| Fc1_1/Fc1_2/Fc1_3 | 6 | 1 | 0 | 16*16*4096 |
| Fc2_1/Fc2_2/Fc2_3 | 1 | 1 | 0 | 16*16*4096 |
| Fc3_1/Fc3_2/Fc3_3 | 1 | 1 | 0 | 16*16*2 |
| Dconv1_1/Dconv1_2/Dconv1_3 | 3 | 2 | 0 | 33*33*2 |
| Fuse1_1/Fuse1_2/Fuse1_3 | – | – | – | 33*33*2 |
| Dconv2_1/Dconv2_2/Dconv2_3 | 3 | 2 | 0 | 67*67*2 |
| Fuse2_1/Fuse2_2/Fuse2_3 | – | – | – | 67*67*2 |
| Dconv3_1/Dconv3_2/Dconv3_3 | 15 | 8 | 0 | 543*543*2 |
| Fusion | – | – | – | 543*543*2 |
| Crop | – | – | – | 512*512*2 |

2.3. Model training and model-based segmentation
In liver lesion segmentation, the lesion areas vary considerably in size, and the training set does not contain sufficient positive samples. With the FCN network structure, training may fail to converge even after thousands of iterations when only small datasets are available. Chen et al. [30] performed standard plane localization in fetal ultrasound via domain-transferred deep neural networks (T-CNNs), using natural-image datasets to train an initial network model; they then used this model for network-parameter adjustment and achieved very good results. Inspired by this method, we first trained the model for liver segmentation. This region is easier to learn because the liver accounts for a large proportion of each image in the Pre-Dataset. This process is called pre-training. The resulting model parameters were then used as initialization parameters for lesion segmentation; starting from these parameters, the model can reach convergence more easily through fine tuning.

We use the well-trained model for the segmentation of liver lesions. The image to be segmented is input into the trained model. The model solves a two-class problem, and the outputs are two probability maps, which characterize the probability that each pixel belongs to a certain category; Fig. 8(a) and (b) show the two maps obtained from the segmentation model. By comparing the probabilities at the position corresponding to each pixel, it can be determined whether that pixel belongs to a lesion, as in the sketch below.

Fig. 8. (a) shows the probability of belonging to a lesion; (b) shows the probability of belonging to the background.
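The pre-training strategy amounts to a standard weight transfer between two models of the same architecture. A minimal sketch, assuming a PyTorch-style workflow; `make_fcn`, the checkpoint name, the learning rate, and the lesion-class index are illustrative assumptions, not taken from the paper:

```python
import torch
import torch.nn as nn

def make_fcn():
    # Illustrative stand-in for the FCN of Section 2.1.
    return nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, 2, 1))

# 1) Pre-training: train a liver-segmentation FCN on the Pre-Dataset,
#    where the large liver region is comparatively easy to learn.
liver_model = make_fcn()
# ... training loop over the Pre-Dataset would go here ...
torch.save(liver_model.state_dict(), "pre_fcn.pth")

# 2) Fine-tuning: initialize the lesion model from the pre-trained
#    weights so that convergence only requires fine adjustment.
lesion_model = make_fcn()
lesion_model.load_state_dict(torch.load("pre_fcn.pth"))
optimizer = torch.optim.SGD(lesion_model.parameters(), lr=1e-4)

# 3) Segmentation: the trained model emits two per-pixel probability
#    maps (lesion vs. background); the larger probability decides.
ct_slice = torch.randn(1, 1, 512, 512)          # placeholder input slice
probs = torch.softmax(lesion_model(ct_slice), dim=1)
lesion_mask = probs.argmax(dim=1) == 1          # assuming channel 1 = lesion
```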
3. Experiments

Liver CT images from different datasets were used to train the FCN and MC-FCN models; the trained models were then used to segment the liver lesions.

3.1. Data preparation and experimental environment

To test the validity of our segmentation method, two datasets, 3D-IRCADb [31] (3D Image Reconstruction for Comparison of Algorithm Database) and JDRD (from the Department of Radiology, The First Hospital of Jilin University), were used for training and testing. 3D-IRCADb includes anonymized venous-phase-enhanced CT images of patients scanned in various European hospitals with different CT scanners. This dataset contains 20 sets of images, each containing a single lesion or multiple lesions, and can be used to compare the segmentation performance of different algorithms on tumors, in particular hepatic tumors. JDRD was retrospectively collected by The First Hospital of Jilin University. The multiphase CT images in the JDRD dataset were acquired with a GE ProSpeed CT machine at a slice-plane resolution of 512*512.

The manual segmentation of liver tumors in 3D-IRCADb was performed by multiple experienced radiologists; the gold standard is that each lesion corresponds to a lesion map. The JDRD dataset was labeled by two radiologists at The First Hospital of Jilin University; its gold standard is that the same slices of the arterial-phase, portal venous-phase, and delayed-phase CECT images correspond to the same lesion location. Examples of labeled images are shown in Fig. 9.

The image data are divided into the following groups for different functions:

Pre-Dataset: This group was used to train the liver-segmentation model using the single-channel FCN, whose parameters were used as initial parameters of the liver-lesion segmentation model.

3D-IRCADb, JDRD-PV, JDRD-A, JDRD-DL: These sets were used to train the single-channel FCN structure to generate segmentation models. For JDRD, we named the ART-phase dataset JDRD-A, the PV-phase dataset JDRD-PV, and the DL-phase dataset JDRD-DL.

JDRD-H: This group was used to train the liver tumor segmentation model using the MC-FCN. The three sets of ART-, PV-, and DL-phase images corresponding to each plain CT are stored in three dimensions of the dataset.

Table 3 presents detailed information on the dataset classification and on training and testing for the different enhancement phases. The experiments were run on a computer with the Linux Ubuntu 14.04 LTS 64-bit operating system, an Intel Core i3-4160 CPU (3.6 GHz), 16 GB of memory, and an NVIDIA GeForce GTX 960 graphics card.

3.2. Evaluation measures

To evaluate the effectiveness of the MC-FCN model in improving the accuracy of liver lesion segmentation, we consider the
following five error measures [32]: volumetric overlap error (VOE), relative volume difference (RVD), average symmetric surface distance (ASD), root mean square symmetric surface distance (RMSD), and maximum symmetric surface distance (MSSD). In the formulas below, A denotes the liver lesion region in the segmentation result, B denotes the reference region, S(A) and S(B) denote the sets of surface voxels of A and B, respectively, and d denotes the Euclidean distance.

$$\mathrm{VOE}(A, B) = 100 \times \left(1 - \frac{|A \cap B|}{|A \cup B|}\right) \tag{1}$$
A VOE value of 0% indicates perfect segmentation; if the segmentation result and the reference do not overlap at all, the value is 100%.

$$\mathrm{RVD}(A, B) = 100 \times \frac{|A| - |B|}{|B|} \tag{2}$$
An RVD of 0% means that the volumes of A and B are identical. The advantage of the RVD is that it provides volumetric information, which is important for lesion segmentation.

$$\mathrm{ASD}(A, B) = \frac{1}{|S(A)| + |S(B)|}\left(\sum_{s_A \in S(A)} d(s_A, S(B)) + \sum_{s_B \in S(B)} d(s_B, S(A))\right) \tag{3}$$

The ASD is given in millimeters and is based on the surface voxels of A and B. For each surface voxel of A, the Euclidean distance to the closest surface voxel of B is calculated using the approximate nearest-neighbor technique and stored. For symmetry, the same process is repeated from the surface voxels of B to those of A. The ASD is then the mean of all stored distances; a value of 0 mm indicates perfect segmentation.
$$\mathrm{RMSD}(A, B) = \sqrt{\frac{1}{|S(A)| + |S(B)|}\left(\sum_{s_A \in S(A)} d^2(s_A, S(B)) + \sum_{s_B \in S(B)} d^2(s_B, S(A))\right)} \tag{4}$$

The RMSD is also given in millimeters and is likewise based on the surface voxels of A and B. Its calculation is similar to that of the ASD, except that the squared distances are averaged and the square root is taken.

$$\mathrm{MSSD}(A, B) = \max\left\{\max_{s_A \in S(A)} d(s_A, S(B)),\ \max_{s_B \in S(B)} d(s_B, S(A))\right\} \tag{5}$$

The MSSD is also given in millimeters and is based on the surface voxels of A and B; it is the maximum of the symmetric surface distances. For perfect segmentation, this distance is 0 mm.
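For reference, the five measures can be computed directly from binary masks. A compact NumPy/SciPy sketch under stated assumptions: `a` is the segmentation and `b` the reference, both boolean volumes; `spacing` is the voxel size in mm; surfaces are taken as one-voxel-thick boundaries obtained by erosion. This follows Eqs. (1)-(5) and is not the evaluation code of [32]:

```python
import numpy as np
from scipy import ndimage

def voe(a, b):
    """Volumetric overlap error (%), Eq. (1)."""
    return 100.0 * (1.0 - (a & b).sum() / (a | b).sum())

def rvd(a, b):
    """Relative volume difference (%), Eq. (2)."""
    return 100.0 * (a.sum() - b.sum()) / b.sum()

def _dists(a, b, spacing):
    """Distances (mm) from each surface voxel of a to b's surface."""
    surf_a = a ^ ndimage.binary_erosion(a)
    surf_b = b ^ ndimage.binary_erosion(b)
    # EDT of the complement of b's surface = distance to b's surface.
    d_to_b = ndimage.distance_transform_edt(~surf_b, sampling=spacing)
    return d_to_b[surf_a]

def asd(a, b, spacing=(1.0, 1.0, 1.0)):
    """Average symmetric surface distance (mm), Eq. (3)."""
    return np.concatenate([_dists(a, b, spacing), _dists(b, a, spacing)]).mean()

def rmsd(a, b, spacing=(1.0, 1.0, 1.0)):
    """Root mean square symmetric surface distance (mm), Eq. (4)."""
    d = np.concatenate([_dists(a, b, spacing), _dists(b, a, spacing)])
    return float(np.sqrt((d ** 2).mean()))

def mssd(a, b, spacing=(1.0, 1.0, 1.0)):
    """Maximum symmetric surface distance (mm), Eq. (5)."""
    return max(_dists(a, b, spacing).max(), _dists(b, a, spacing).max())
```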
Fig. 9. Lesions are marked with green lines. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Table 3. Detailed information on the datasets.

| Dataset | Source & quantity: 3D-IRCADb | Source & quantity: JDRD | Train | Test | Phase | Network structure | Model name |
|---|---|---|---|---|---|---|---|
| Pre-Dataset | 2785 | 1024 | 3809 | – | Hybrid | FCN | Pre-FCN |
| 3D-IRCADb | 400 | – | 360 | 40 | Venous | FCN | FCN |
| JDRD-PV | – | 256 | 220 | 36 | PV | FCN | PV-FCN |
| JDRD-A | – | 256 | 220 | 36 | ART | FCN | A-FCN |
| JDRD-DL | – | 256 | 220 | 36 | DL | FCN | DL-FCN |
| JDRD-H | – | 256*3 | 220*3 | 36*3 | Multiple | MC-FCN | MC-FCN |
3.3. Run time

For the FCN training process, the training time is positively correlated with the number of iterations. In the same experimental environment, as the number of iterations increases, the accuracy of the model segmentation also increases until the model converges. The number of iterations, the training time, and the average segmentation time per slice for the models considered in this paper are shown in Table 4. Although training the model takes some time, segmentation using the trained model takes only a few seconds.

Table 4. Detailed times for training and testing.

| Model | Number of iterations | Training time (h) | Average segmentation time (s/slice) |
|---|---|---|---|
| FCN | 40,000 | 18 | 1 |
| MC-FCN | 30,000 | 7 | 1.8 |

3.4. Experimental results and discussion

Fig. 10 shows examples of segmentation results. The red contours indicate the segmentations generated by our model, whereas the manual segmentations generated by experts are indicated by the green contours. The first four images, (a)-(d), are FCN segmentation results for the 3D-IRCADb dataset; images (e) and (f) are MC-FCN segmentation results for the JDRD-H dataset.

3.4.1. Segmentation results under different fusion parameters

In Section 2.1, we proposed improving the segmentation accuracy by adding shallow location information, achieved through the fusion of deconvolution layers and shallow layers. To verify the effectiveness of this approach, we used models with different fusion structures for lesion segmentation on 3D-IRCADb, and the results are shown in Fig. 10. First, we used a structure that restores the image size through the three deconvolution layers D1, D2, and D3 alone, called the no-fusion structure. Then, we added one fusion layer, formed by fusing layers D1 and C4, to obtain a structure with one fusion layer. We trained the models using the parameters shown in Table 1 and compared the results, as shown in Table 5.

Table 5. Performance under different fusion parameters.

| Fusion structure | VOE (%) | RVD (%) | ASD (mm) | RMSD (mm) | MSSD (mm) |
|---|---|---|---|---|---|
| No fusion | 20.7 ± 16.3 | 13.5 ± 12.8 | 7.4 ± 5.3 | 10.6 ± 5.4 | 25.1 ± 17.2 |
| D1 + C4 | 16.9 ± 5.2 | 7.8 ± 7.5 | 4.1 ± 3.6 | 5.5 ± 6.1 | 15.4 ± 16.7 |
| D1 + C4, D2 + S1 | 15.6 ± 4.3 | 5.8 ± 3.5 | 2.0 ± 0.9 | 2.9 ± 1.5 | 7.1 ± 6.2 |

The experimental results show that the segmentation accuracy improves as fusion layers are added. With the addition of fusion layer F1, the VOE, RVD, ASD, RMSD, and MSSD improved to 16.9 ± 5.2%, 7.8 ± 7.5%, 4.1 ± 3.6 mm, 5.5 ± 6.1 mm, and 15.4 ± 16.7 mm from 20.7 ± 16.3%, 13.5 ± 12.8%, 7.4 ± 5.3 mm, 10.6 ± 5.4 mm, and 25.1 ± 17.2 mm, respectively.
With two fusion layers, the VOE, RVD, ASD, RMSD, and MSSD reached their best levels of 15.6 ± 4.3%, 5.8 ± 3.5%, 2.0 ± 0.9 mm, 2.9 ± 1.5 mm, and 7.1 ± 6.2 mm, respectively. The fusion layers contribute location information, which to a certain extent compensates for the loss of location information caused by the multilayer convolutions and improves the segmentation accuracy of the FCN.

Fig. 10. Examples of segmentation results.

3.4.2. Comparison with other methods

The segmentation performance achieved on 3D-IRCADb using the FCN model is shown in Table 6. For comparison, we also present the segmentation results of other methods, including the likelihood and local constraint (LLC) level set model [15] and the geodesic level set model (GACD) with distance regularized level set evolution (DRLSE) [33].

Table 6. Segmentation performance on 3D-IRCADb using different methods.

| Method | VOE (%) | RVD (%) | ASD (mm) | RMSD (mm) | MSSD (mm) |
|---|---|---|---|---|---|
| LLC | 14.4 ± 5.3 | 8.1 ± 2.1 | 2.4 ± 0.8 | 2.9 ± 0.7 | 7.2 ± 3.1 |
| GACD | 38.2 ± 7.0 | 12.2 ± 8.3 | 10.7 ± 2.3 | 11.4 ± 1.2 | 14.3 ± 3.7 |
| FCN | 15.6 ± 4.3 | 5.8 ± 3.5 | 2.0 ± 0.9 | 2.9 ± 1.5 | 7.1 ± 6.2 |

Table 6 shows that the FCN achieves good segmentation performance on the 3D-IRCADb database and outperforms the other two algorithms in terms of the RVD and ASD, demonstrating the applicability of the FCN to the segmentation of liver lesions. The RMSD and MSSD values of the FCN results are 2.9 ± 1.5 mm and 7.1 ± 6.2 mm, respectively; these values are close to those of the LLC method and clearly better than those of GACD, indicating that the FCN offers stability similar to that of other methods while also improving segmentation accuracy.

3.4.3. Segmentation results of the MC-FCN

Because it is difficult to find public datasets of multiphase CECT images, we compared the experimental results of the MC-FCN and the FCN on the JDRD dataset. We trained the segmentation model on the JDRD-H training set using the MC-FCN and compared its segmentation results with those of models trained on the JDRD-PV, JDRD-A, and JDRD-DL training sets using the FCN, as shown in Table 7.

Table 7. Segmentation performance on JDRD using the FCN and MC-FCN.

| Model | VOE (%) | RVD (%) | ASD (mm) | RMSD (mm) | MSSD (mm) |
|---|---|---|---|---|---|
| PV-FCN | 19.1 ± 13.0 | 6.0 ± 5.1 | 3.6 ± 0.9 | 4.6 ± 3.7 | 8.2 ± 9.9 |
| A-FCN | 9.7 ± 7.4 | 2.0 ± 1.0 | 1.8 ± 0.7 | 2.2 ± 2.0 | 5.6 ± 6.5 |
| DL-FCN | 17.0 ± 10.5 | 9.0 ± 9.1 | 3.0 ± 1.6 | 3.8 ± 0.5 | 6.2 ± 7.6 |
| MC-FCN | 8.1 ± 4.5 | 1.7 ± 1.0 | 1.5 ± 0.7 | 2.0 ± 1.2 | 5.2 ± 6.4 |

The table shows that the A-FCN model (trained on arterial-phase images) has a higher segmentation accuracy than the PV-FCN and DL-FCN models, but the MC-FCN model yields the best results on all five measures. Table 7 therefore demonstrates that the feature fusion achieved by training on CT images of different enhancement phases in multiple channels using the MC-FCN improves the accuracy of segmentation.

4. Conclusion

In this paper, we propose a liver lesion segmentation algorithm based on the FCN model. The original FCN model already achieves good results for the segmentation of liver lesions. However, we have also improved the FCN model by proposing the MC-FCN, which can be trained on different enhancement phases of CT images in
multiple channels to merge the CT features present in different enhancement phases. This method makes full use of the characteristics of the different enhancement phases of CT images. We validated our approach using clinical datasets, with manual delineations by clinical experts as the reference. The results show that our method achieves better segmentation accuracy than the method using single-enhancement-phase CT images.

References

[1] Haugen AS, Softeland E, Almeland SK, Sevdalis N, Vonen B, Eide GE, et al. Effect of the World Health Organization checklist on patient outcomes: a stepped wedge cluster randomized controlled trial. Ann Surg 2015;261(5):821–8.
[2] Macedo SM, Guimaraes TA, Feltenberger JD, Santos SHS, et al. The role of renin-angiotensin system modulation on treatment and prevention of liver diseases. Peptides 2014;62:189–96.
[3] Scheuermann JR. Development of solid-state avalanche amorphous selenium for medical imaging. Med Phys 2015;42(3):1223–6.
[4] Seo CW, Cha BK, Kim RK, Kim CR, Yang K, Huh Y, et al. Development and operation of a prototype cone-beam computed tomography system for X-ray medical imaging. J Korean Phys Soc 2014;64(1):129–34.
[5] Jeongjin L, Kyoung WK, So YK, Juneseuk S, Kyung JP, Hyung JW, et al. Automatic detection method of hepatocellular carcinomas using the non-rigid registration method of multi-phase liver CT images. J X-Ray Sci Technol 2015;23(3):275–88.
[6] Kayaaltı Ö, Aksebzeci BH, Karahan Ö, Deniz K, Öztürk M, et al. Liver fibrosis staging using CT image texture analysis and soft computing. Appl Soft Comput 2014;25(C):399–413.
[7] Tewari SO, Petre EN, Osborne J, Sofocleous CT. Cholecystokinin-assisted hydrodissection of the gallbladder fossa during FDG PET/CT-guided liver ablation. Cardiovasc Intervent Radiol 2013;36(6):1704–6.
[8] Hao XJ, Li JP, Jiang HJ, Li DQ, Ling ZS, Xue LM, et al. CT assessment of liver hemodynamics in patients with hepatocellular carcinoma after argon-helium cryoablation. Hepatobiliary Pancreat Dis Int 2013;12(6):617–21.
[9] Zanaty EA, Ghoniemy S. Medical image segmentation techniques: an overview. Int J Inf Med Data Process 2016;1(1):16–37.
[10] Moltz JH, Bornemann L, Dicken V, Peitgen HO. Segmentation of liver metastases in CT scans by adaptive thresholding and morphological processing. MICCAI workshop on 3D segmentation in the clinic: a grand challenge II; 2008.
[11] Cao L, Udupa JK, Odhner D, Huang L, Tong Y, Torigian DA. A general approach to liver lesion segmentation in CT images. Proc SPIE 2016;9786(978623):1–7.
[12] Yan J, Schwartz LH, Zhao B. Semiautomatic segmentation of liver metastases on volumetric CT images. Med Phys 2015;42(11):6283–93.
[13] Xu C, Prince JL. Snakes, shapes, and gradient vector flow. IEEE Trans Image Process 1998;7(3):359–69.
[14] Massoptier L, Casciaro S. A new fully automatic and robust algorithm for fast segmentation of liver tissue and tumors from CT scans. Eur Radiol 2008;18(8):1658–65.
[15] Li C, Wang X, Eberl S, Fulham M, Yin Y, Chen J, et al. A likelihood and local constraint level set model for liver tumor segmentation from CT volumes. IEEE Trans Biomed Eng 2013;60(10):2967–77.
[16] Boykov Y, Veksler O, Zabih R. Fast approximate energy minimization via graph cuts. IEEE Trans Pattern Anal Mach Intell 2001;23(11):1222–39.
[17] Linguraru MG, Richbourg WJ, Watt JM, Pamulapati V, Summers RM. Liver and tumor segmentation and analysis from CT of diseased patients via a generic affine invariant shape parameterization and graph cuts. Int Conf Abdominal Imag: Comput Clin Appl 2011:198–206.
[18] Massoptier L, Casciaro S. A new fully automatic and robust algorithm for fast segmentation of liver tissue and tumor from CT scans. Eur Radiol 2008;18(8):1658–65.
[19] Huang W, Yang Y, Lin Z, Huang GB, Zhou J, Duan Y, et al. Random feature subspace ensemble based extreme learning machine for liver tumor detection and segmentation. IEEE Eng Med Biol Soc Conf 2014:4675–8.
[20] Li W, Jia F, Hu Q. Automatic segmentation of liver tumor in CT images with deep convolutional neural networks. J Comp Commun 2015;3(11):146–51.
[21] Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Adv Neural Inf Process Syst 2012;25(2):1106–14.
[22] Lecun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proc IEEE 1998;86(11):2278–324.
[23] Xu Y, Du J, Dai LR, Lee CH. An experimental study on speech enhancement based on deep neural networks. IEEE Signal Process Lett 2014;21(1):65–8.
[24] Kang K, Wang X. Fully convolutional neural networks for crowd segmentation. Comput Sci 2014;49(1):25–30.
[25] Shelhamer E, Long J, Darrell T. Fully convolutional networks for semantic segmentation. Comput Vision Pattern Recognit 2015;79(10):1337–42.
[26] Ben-Cohen A, Diamant I, Klang E, Amitai M, Greenspan H. Fully convolutional network for liver segmentation and lesions detection. Deep Learn Data Label Med Appl 2016;10008:77–85.
[27] Christ PF, Elshaer MEA, Ettlinger F, Tatavarty S, Bickel M, Bilic P, et al. Automatic liver and lesion segmentation in CT using cascaded fully convolutional neural networks and 3D conditional random fields. Med Image Comput Comput-Assisted Intervention (MICCAI) 2016;9901:415–23.
[28] Ruskó L, Bekes G, Fidrich M. Automatic segmentation of the liver from multi- and single-phase contrast-enhanced CT images. Med Image Anal 2009;13(6):871–82.
[29] Glorot X, Bordes A, Bengio Y. Deep sparse rectifier neural networks. AISTATS 2011;15(106):275.
[30] Chen H, Ni D, Qin J, Li S, Yang X, Wang T, et al. Standard plane localization in fetal ultrasound via domain transferred deep neural networks. IEEE J Biomed Health Inf 2015;19(5):1627–36.
[31] Soler L, Hostettler A, Agnus V, Charnoz A, Fasquel JB, Moreau J, et al. 3D image reconstruction for comparison of algorithm database: a patient-specific anatomical and medical image database; 2017 [Online]. Available: http://www.ircad.fr/softwares/3Dircadb/3Dircadb.php?lng=en.
[32] Heimann T, Van GB, Styner MA, Arzhaeva Y, Aurich V, Bauer C, et al. Comparison and evaluation of methods for liver segmentation from CT datasets. IEEE Trans Med Imaging 2009;28(8):1251–62.
[33] Li C, Xu C, Gui C, Fox DM. Distance regularized level set evolution and its application to image segmentation. IEEE Trans Image Process 2010;19(20):3243–54.