Journal Pre-proof

Cascaded Deep Convolutional Encoder-Decoder Neural Networks for Efficient Liver Tumor Segmentation

Ümit Budak, Yanhui Guo, Erkan Tanyildizi, Abdulkadir Şengür

PII: S0306-9877(19)30997-1
DOI: https://doi.org/10.1016/j.mehy.2019.109431
Reference: YMEHY 109431

To appear in: Medical Hypotheses

Received Date: 6 September 2019
Revised Date: 30 September 2019
Accepted Date: 10 October 2019
Please cite this article as: U. Budak, Y. Guo, E. Tanyildizi, A. Şengür, Cascaded Deep Convolutional Encoder-Decoder Neural Networks for Efficient Liver Tumor Segmentation, Medical Hypotheses (2019), doi: https://doi.org/10.1016/j.mehy.2019.109431
This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
© 2019 Published by Elsevier Ltd.
Cascaded Deep Convolutional Encoder-Decoder Neural Networks for Efficient Liver Tumor Segmentation

Ümit Budak1, Yanhui Guo2, Erkan Tanyildizi3 & Abdulkadir Şengür3

1 Electrical and Electronics Engineering Dept., Engineering Faculty, Bitlis Eren University, Bitlis, Turkey.
2 Department of Computer Science, University of Illinois at Springfield, Springfield, IL 62703, USA.
3 Department of Electrical and Electronics Engineering, Technology Faculty, Firat University, Elazig, Turkey.
Abstract

Liver and hepatic tumor segmentation remains a challenging problem in Computed Tomography (CT) image analysis due to shape variation and vague boundaries. The general hypothesis is that deep learning methods produce improved results on medical image segmentation. This paper formulates the segmentation of liver tumors in abdominal CT images as a classification problem and solves it using a cascaded classifier framework based on deep convolutional neural networks. Two deep encoder-decoder convolutional neural networks (EDCNNs) were constructed and trained in a cascade to segment both the liver and its lesions in CT images with a limited quantity of images. In other words, the first EDCNN segments the liver, and its output serves as the input for training a second EDCNN, which then segments the tumor regions within the liver ROI predicted by the first EDCNN. Segmenting the hepatic tumors inside the liver ROI also significantly reduces false positives. The proposed model was tested on a public dataset (3DIRCADb), and several metrics were used to quantitatively evaluate its performance. The proposed method produced an average DICE score of 95.22% on the test set of CT images, and was compared with several existing methods. The experimental results demonstrate that the proposed EDCNN achieves improved segmentation accuracy over some existing methods.

Keywords: Cascaded Network; Convolutional Neural Network; Encoder-Decoder Network; Liver Segmentation
1. Introduction

According to the statistics, liver cancer is the fifth most commonly occurring cancer in men and the ninth in women, with 840,000 new cases reported worldwide in 2018 alone [1]. However, the number of new cases could be significantly reduced through the early detection of liver cancer. Imaging technologies such as Computerized Tomography (CT) and Magnetic Resonance (MR) are significantly important for monitoring the liver structure for potential early diagnosis and treatment of liver cancer. However, physicians generally evaluate CT and MR images manually, which is a time-consuming and operator-dependent process. Researchers have therefore developed various computer-aided diagnosis (CAD) systems to reduce the workload of physicians in this area. Häme [2] proposed a two-staged segmentation algorithm for efficient liver tumor determination in CT images: the first stage applies thresholding and morphological operations for coarse liver segmentation, and the second stage refines the coarse result using fuzzy clustering and a geometric deformable model. Linguraru et al. [3] explored the potential of normalized probabilistic atlases for automatic liver and spleen segmentation. Li et al. [4] proposed an approach based on a unified level set method for automatic liver tumor segmentation, eliminating the boundary leakage problem by integrating other object information into the object indication function. Huang et al. [5] proposed a liver tumor detection method based on an ensemble of extreme learning machines (ELMs). Ben-Cohen et al. [6] used fully convolutional neural networks (FCNs) for liver segmentation in CT images. Besides liver segmentation, researchers have also been able to detect liver metastases. Li et al. [7] used CNNs for automatic segmentation of liver tumor regions in CT images, training a seven-layer CNN for the tumor segmentation task using small patches obtained from the related regions.
Christ et al. [8] proposed a cascaded FCNs model with 3D conditional random fields (CRF) for both liver and tumor segmentation in CT images. CRF was used to refine the segmentation results from a cascaded-fully CNN model. Hu et al. [9] proposed a 3D CNN that used global optimization for accurate liver segmentation in CT images. The 3D CNN was used to determine the surface of the liver region in CT images, and an optimization procedure was applied based on local and global information for refining
the segmentation. Lu et al. [10] combined 3D CNNs and graph-cut methods for efficient localization of the liver region in CT images: after a probability map was obtained with the 3D CNNs, accurate segmentation was ensured by the graph-cut method. Li et al. [11] proposed a hybrid densely-connected U-Net model for liver and tumor segmentation in CT frames, trained in an end-to-end manner. Vivanti et al. [12] proposed a four-staged approach for liver and tumor detection in liver CT frames, the stages being deformable registration, segmentation of the liver region, training of a CNN model, and tumor detection. Shin et al. [13] investigated the use of various CNN architectures for CAD problems, studying the effect of the scale and spatial context of the CT images on CNN performance; the effect of transfer learning in CAD problems was also investigated. Dou et al. [14] proposed a new 3D deeply supervised network model for automatic segmentation of the liver region in CT volumes; the proposed model employed the FCN architecture, trained in an end-to-end manner. As can be seen from the reviewed literature, the recent research trend in liver and tumor detection in CT images is the deep learning method, with outstanding achievements having been reported, and new deep learning methods for liver and tumor segmentation are proposed practically every day. In the current study, a new deep learning approach is proposed that uses cascaded Encoder-Decoder Convolutional Neural Networks (CEDCNNs) for both liver and tumor detection in CT images. As liver and tumor segmentation performance has not been high in previous methods, the intent of the current study is to improve on the existing segmentation accuracy for both liver and tumor regions.
The first EDCNN is used for liver segmentation and the second for tumor detection. The study evaluates the performance of the proposed method on the 3DIRCADb dataset [15], with the obtained results evaluated quantitatively using the Dice Score (DICE), Volumetric Overlap Error (VOE), Relative Volume Difference (RVD), Average Symmetric Surface Distance (ASSD), and Maximum Surface Distance (MSD).
The remainder of the paper is structured as follows: Section 2 describes the proposed model; Section 3 provides the dataset description, the experimental results, and the discussion; and conclusions are drawn in Section 4.

2. Proposed Method

2.1. Hypothesis

Recent studies have reported deep learning methods producing outstanding results on medical image segmentation. The improvement is especially clear when the results of traditional segmentation approaches are compared with deep learning approaches on liver segmentation. Thus, in the current study, a new deep learning approach is proposed for efficient liver and tumor segmentation using cascaded Encoder-Decoder Convolutional Neural Networks (CEDCNNs). As liver and tumor segmentation performance has not been high in previous methods, the current study intends to improve segmentation accuracy for both liver and tumor regions.

2.2. Preprocessing

Several preprocessing steps, as performed in [8], were adopted to make the CT slices more suitable for segmentation with the CEDCNN. The Hounsfield unit (HU) values are windowed to the range [−100, 400] to exclude irrelevant organs and objects and to increase the contrast of the liver region. Histogram equalization is applied after Hounsfield windowing. Figure 1 shows liver CT images both before and after the preprocessing steps; as can be seen in Figure 1, the contrast within the liver region is enhanced.
Fig. 1. Example image with preprocessing steps: (a) raw CT slice; (b) HU-windowed image in the range [−100, 400]; (c) image after histogram equalization
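The preprocessing stage (HU windowing to [−100, 400] followed by histogram equalization) can be sketched as below. The rescaling to 8-bit and the plain CDF-based equalization are implementation assumptions, as the paper does not specify these details:

```python
import numpy as np

def preprocess(ct_slice_hu):
    """HU-window a raw CT slice to [-100, 400], then histogram-equalize."""
    # Saturate Hounsfield units outside [-100, 400]: values beyond this
    # window belong to irrelevant structures (air, bone, etc.).
    windowed = np.clip(ct_slice_hu, -100, 400)
    # Rescale the windowed range to 8-bit for histogram equalization.
    img = ((windowed + 100) / 500.0 * 255).astype(np.uint8)
    # Plain histogram equalization via the CDF of the intensity histogram.
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum().astype(float)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min() + 1e-12)
    lut = (cdf * 255).astype(np.uint8)
    return lut[img]
```

The lookup-table step spreads the liver-range intensities over the full 8-bit range, which is what makes the liver tissue more visible in Figure 1(c).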
2.3. Proposed CEDCNN

A flowchart of the proposed approach is presented in Figure 2. As can be seen in Figure 2, the input CT slices are initially preprocessed for contrast enhancement. The preprocessing significantly increases the contrast of the CT slices, making the liver tissue more visible, which is important for the subsequent processes. As the raw CT slices contain not only the liver but also other tissues and organs, the preprocessing commences with the removal of some irrelevant regions whose intensities differ from that of the liver; the HU-windowing approach, as previously applied in [8], is performed at this stage. Histogram equalization is then employed after HU windowing to increase the image contrast. After these preprocessing steps, the image is used as input to the first CNN.
Fig. 2. Flowchart of the proposed approach (training and testing sets → preprocessing → encoder-decoder CNN for liver segmentation → extraction of the liver ROI → encoder-decoder CNN for tumor segmentation)
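The "Extract Liver ROI" step of the flowchart is not specified in detail in the paper; a plausible minimal implementation, assuming a bounding-box crop of the predicted liver mask with a small safety margin (the `margin` parameter and function name are illustrative assumptions), is:

```python
import numpy as np

def liver_roi(ct_slice, liver_mask, margin=10):
    """Crop a CT slice to the bounding box of the predicted liver mask
    (plus a safety margin), to be fed to the tumor-segmentation network."""
    ys, xs = np.nonzero(liver_mask)
    if ys.size == 0:
        return None  # no liver predicted in this slice
    y0 = max(ys.min() - margin, 0)
    y1 = min(ys.max() + margin + 1, ct_slice.shape[0])
    x0 = max(xs.min() - margin, 0)
    x1 = min(xs.max() + margin + 1, ct_slice.shape[1])
    return ct_slice[y0:y1, x0:x1]
```

Restricting the second network to this crop is what reduces false positives: tumor candidates outside the predicted liver are never presented to the tumor classifier.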
The EDCNN structure consists of two parts (see Figure 3). The first part is termed the encoder-network and the second the decoder-network. The encoder and decoder networks form a symmetrical structure, and the final part is the pixel-wise classification layer. The proposed architecture is very similar to that of the SegNet model [16]. In the classical SegNet architecture, the encoder-network uses the first 13 convolutional layers of the VGG16 model [17]; as the decoder-network is symmetric to the encoder-network, it also consists of 13 convolutional layers. The output of the decoder-network is fed to the softmax classifier function to produce class probabilities for each pixel. In the encoder, a filter bank is used in the convolution operation to extract feature maps, which are then batch-normalized. An element-wise rectified linear unit (ReLU) non-linearity is applied after batch normalization. Finally, max pooling with a 2×2 window and stride 2 is applied to the ReLU output, followed by sub-sampling. Max pooling and sub-sampling are applied to ensure translation invariance over small spatial shifts in the input image. Mirroring the encoder-network, the decoder-network contains decoders responsible for up-sampling the input feature maps. A convolution operation with a trainable decoder filter bank is then applied to the up-sampled maps to produce dense feature maps, and batch normalization is applied to each of these feature maps. The last decoder in the decoder-network produces a multi-channel feature map corresponding to the input image; this map is fed to the softmax classifier function to label the pixels of the input image.
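The pooling indices that link each encoder stage to its corresponding decoder stage (see Figure 3) can be illustrated with a minimal single-channel NumPy sketch of 2×2 max pooling and the matching sparse unpooling; this is a simplified illustration of the SegNet-style mechanism, not the actual network code:

```python
import numpy as np

def max_pool_with_indices(x, k=2):
    """k x k max pooling with stride k; records the argmax position inside
    each window so the decoder can unpool the value to the same location."""
    h, w = x.shape
    out = np.zeros((h // k, w // k))
    idx = np.zeros((h // k, w // k), dtype=int)  # flat index within window
    for i in range(h // k):
        for j in range(w // k):
            win = x[i * k:(i + 1) * k, j * k:(j + 1) * k]
            idx[i, j] = int(np.argmax(win))
            out[i, j] = win.flat[idx[i, j]]
    return out, idx

def max_unpool(x, idx, k=2):
    """Sparse upsampling: each value is placed at its remembered argmax
    position; all other positions stay zero."""
    h, w = x.shape
    out = np.zeros((h * k, w * k))
    for i in range(h):
        for j in range(w):
            r, c = divmod(idx[i, j], k)
            out[i * k + r, j * k + c] = x[i, j]
    return out
```

Reusing the encoder's pooling indices lets the decoder restore boundary detail without learning the upsampling, which is the key difference between SegNet-style decoders and plain interpolation.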
Fig. 3. Proposed encoder-decoder network architecture (convolution + batch normalisation + ReLU blocks; pooling, with pooling indices passed to the corresponding upsampling layers; and a final softmax classifier)
The SegNet architecture has an encoder/decoder depth of five: two convolution + batch normalization + ReLU layers are employed at the first and second depths, and three at the remaining depths. The five encoder/decoder depths use 64, 128, 256, 512, and 512 convolution filters, respectively.
Table 1. Structure and configuration of each EDCNN model

|         | Depth | Layer                        | Parameters                                                          |
|---------|-------|------------------------------|---------------------------------------------------------------------|
| Encoder | 1     | 2×(Conv.+Batch Norm.+ReLU), Max-pooling | 3×3×64 conv. filter, stride 1, padding 1; window size 2×2, stride 2 |
|         | 2     | 2×(Conv.+Batch Norm.+ReLU), Max-pooling | 3×3×64 conv. filter, stride 1, padding 1; window size 2×2, stride 2 |
|         | 3     | 2×(Conv.+Batch Norm.+ReLU), Max-pooling | 3×3×64 conv. filter, stride 1, padding 1; window size 2×2, stride 2 |
|         | 4     | 2×(Conv.+Batch Norm.+ReLU), Max-pooling | 3×3×64 conv. filter, stride 1, padding 1; window size 2×2, stride 2 |
|         | 5     | 2×(Conv.+Batch Norm.+ReLU), Max-pooling | 3×3×64 conv. filter, stride 1, padding 1; window size 2×2, stride 2 |
| Decoder | 5     | Max-unpooling, 2×(Conv.+Batch Norm.+ReLU) | 3×3×64 conv. filter, stride 1, padding 1                           |
|         | 4     | Max-unpooling, 2×(Conv.+Batch Norm.+ReLU) | 3×3×64 conv. filter, stride 1, padding 1                           |
|         | 3     | Max-unpooling, 2×(Conv.+Batch Norm.+ReLU) | 3×3×64 conv. filter, stride 1, padding 1                           |
|         | 2     | Max-unpooling, 2×(Conv.+Batch Norm.+ReLU) | 3×3×64 conv. filter, stride 1, padding 1                           |
|         | 1     | Max-unpooling, 2×(Conv.+Batch Norm.+ReLU) | 3×3×64 conv. filter, stride 1, padding 1                           |
|         | –     | Softmax Pixel Classification | –                                                                   |
Table 1 shows the details of each EDCNN structure constructed for the proposed method. Unlike the SegNet model, the proposed model employs two convolution + batch normalization + ReLU layers at each encoder/decoder depth. Moreover, the number of filters used in each convolution layer is 64, which greatly reduces the number of learnable parameters, from about 30M in the SegNet architecture to 0.7M in the proposed network. This ensures that the network can be trained on a GPU with limited memory and processing capacity, and also serves to investigate whether a small number of filters is sufficient to segment the liver images with a high degree of accuracy. The current study used two identical CNN models: the first was employed for liver segmentation from the preprocessed input images, and the second was used to detect tumors in the predicted liver regions.
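The roughly 0.7M figure can be checked with a back-of-the-envelope count of the convolutional parameters in Table 1. The single-channel input and two-class softmax output assumed below are not stated explicitly in the paper:

```python
def conv_params(c_in, c_out, k=3):
    """Learnable parameters (weights + biases) of one k x k convolution."""
    return k * k * c_in * c_out + c_out

# Assumed (not stated explicitly in the paper): single-channel CT input
# and a two-class (foreground / background) map fed to the softmax.
encoder = conv_params(1, 64) + 9 * conv_params(64, 64)  # 2 convs x 5 depths
decoder = 10 * conv_params(64, 64)                      # 2 convs x 5 depths
classifier = conv_params(64, 2)                         # per-pixel class scores
total = encoder + decoder + classifier
print(total)  # 703426, i.e. roughly the 0.7M parameters quoted in the text
```

Keeping every layer at 64 filters is what flattens the count: a 3×3 convolution between 64-channel maps costs only 36,928 parameters, versus over 2.3M for a 512-to-512 layer in the original SegNet.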
3. Experimental Setup and Results

3.1 Dataset

The proposed method was evaluated on the 3DIRCADb dataset (3D Image Reconstruction for Comparison of Algorithm Database) [18], which contains a total of 20 CT scans, of which 75% have hepatic tumors. A group of experts at the University Hospital in Strasbourg, France (Centre Hospitalier et Universitaire), provided the 3D medical images and segmented masks as DICOM files. Detailed information about the dataset is given in Table 2. The size of each CT slice is 512×512 pixels; the pixel width and height vary from 0.56 to 0.87 mm, the thickness between two slices varies from 1.25 to 4 mm, and the number of slices ranges from 74 to 260 across image sets. The 3DIRCADb dataset is considered challenging because of the high variety and complexity of the livers and their tumors. 15 CT volumes were used to train the proposed 2D EDCNN approach, with the remaining five CT volumes used for testing.

Table 2. Technical information about the 3DIRCADb dataset

| CT number | Gender | Voxel size W×H×D (mm) | Size + number of slices | Liver pathology | Usage    |
|-----------|--------|-----------------------|-------------------------|-----------------|----------|
| 1         | F      | 0.57×0.57×1.6         | 512×512×129             | 7 tumors        | Testing  |
| 2         | F      | 0.78×0.78×1.6         | 512×512×172             | 1 tumor         | Training |
| 3         | M      | 0.62×0.62×1.25        | 512×512×200             | 1 tumor         | Training |
| 4         | M      | 0.74×0.74×2.0         | 512×512×91              | 7 tumors        | Testing  |
| 5         | M      | 0.78×0.78×1.6         | 512×512×139             | 0 tumors        | Testing  |
| 6         | M      | 0.78×0.78×1.6         | 512×512×135             | 20 tumors       | Training |
| 7         | M      | 0.78×0.78×1.6         | 512×512×151             | 0 tumors        | Training |
| 8         | F      | 0.56×0.56×1.6         | 512×512×124             | 3 tumors        | Training |
| 9         | M      | 0.87×0.87×2.0         | 512×512×111             | 1 tumor         | Testing  |
| 10        | F      | 0.73×0.73×1.6         | 512×512×122             | 8 tumors        | Training |
| 11        | M      | 0.72×0.72×1.6         | 512×512×132             | 0 tumors        | Training |
| 12        | F      | 0.68×0.68×1.0         | 512×512×260             | 1 tumor         | Training |
| 13        | M      | 0.67×0.67×1.6         | 512×512×122             | 20 tumors       | Training |
| 14        | F      | 0.72×0.72×1.6         | 512×512×113             | 0 tumors        | Training |
| 15        | F      | 0.78×0.78×1.6         | 512×512×125             | 2 tumors        | Training |
| 16        | M      | 0.70×0.70×1.6         | 512×512×155             | 1 tumor         | Training |
| 17        | M      | 0.74×0.74×1.6         | 512×512×119             | 2 tumors        | Testing  |
| 18        | F      | 0.74×0.74×2.5         | 512×512×74              | 1 tumor         | Training |
| 19        | F      | 0.70×0.70×4.0         | 512×512×124             | 46 tumors       | Training |
| 20        | F      | 0.81×0.81×2.0         | 512×512×225             | 0 tumors        | Training |
3.2 Evaluation Criteria

To evaluate the quality of a set of machine-segmented voxels, the ground truths created by the experts were used. Different quantitative metrics exist for this purpose; the most frequently used are volumetric overlap and surface distance measures, which are different mathematical statements of similarity and distance. The preferred criterion depends on the purpose of the evaluation: for example, when measuring liver volume, a volumetric error criterion may be preferred over a surface distance criterion, because accurate volume estimation is the primary objective. For a general evaluation of segmentation quality, however, a variety of criteria should be used. For instance, a segmentation result may be very similar to the manually segmented reference over the largest part of the organ while deviating excessively within a small localized area; in this case, the maximum distance error will be high while the average distance error remains low. For these reasons, five metrics were used to evaluate the quality of the segmentation, described in detail as follows.

Dice Score (DICE): The DICE similarity coefficient, or F1 measure, is one of the most commonly applied assessment measures in image segmentation [19, 20]. All metrics are applied to binary volumes (sets of voxels), as shown in Figure 4.
Fig. 4. Demonstration of true positive (TP), false positive (FP) and false negative (FN) voxels in comparison of machine segmentation (MS) results and ground truth (GT)
Let MS be the predicted object and GT be the ground truth object. In this case, the DICE similarity coefficient is calculated as:
$$\mathrm{DICE}(GT, MS) = \frac{2\,|GT \cap MS|}{|GT| + |MS|} = \frac{2\,TP}{2\,TP + FP + FN}$$
The DICE score lies in the range [0, 1], and a score of 1 indicates a perfect segmentation.

Volumetric Overlap Error (VOE): VOE is derived from the Jaccard coefficient and can be defined as:

$$\mathrm{VOE}(GT, MS) = 1 - \frac{|GT \cap MS|}{|GT \cup MS|} = 1 - \frac{TP}{TP + FP + FN}$$
Relative Volume Difference (RVD): This criterion is a non-symmetric measure of the difference between two segmented volumes and can be defined as:

$$\mathrm{RVD}(GT, MS) = \frac{|MS| - |GT|}{|GT|} = \frac{FP - FN}{TP + FN}$$
Average Symmetric Surface Distance (ASSD): Let S(GT) denote the set of surface voxels of GT. The shortest distance of an arbitrary voxel v to S(GT) is:

$$d(v, S(GT)) = \min_{s_{GT} \in S(GT)} d(v, s_{GT})$$

where d(v, s_GT) is the Euclidean distance between the voxels, taking the real spatial resolution of the slices into account. The average symmetric surface distance is then defined as:

$$\mathrm{ASSD}(GT, MS) = \frac{1}{|S(GT)| + |S(MS)|} \left( \sum_{s_{GT} \in S(GT)} d(s_{GT}, S(MS)) + \sum_{s_{MS} \in S(MS)} d(s_{MS}, S(GT)) \right)$$
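The volumetric overlap metrics defined above (DICE, VOE, and the signed RVD) can be computed directly from binary masks; a minimal sketch:

```python
import numpy as np

def overlap_metrics(gt, ms):
    """DICE, VOE, and signed RVD from binary ground-truth (gt) and
    machine-segmentation (ms) masks, following the formulas above."""
    gt, ms = gt.astype(bool), ms.astype(bool)
    tp = np.logical_and(gt, ms).sum()
    fp = np.logical_and(~gt, ms).sum()
    fn = np.logical_and(gt, ~ms).sum()
    dice = 2 * tp / (2 * tp + fp + fn)
    voe = 1 - tp / (tp + fp + fn)            # Jaccard-based overlap error
    rvd = (ms.sum() - gt.sum()) / gt.sum()   # signed relative volume difference
    return dice, voe, rvd
```

Since |MS| = TP + FP and |GT| = TP + FN, the RVD line reduces to (FP − FN)/(TP + FN), matching the definition above.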
Maximum Surface Distance (MSD): This metric, also known as the Hausdorff distance, measures the maximum rather than the average distance between the nearest points of the two sets of surface voxels:

$$\mathrm{MSD}(GT, MS) = \max\left\{ \max_{s_{GT} \in S(GT)} d(s_{GT}, S(MS)),\; \max_{s_{MS} \in S(MS)} d(s_{MS}, S(GT)) \right\}$$

3.3 Experimental Results

The 3DIRCADb dataset was used in the experimental work [18]. This dataset consists of 20 sets of CT slices, each with a different voxel size in real spatial resolution. A random selection of 15 sets (2,234 slices) was used for training, and the remaining five sets (589 slices) were used for testing (see Table 2 for details). No data augmentation was performed during CEDCNN training. All experiments were run on a server with an NVIDIA Quadro M6000 GPU with 24 GB of memory and 3,072 CUDA cores. In the training process, stochastic gradient descent was employed as the optimizer. The learning rate was gradually decreased from 0.1 to 0.00625, with a drop factor of 0.5 applied every 20 epochs. The network was trained for 100 epochs with a batch size of 6. The output of the first CNN consisted of binary predicted liver regions, from which ROIs were extracted and used as the input to the second CNN; the same parameters were used for the second CNN. Figure 5 presents some of the segmentation results. As can be seen in Figure 5, the blue and green regions depict liver and tumor regions, respectively. The red regions at the edge of the liver show errors in liver segmentation relative to the ground-truth liver segmentation; similarly, the yellow regions around the tumors show errors in tumor segmentation relative to the ground-truth tumor segmentation.
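The learning-rate schedule described above (an initial rate of 0.1 halved every 20 epochs, which indeed ends at 0.1 × 0.5⁴ = 0.00625 over a 100-epoch run) corresponds to a simple step decay; a minimal sketch, with names chosen for illustration:

```python
def learning_rate(epoch, lr0=0.1, drop=0.5, every=20):
    """Step-decay schedule: multiply the initial rate by `drop` every
    `every` epochs."""
    return lr0 * drop ** (epoch // every)

# Epochs 0-19 train at 0.1; epochs 80-99 train at 0.1 * 0.5**4 = 0.00625.
```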
Fig. 5. Demonstration of the confusion matrix on CT slices. Blue and green show correctly predicted (TP) liver and tumor voxels; red and yellow show wrongly predicted (FP and FN) voxels for liver and tumor segmentation, respectively
Furthermore, the liver segmentation results were evaluated for each test set using the VOE, RVD, ASSD, MSD, and DICE criteria. Quantitative results are presented in Table 3. The first column of Table 3 shows the CT numbers of the test sets, whilst the other columns show the evaluation metrics. As seen in the last column of Table 3, the proposed method produced DICE scores ranging from 93.35% to 98.82%: the highest DICE score (98.82%) was obtained on Test Set 01 and the lowest (93.35%) on Test Set 17. The calculated VOE, RVD, ASSD, and MSD values on the test CT sets are also presented in Table 3 and were consistent with the DICE scores: the VOE, RVD, ASSD, and MSD scores were in the ranges of [2.32, 12.48]%, [2.18, 12.29]%, [0.52, 1.86] mm, and [7.24, 33.26] mm, respectively. The last row of Table 3 shows the average evaluation scores: the average VOE, RVD, ASSD, MSD, and DICE scores were 9.05%, 7.03%, 1.43 mm, 19.37 mm, and 95.22%, respectively.
Table 3. Quantitative evaluation of liver segmentation results

| Test CTs | VOE (%) | RVD (%) | ASSD (mm) | MSD (mm) | DICE (%) |
|----------|---------|---------|-----------|----------|----------|
| 01       | 2.32    | 2.18    | 0.52      | 7.24     | 98.82    |
| 04       | 12.21   | 12.29   | 1.58      | 26.49    | 93.50    |
| 05       | 7.28    | 4.20    | 1.86      | 33.26    | 96.22    |
| 09       | 10.97   | 8.53    | 1.45      | 10.24    | 94.20    |
| 17       | 12.48   | 7.95    | 1.76      | 19.62    | 93.35    |
| Average  | 9.05    | 7.03    | 1.43      | 19.37    | 95.22    |
The liver segmentation performance of the proposed method was compared with various previously published methods (see Table 4). The values shown in bold typeface in Table 4 indicate the best results. As can be seen in Table 4, according to the ASSD, MSD, and DICE scores, the proposed method outperformed the other methods: the DICE score of the proposed method is 0.72% and 0.92% higher than the DICE scores of Li et al. [23] and Christ et al. [8], respectively. Based on the RVD score, Christ et al. [8] produced the best result, whereas based on the VOE score, Chartrand et al. [22] performed best with a calculated VOE of 6.8%.

Table 4. Comparison of the proposed method and previous works for liver segmentation results (best results marked in bold)

| Method                | VOE (%) | RVD (%)  | ASSD (mm) | MSD (mm)  | DICE (%)  |
|-----------------------|---------|----------|-----------|-----------|-----------|
| Christ et al. [8]     | 10.7    | **−1.4** | 1.5       | 24.0      | 94.3      |
| Li et al. [21]        | 9.2     | −11.2    | 1.6       | 28.2      | –         |
| Chartrand et al. [22] | **6.8** | 1.7      | 1.6       | 24        | –         |
| Li et al. [23]        | –       | –        | –         | –         | 94.5      |
| Shi et al. [24]       | 8.74    | 2.41     | 1.45      | 26.91     | –         |
| Huang et al. [25]     | 7.84    | 3.42     | 1.97      | 37.05     | –         |
| Proposed              | 9.05    | 7.03     | **1.43**  | **19.37** | **95.22** |
As the main evaluation criterion of experts is known as volumetric overlap, the 3D surface rendering results of each test set are also presented for both the predicted and manually segmented slices, as can be seen in Figure 6. The 3D segmentation results with real spatial resolution were created using the “Medical Image Processing Toolbox” [26] in the MATLAB environment, and the ParaView open-source platform was used for 3D visualization [27]. The first row of Figure 6 shows the surface volumes that were manually segmented by the experts, whilst the second row shows the
predicted surface volumes. It is clear that the proposed method predicted surface volumes that were quite similar to the liver region marked by the expert physicians.
Fig. 6. 3D visualization of liver surfaces for GT (first row) and MS (second row)
It is worth mentioning the tumor segmentation results of this experiment. Tumor segmentation is considerably more challenging than liver segmentation, as also noted in [8]. The proposed method produced an average DICE score of 64.3% for tumor segmentation, which is quite low compared to the average DICE score of 95.22% for liver segmentation. The tumor segmentation performance of the proposed method was also compared with that of Christ et al. [8], as shown in Table 5.

Table 5. Comparison of tumor segmentation achievements (best result marked in bold)

| Method            | Average DICE (%)  |
|-------------------|-------------------|
| Christ et al. [8] | 61 ± 25           |
| Proposed          | **64.3 ± 34.6**   |
As can be seen from Table 5, the tumor segmentation performance of the proposed method was better than that obtained in [8]; on average, the proposed method performs 3.3% better than the compared method.

4. Conclusions

Recent advances have made it possible to obtain a preoperative 3D model of a patient, a kind of digital clone of the real patient. This can be achieved by transferring medical information obtained from scan images to 3D modeling software. Such data can be used to guide surgeons before and during the actual operation, and also postoperatively for anatomical training or the simulation of medical procedures. It also allows researchers to evaluate the performance of developed segmentation methods. The current study proposed two encoder-decoder convolutional neural networks to segment the liver and its tumors in CT images. Each EDCNN was constructed from an encoder-network, a decoder-network symmetric to the encoder-network, and a pixel-wise classification layer. While the first EDCNN segments the liver in CT slices, the second uses the results produced by the first EDCNN to segment hepatic tumors. The CEDCNN model was trained and tested on the 3DIRCADb dataset, and several metrics (DICE, VOE, RVD, ASSD, and MSD) were used to evaluate the segmentation accuracy quantitatively. The experimental results were compared with some recently published results, demonstrating that the proposed CEDCNN achieves better segmentation accuracy than other existing methods for the task of liver segmentation. Finally, as a point of self-criticism, the authors note that the current study was unable to achieve the desired performance on hepatic tumors, but believe that the proposed method is well suited to multiple-organ segmentation tasks. In future work, the authors will focus on increasing the training speed and efficiency of the EDCNN on a quantity-limited dataset.

References

[1] https://www.wcrf.org/dietandcancer/cancer-trends/liver-cancer-statistics, last accessed: 17.09.2018
[2] Häme, Y. (2008). Liver tumor segmentation using implicit surface evolution. Proceedings of the MICCAI Workshop on 3D Segmentation in the Clinic: A Grand Challenge II.
[3] Linguraru, M. G., Sandberg, J. K., Li, Z., Shah, F., & Summers, R. M. (2010).
Automated segmentation and quantification of liver and spleen from CT images using normalized probabilistic atlases and enhancement estimation. Medical physics, 37(2), 771-783.
[4] Li, B. N., Chui, C. K., Chang, S., & Ong, S. H. (2011). Integrating spatial fuzzy clustering with level set methods for automated medical image segmentation. Computers in biology and medicine, 41(1), 1-10. [5] Huang, W., Yang, Y., Lin, Z., Huang, G. B., Zhou, J., Duan, Y., & Xiong, W. (2014, August). Random feature subspace ensemble based extreme learning machine for liver tumor detection and segmentation. In Engineering in Medicine and Biology Society (EMBC), 2014 36th Annual International Conference of the IEEE (pp. 4675-4678). IEEE. [6] Ben-Cohen, A., Diamant, I., Klang, E., Amitai, M., & Greenspan, H. (2016). Fully convolutional network for liver segmentation and lesions detection. In Deep Learning and Data Labeling for Medical Applications (pp. 77-85). Springer, Cham. [7] Li, W., Jia, F., & Hu, Q. (2015). Automatic segmentation of liver tumor in CT images with deep convolutional neural networks. Journal of Computer and Communications, 3(11), 146. [8] Christ, P. F., Ettlinger, F., Grün, F., Elshaera, M. E. A., Lipkova, J., Schlecht, S., ... & Rempfler, M. (2017). Automatic liver and tumor segmentation of ct and mri volumes using cascaded fully convolutional neural networks. arXiv preprint arXiv:1702.05970. [9] Hu, P., Wu, F., Peng, J., Liang, P., & Kong, D. (2016). Automatic 3D liver segmentation based on deep learning and globally optimized surface evolution. Physics in Medicine & Biology, 61(24), 8676. [10] Lu, F., Wu, F., Hu, P., Peng, Z., & Kong, D. (2017). Automatic 3D liver location and segmentation via convolutional neural network and graph cut. International journal of computer assisted radiology and surgery, 12(2), 171-182. [11] Li, X., Chen, H., Qi, X., Dou, Q., Fu, C. W., & Heng, P. A. (2017). H-DenseUNet: Hybrid densely connected UNet for liver and liver tumor segmentation from CT volumes. arXiv preprint arXiv:1709.07330.
[12] Vivanti, R., Ephrat, A., Joskowicz, L., Karaaslan, O. A., Lev-Cohain, N., & Sosna, J. (2015). Automatic liver tumor segmentation in follow up CT studies using convolutional neural networks. In Proc. Patch-Based Methods in Medical Image Processing Workshop (Vol. 2). [13] Hoo-Chang, S., Roth, H. R., Gao, M., Lu, L., Xu, Z., Nogues, I., ... & Summers, R. M. (2016). Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE transactions on medical imaging, 35(5), 1285. [14] Dou, Q., Chen, H., Jin, Y., Yu, L., Qin, J., & Heng, P. A. (2016, October). 3D deeply supervised network for automatic liver segmentation from CT volumes. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 149-157). Springer, Cham. [15] Soler, L., Hostettler, A., Agnus, V., Charnoz, A., Fasquel, J. B., Moreau, J., ... & Marescaux, J. (2010). 3D Image reconstruction for comparison of algorithm database: A patient specific anatomical and medical image database. [16] V. Badrinarayanan, A. Kendall, and R. Cipolla, “Segnet: A deep convolutional encoder-decoder architecture for image segmentation,” arXiv:1511.00561, 2015. [17] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014. [18] L. Soler, A. Hostettler, V. Agnus, A. Charnoz, J. Fasquel, J. Moreau, A. Osswald, M. Bouhadjar, J. Marescaux, 3d image reconstruction for comparison of algorithm database: a patient-specific anatomical and medical image database (2012). [19] Grau, V., Mewes, A. U. J., Alcaniz, M., Kikinis, R., & Warfield, S. K. (2004). Improved watershed transform for medical image segmentation using prior information. IEEE transactions on medical imaging, 23(4), 447-458. [20] Linguraru, M. G., Pura, J. A., Pamulapati, V., & Summers, R. M. (2012). Statistical 4D graphs for multi-organ abdominal segmentation from multiphase CT. 
Medical image analysis, 16(4), 904-914.
[21] Li, G., Chen, X., Shi, F., Zhu, W., Tian, J., & Xiang, D. (2015). Automatic liver segmentation based on shape constraints and deformable graph cut in CT images. IEEE Transactions on Image Processing, 24(12), 5315-5329.
[22] Chartrand, G., Cresson, T., Chav, R., Gotra, A., Tang, A., & DeGuise, J. (2014, April). Semi-automated liver CT segmentation using Laplacian meshes. In Biomedical Imaging (ISBI), 2014 IEEE 11th International Symposium on (pp. 641-644). IEEE.
[23] Li, C., Wang, X., Eberl, S., Fulham, M., Yin, Y., Chen, J., & Feng, D. D. (2013). A likelihood and local constraint level set model for liver tumor segmentation from CT volumes. IEEE Transactions on Biomedical Engineering, 60(10), 2967-2977.
[24] Shi, C., Cheng, Y., Liu, F., Wang, Y., Bai, J., & Tamura, S. (2016). A hierarchical local region-based sparse shape composition for liver segmentation in CT scans. Pattern Recognition, 50, 88-106.
[25] Huang, L., Weng, M., Shuai, H., Huang, Y., Sun, J., & Gao, F. (2016). Automatic liver segmentation from CT images using single-block linear detection. BioMed Research International, 2016, Article ID 9420148, 11 pages. https://doi.org/10.1155/2016/9420148.
[26] https://www.mathworks.com/matlabcentral/fileexchange/41594-medical-image-processing-toolbox?focused=6499081&tab=function, last accessed: 21.09.2018
[27] https://www.paraview.org/, last accessed: 21.09.2018