Accepted Manuscript
A Novel Retinal Vessel Detection Approach Based on Multiple Deep Convolution Neural Networks

Yanhui Guo, Ümit Budak, Abdulkadir Şengür

PII: S0169-2607(18)30781-8
DOI: https://doi.org/10.1016/j.cmpb.2018.10.021
Reference: COMM 4809
To appear in: Computer Methods and Programs in Biomedicine

Received date: 23 May 2018
Revised date: 12 October 2018
Accepted date: 29 October 2018
Please cite this article as: Yanhui Guo, Ümit Budak, Abdulkadir Şengür, A Novel Retinal Vessel Detection Approach Based on Multiple Deep Convolution Neural Networks, Computer Methods and Programs in Biomedicine (2018), doi: https://doi.org/10.1016/j.cmpb.2018.10.021
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
ACCEPTED MANUSCRIPT
Highlights
- This study formulates the retinal vessel detection task as a classification problem and solves it using a multiple classifier framework based on deep convolutional neural networks.
- The MDCNN is trained using an incremental learning strategy to improve the networks' performance.
- The final classification results are obtained from a voting procedure on the results of the MDCNN.
- The MDCNN achieves strong performance for automatic retinal vessel segmentation on the DRIVE and STARE datasets.
A Novel Retinal Vessel Detection Approach Based on Multiple Deep Convolution Neural Networks

Yanhui Guo¹, Ümit Budak², and Abdulkadir Şengür³

¹ Department of Computer Science, University of Illinois, Springfield, Illinois, USA
² Department of Electrical-Electronics Engineering, Bitlis Eren University, Bitlis, Turkey
³ Electrical and Electronics Engineering Department, Firat University, Elazig, Turkey
Abstract
Background and Objective: Computer-aided detection (CAD) offers an efficient way to assist doctors in interpreting fundus images. In a CAD system, retinal vessel (RV) detection is a crucial step for identifying retinal disease regions. However, RV detection is still a challenging problem due to variations in the morphology of the vessels in noisy, low-contrast fundus images.

Methods: In this paper, we formulate the detection task as a classification problem and solve it using a multiple classifier framework based on deep convolutional neural networks. The multiple deep convolutional neural network (MDCNN) is constructed and trained on fundus images with a limited number of images. The MDCNN is trained using an incremental learning strategy to improve the networks' performance. The final classification results are obtained by a voting procedure over the MDCNN results.

Results: The MDCNN achieves strong performance for automatic retinal vessel segmentation on the DRIVE dataset, with 95.97% and 96.13% accuracy and 0.9726 and 0.9737 AUC (area under the receiver operating characteristic curve) scores on the training and testing sets, respectively. Another public dataset, STARE, is also used to evaluate the proposed network; the experimental results demonstrate that the proposed MDCNN achieves 95.39% accuracy and a 0.9539 AUC score on the STARE dataset. We further compare our results with several state-of-the-art methods based on AUC values; the comparison shows that our proposal yields the third-best AUC value.

Conclusions: Our method yields performance competitive with the compared state-of-the-art methods. In addition, our proposal has no preprocessing stage: the input color fundus images are fed into the CNN directly.

Keywords: Retinal vessel segmentation; Multiple deep convolution neural network; Image segmentation.
I. Introduction
A fundus camera is a useful tool to present a detailed view of the back of the eye [1]. It enables the ophthalmologist to screen the retina and detect early signs of potential eye diseases such as arteriosclerosis, hypertension, cardiovascular disease, and diabetes [1]. Manual evaluation of fundus camera images necessitates highly skilled ophthalmologists. In addition,
considering the great number of images from various patients, manual evaluation becomes infeasible in terms of the required time and consistency. Hence, computer-aided detection (CAD) systems for retinal evaluation are in demand. There have been many works on automatic evaluation of fundus images [2–11], [12–15]. The published works can be categorized into two classes, namely image-processing-based and deep-learning (DL)-based methods. The image-processing-based methods cover a range of image processing and machine learning techniques [2–7]. In [2], the authors used 2-D Gabor wavelet filters and a linear classifier for retinal vessel detection. They determined the class label of each pixel (vessel or not) from a feature vector composed of the pixel intensity and multi-scale Gabor wavelet coefficients. A morphology-based retinal vessel segmentation algorithm was introduced by Dash et al. in [3]. The authors first used contrast-limited adaptive histogram equalization (CLAHE) for retinal image enhancement, and vessel segmentation was then achieved by geodesic operators. The obtained segmentation was refined in a post-processing stage. Another retinal vessel segmentation algorithm was introduced by Zhao et al. in [4], where level sets and region growing were combined into an efficient segmentation scheme. As in the previous method [3], image enhancement based on CLAHE and anisotropic diffusion filtering was applied. In [5], the authors proposed a method for retinal vessel segmentation based on the shearlet transform and indeterminacy filtering. The green channel of the retinal image was first transformed into the neutrosophic domain, and shearlet features were then extracted. A neural network classifier was adopted to classify retinal image pixels into vessel and non-vessel categories. A Frangi filter and the structure tensor were applied by Nergiz et al. in [6] for efficient retinal vessel segmentation. The obtained tensor field was converted to a 3-D space (energy, anisotropy, and orientation), and the enhanced 3-D space was then converted to a 4-D tensor field. Otsu thresholding was applied for the final segmentation. Bankhead et al. [7] proposed a simple approach for retinal vessel detection based on the wavelet transform, in which a thresholding method was applied to the obtained wavelet coefficients for vessel extraction.
Convolutional neural networks (CNNs), one of the deep learning methods, have been widely used in the literature [8–15]. Sengur et al. formulated retinal vessel detection as a classification problem and applied a CNN [8]. The adopted CNN architecture has two convolution layers, two pooling layers, one drop-out layer, and one loss layer. Dasgupta et al. proposed a fully convolutional CNN with a structured-prediction approach to segment the blood vessels in retinal images [9]. The proposed architecture has a joint loss function that can learn the class-label dependencies of neighboring pixels. Fu et al. proposed a deep two-stage scheme for retinal vessel extraction in fundus images [10]. It first applies a multi-scale and multi-level CNN with a side-output layer to learn a rich hierarchical representation; a conditional random field is then used to model the long-range interactions between pixels. Maninis et al. presented deep retinal image understanding for both vessel and optic disc segmentation [11]. Maji et al. introduced a framework of deep and ensemble learning for retinal vessel detection in fundus images [12], where an ensemble of CNNs classifies retinal images into vessel and non-vessel regions. In [13], Liskowski et al. proposed a supervised, deep-neural-network-based detection algorithm for retinal vessel segmentation, which uses a set of pre-processing stages, namely contrast normalization, whitening, geometric transformations, and gamma corrections. Lahiri et al. proposed an ensemble of deep learning models for retinal vessel segmentation [14], employing unsupervised hierarchical feature learning with an ensemble of two levels of sparsely trained denoising stacked auto-encoders.
In this paper, we formulate the retinal vessel detection task as a classification problem and solve it using a multiple deep convolutional neural network (MDCNN) as a voted classifier. The proposed MDCNN is trained using an incremental strategy to achieve better performance and faster training than a traditional DCNN. The experimental results show that the MDCNN achieves strong performance, competitive with the state of the art, for automatic retinal vessel segmentation on the DRIVE and STARE datasets.
The remainder of the paper is organized as follows. In the next section, we present the proposed MDCNN model with a novel learning scheme. Then the experimental results are discussed, and the conclusions are presented in the final section.
II. Proposed Method
2.1 Network Construction
In contrast to traditional classification approaches, a CNN is able to extract different features automatically through the adaptation of its multi-layer, feed-forward structure. In the proposed model, the input to the CNN is an ROI patch of size 64×64×1, and the output is the classification result for the pixels in the ROI. We use six types of layers to construct the CNN: convolution, rectified linear unit (ReLU), pooling, deconvolution, softmax, and pixel classification layers. The detailed network structure is shown in Table I.
Table I. Structure and configuration of each CNN model (I: input layer; C: convolution layer; ReLU: rectified linear unit layer; P: pooling layer; D: deconvolution layer; F: fully connected layer; S: softmax layer; PC: pixel classification layer).

| Layer number | Layer type | Input size | Filter size | Pooling size | Stride | Padding | Crop | Output size |
|---|---|---|---|---|---|---|---|---|
| 1 | I | 64×64×1 | - | - | - | - | - | 64×64×1 |
| 2 | C | 64×64×1 | 3×3×64 | - | 1 | 1 | - | 64×64×1 |
| 3 | ReLU | 64×64×1 | - | - | - | - | - | 64×64×1 |
| 4 | P | 64×64×1 | - | 2×2 | 2 | 0 | - | 32×32×1 |
| 5 | C | 32×32×1 | 3×3×64 | - | 1 | 1 | - | 32×32×1 |
| 6 | ReLU | 32×32×1 | - | - | - | - | - | 32×32×1 |
| 7 | D | 32×32×1 | 4×4×64 | - | 2 | - | 1 | 64×64×1 |
| 8 | C | 64×64×1 | 1×1×2 | - | 1 | 0 | - | 64×64×1 |
| 9 | S | 64×64×1 | - | - | - | - | - | 64×64×1 |
| 10 | PC | 64×64×1 | - | - | - | - | - | 64×64×1 |
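For illustration, the layer stack of Table I can be sketched in PyTorch as follows. This is a minimal sketch, not the authors' implementation: the channel widths follow the filter sizes in the table, the deconvolution crop is mapped to `padding=1`, and the pixel classification layer corresponds to taking the per-pixel argmax of the softmax output.

```python
import torch
import torch.nn as nn

class VesselCNN(nn.Module):
    """Sketch of the per-patch CNN in Table I (assumed channel widths)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=3, stride=1, padding=1),    # layer 2: C, 3x3x64
            nn.ReLU(inplace=True),                                   # layer 3: ReLU
            nn.MaxPool2d(kernel_size=2, stride=2),                   # layer 4: P, 2x2, stride 2
            nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1),   # layer 5: C, 3x3x64
            nn.ReLU(inplace=True),                                   # layer 6: ReLU
            nn.ConvTranspose2d(64, 64, kernel_size=4, stride=2,
                               padding=1),                           # layer 7: D, 4x4x64 (crop 1)
            nn.Conv2d(64, 2, kernel_size=1),                         # layer 8: C, 1x1x2
        )

    def forward(self, x):
        logits = self.features(x)
        return torch.softmax(logits, dim=1)  # layer 9: softmax over two classes

net = VesselCNN()
out = net(torch.zeros(1, 1, 64, 64))  # one 64x64x1 ROI patch
# out has shape (1, 2, 64, 64): per-pixel vessel / non-vessel probabilities
```

The 2×2 pooling halves the spatial size to 32×32 and the stride-2 deconvolution restores it to 64×64, matching the input/output sizes listed in Table I.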
In the convolutional layer, let $x^{l-1}(m)$ be the $m$th input feature map at layer $l-1$, $W^{l}(m,n)$ the weight of the filter that connects the $m$th feature map of the input layer to the $n$th feature map of the output layer, and $b^{l}(n)$ a bias. The values $x^{l}(n)$ in the $l$th convolutional layer are computed as:

$x^{l}(n) = f\Big(\sum_{m}\big(x^{l-1}(m) * W^{l}(m,n)\big) + b^{l}(n)\Big)$  (1)

$f(x) = \dfrac{1}{1 + e^{-x}}$  (2)

where $*$ is the convolution operation and $f$ is a nonlinear sigmoid function. The weights of the filter $W^{l}(m,n)$ are initialized randomly and then updated by a backpropagation algorithm.
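As an illustration, Eqs. (1)–(2) can be realized directly in NumPy. This is a naive valid-convolution sketch for clarity, not the network's actual implementation; applying the filter in cross-correlation orientation is an assumption.

```python
import numpy as np

def sigmoid(x):
    """Eq. (2): f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + np.exp(-x))

def conv_layer(x_prev, W, b):
    """Eq. (1): x^l(n) = f(sum_m x^{l-1}(m) * W^l(m,n) + b^l(n)).

    x_prev: (M, H, W) input feature maps; W: (M, N, k, k) filters; b: (N,) biases.
    Returns N output maps of valid-convolution size.
    """
    M, N, k = W.shape[0], W.shape[1], W.shape[2]
    H_out, W_out = x_prev.shape[1] - k + 1, x_prev.shape[2] - k + 1
    out = np.zeros((N, H_out, W_out))
    for n in range(N):
        acc = np.zeros((H_out, W_out))
        for m in range(M):                      # sum over input feature maps
            for i in range(H_out):
                for j in range(W_out):
                    acc[i, j] += np.sum(x_prev[m, i:i+k, j:j+k] * W[m, n])
        out[n] = sigmoid(acc + b[n])            # nonlinearity of Eq. (2)
    return out
```

With a single 1×1 identity filter and zero bias, the layer reduces to an element-wise sigmoid of the input, which gives a simple sanity check.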
The rectified linear unit (ReLU) layer performs a thresholding operation on each element of the input, where any value less than zero is set to zero. The ReLU is defined as:

$f(x) = \begin{cases} x, & x \ge 0 \\ 0, & x < 0 \end{cases}$  (3)
A pooling layer is used to reduce the spatial size of the feature maps. It reduces the number of parameters in the network and helps avoid the overfitting problem. The values $x^{l}(n)$ in the pooling layer $l$ are computed as:

$x^{l}(n) = s\big(x^{l-1}(n)\big)$  (4)

where $s(\cdot)$ is a sampling function.
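A common choice for the sampling function $s(\cdot)$ in Eq. (4) is max pooling, as used by layer 4 in Table I (2×2 window, stride 2). A minimal NumPy sketch, assuming the input dimensions are divisible by the window size:

```python
import numpy as np

def max_pool(x, size=2):
    """2x2 max pooling: each non-overlapping size x size block keeps its maximum."""
    H, W = x.shape
    return x.reshape(H // size, size, W // size, size).max(axis=(1, 3))

x = np.arange(16.0).reshape(4, 4)
print(max_pool(x))  # [[ 5.  7.]
                    #  [13. 15.]]
```

Each 4×4 input is reduced to 2×2, halving the spatial size exactly as in the transition from layer 4 to layer 5 of Table I.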
A deconvolution layer is used to up-sample the feature maps and is described by equations similar to those of the convolutional layer. The fully connected layer is employed to flatten the feature maps and connect them to the output layer; it multiplies the input by a weight matrix and then adds a bias vector. A softmax layer applies a softmax function to the input from the fully connected layer, defined as:

$P(y = n \mid x) = \dfrac{e^{w_{n}^{T} x}}{\sum_{k} e^{w_{k}^{T} x}}$  (5)

where y is the class label and w is the weight.
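The softmax of Eq. (5) can be sketched in NumPy as follows; shifting the scores by their maximum before exponentiating is a standard numerical-stability step, not part of the equation itself.

```python
import numpy as np

def softmax(z):
    """Eq. (5): map class scores to probabilities that sum to 1."""
    z = z - z.max()        # shift for numerical stability (does not change the result)
    e = np.exp(z)
    return e / e.sum()

p = softmax(np.array([2.0, 1.0, 0.1]))
# p is a probability vector: larger scores get larger probabilities
```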
The pixel classification layer returns classification results for the pixels in the ROI.
2.2 Multiple Deep Convolution Neural Networks
A multiple-DCNN framework is constructed by cascading multiple networks with the same structure:

$M = \{D_{1}, D_{2}, \ldots, D_{N}\}$  (6)
where $N$ is the total number of DCNNs in the framework. In our experiments it is set to 5, which was determined using a trial-and-error method.

The MDCNN is trained using an incremental learning strategy to improve the networks' performance. The next DCNN is trained using the same samples as the previous one, enhanced with the samples on which the previous DCNN did not perform well. In this way, the next DCNN overcomes the poor performance of the previous one:

$S_{k+1} = S_{k} \cup E_{k}$  (7)

where $S_{k}$ and $S_{k+1}$ are the sample sets for the $k$th and $(k+1)$th DCNN, respectively, and $E_{k}$ is the set of samples wrongly classified by the $k$th DCNN.
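The incremental strategy of Eq. (7) can be sketched as the following training loop. The `make_model()` factory and the `train`/`predict` interface are hypothetical stand-ins for the actual DCNN training code.

```python
def train_mdcnn(make_model, samples, labels, n_models=5):
    """Train n_models classifiers; each sees the base set plus the samples
    the previous model misclassified (S_{k+1} = S_k ∪ E_k)."""
    models = []
    base_idx = list(range(len(samples)))
    hard_idx = []                            # E_k: indices misclassified by model k
    for _ in range(n_models):
        model = make_model()
        train_idx = base_idx + hard_idx      # Eq. (7): augment with hard samples
        model.train([samples[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        # Collect the samples this model gets wrong, for the next model
        hard_idx = [i for i in base_idx
                    if model.predict(samples[i]) != labels[i]]
        models.append(model)
    return models
```

Duplicating the misclassified indices effectively reweights the hard samples in the next model's training set, which is the intent of the incremental scheme.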
2.3 Classification Using Voting Scheme
The final decision on the class of each pixel is determined using a voting scheme over the results of the multiple DCNNs:

$L(i,j) = \arg\max_{m}\{V(i,j,m)\}$  (8)

$V(i,j,m) = \sum_{n} \delta\big(L_{n}(i,j), m\big)$  (9)

$\delta(x,y) = \begin{cases} 1, & x = y \\ 0, & x \ne y \end{cases}$  (10)

where $L(i,j)$ is the final classification label of the pixel at $(i,j)$, $V(i,j,m)$ is the vote result for the different categories, and $L_{n}(i,j)$ is the classification result of the $n$th DCNN model.
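The majority vote of Eqs. (8)–(10) amounts to counting, per pixel, how many models predicted each class and keeping the most frequent one. A minimal NumPy sketch:

```python
import numpy as np

def majority_vote(votes, n_classes=2):
    """Eqs. (8)-(10): per-pixel majority vote.

    votes: (N, H, W) array of class labels predicted by each of the N DCNNs.
    Returns the (H, W) array of final labels L(i, j).
    """
    # V(i, j, m): number of models voting for class m at each pixel
    counts = np.stack([(votes == m).sum(axis=0) for m in range(n_classes)])
    return counts.argmax(axis=0)  # Eq. (8): class with the most votes

# Five models voting on a tiny 1x2 "image"
votes = np.array([[[0, 1]], [[0, 1]], [[1, 0]], [[0, 1]], [[0, 0]]])
print(majority_vote(votes))  # [[0 1]]
```

With `argmax`, ties resolve to the lower class index; the paper does not specify tie handling, so that behavior is an assumption.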
III. Experimental Results
3.1 Dataset

The Digital Retinal Images for Vessel Extraction (DRIVE) dataset contains 40 fundus images in total and has been divided into training and test sets [4], each containing 20 images. Each image was captured using 8 bits per color plane at 768 by 584 pixels. The field of view (FOV) of each image is circular with a diameter of approximately 540 pixels, and all images were cropped to the FOV. The Structured Analysis of the Retina (STARE) database consists of 20 color retinal images [4], digitized to 8 bits per color channel at 700 by 605 pixels. These two datasets, which contain both normal and abnormal retinal images, are used to analyze the performance of retinal blood vessel segmentation with respect to ground-truth images.
AN US
3.2 Experiment on retinal vessel detection
In this section, we train the proposed network on the DRIVE dataset. The learning rate is gradually decreased from 0.01 to 0.001 following a piecewise schedule with a drop period of 5 epochs and a drop factor of 0.005. Five CNNs with the same structure are employed to construct the MDCNN model. All CNN networks are trained for 100 epochs with a batch size of 256.
All experiments are run on a server with two six-core Intel Xeon processors and 128 GB of memory. The server is equipped with two NVIDIA Tesla K40 GPUs, each with 12 GB of memory. It takes around 4 hours to train the proposed model. In the prediction stage, the input image is divided into 64×64×1 patches, which are sent to the model to detect vessel pixels. Prediction takes 114 seconds per image on average.

The detection performance is evaluated using accuracy metrics. In addition, the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) are employed to measure detection performance. Accuracy metrics and ROC analysis are commonly used to evaluate the performance of retinal vessel segmentation algorithms.
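The prediction-stage tiling described above can be sketched as follows. This splits an image into non-overlapping 64×64 patches; how the paper handles the border remainder (the DRIVE dimensions are not multiples of 64) is not stated, so dropping the remainder here is an assumption.

```python
import numpy as np

def to_patches(img, size=64):
    """Split a (H, W) image into non-overlapping size x size patches."""
    H, W = img.shape
    patches = []
    for i in range(0, H - H % size, size):      # remainder rows/cols are dropped
        for j in range(0, W - W % size, size):
            patches.append(img[i:i+size, j:j+size])
    return np.stack(patches)

img = np.zeros((584, 768))            # DRIVE image dimensions
print(to_patches(img).shape)          # (108, 64, 64): 9 x 12 patches
```

Each patch is fed to the MDCNN, and the per-patch label maps are reassembled into the full-size vessel map.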
Table II. Performance of the proposed network on different datasets.

| Data set | AUC | Sensitivity | Specificity | Accuracy | Dice |
|---|---|---|---|---|---|
| DRIVE training set | 0.972601 | 0.681030 | 0.986051 | 0.959723 | 0.744831 |
| DRIVE testing set | 0.973742 | 0.704607 | 0.985956 | 0.961316 | 0.761357 |
| STARE | 0.953929 | 0.562878 | 0.986149 | 0.953963 | 0.650279 |
As seen in Table II, the accuracy of the proposed method is 96.13% and 95.39%, and the AUC is 0.9737 and 0.9539, on the testing sets of DRIVE and STARE, respectively. The ROC curves for the different sets are drawn in Figs. 1 to 3, and Fig. 4 shows visualization results of our proposed method.
Fig. 1. ROC curve on the training set of DRIVE.
Fig. 2. ROC curve on the testing set of DRIVE.
Fig. 3. ROC curve on STARE set.
[Fig. 4 image panels: DRIVE test samples 01_test, 02_test, 03_test, 04_test, and 14_test; columns (a)–(c) as described in the caption.]
Fig. 4. Visualization of the detection results by our proposed method on samples randomly taken from the DRIVE test dataset: (a) original images, (b) corresponding ground truth, and (c) detection results.
We further compare our results with several state-of-the-art methods using AUC values, as shown in Table III. As seen in Table III, our proposal achieves the third-best AUC value on the DRIVE dataset. Dasgupta et al. [9] yield a better result than ours, with a difference of 0.0007, and the study by Qiaoliang et al. [16] achieves the second-best AUC value, with a difference of only 0.0001 from the proposed method.
Table III. Comparison with state-of-the-art methods on the DRIVE dataset.

| Method | AUC |
|---|---|
| Lahiri et al. [14] | 0.9500 |
| Maji et al. [12] | 0.9470 |
| Fu et al. [10] | 0.9523 |
| Soares et al. [2] | 0.9614 |
| Niemeijer et al. [15] | 0.9294 |
| Dasgupta et al. [9] | 0.9744 |
| Guo et al. [5] | 0.9476 |
| Sengur et al. [8] | 0.9674 |
| Azzopardi et al. [17] | 0.9614 |
| Osareh et al. [18] | 0.965 |
| Roychowdhury et al. [19] | 0.962 |
| Qiaoliang et al. [16] | 0.9738 |
| Proposed method | 0.9737 |
IV. Conclusion
The deep neural network is able to learn hierarchical feature representations from raw pixel data without any domain knowledge. It has tremendous potential in medical imaging, where knowledge-based features are hard to interpret. In this paper, we propose a multiple deep convolutional neural network method for retinal vessel detection. We demonstrate the performance of the proposed method on the DRIVE and STARE databases. The obtained results and related comparisons show the effectiveness of our proposal. Our method yields the third-best AUC value among the compared state-of-the-art methods on the DRIVE dataset. It is worth mentioning that our proposal has no preprocessing stage: the input color fundus images are fed into the CNN directly. Therefore, suitable preprocessing steps may further improve our results in future work. In addition, the MDCNN model is trained using an incremental learning strategy to improve the networks' performance.
Conflict of Interest Statement

We declare that we have no financial and personal relationships with other people or organizations that could inappropriately influence our work, and that there is no professional or other personal interest of any nature or kind in any product, service, and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled "A Novel Retinal Vessel Detection Approach Based on Multiple Deep Convolution Neural Networks".
References

[1] J.J. Kanski, Clinical Ophthalmology: A Systematic Approach, Butterworth-Heinemann, London, U.K., 1989.
[2] J.V.B. Soares, J.J.G. Leandro, R.M. Cesar, H.F. Jelinek, M.J. Cree, Retinal vessel segmentation using the 2-D Gabor wavelet and supervised classification, IEEE Trans. Med. Imaging 25 (2006) 1214–1222. doi:10.1109/TMI.2006.879967.
[3] J. Dash, N. Bhoi, Detection of retinal blood vessels from ophthalmoscope images using morphological approach, Electron. Lett. Comput. Vis. Image Anal. 16 (2017) 1–14. doi:10.5565/rev/elcvia.913.
[4] Y.Q. Zhao, X.H. Wang, X.F. Wang, F.Y. Shih, Retinal vessels segmentation based on level set and region growing, Pattern Recognit. 47 (2014) 2437–2446. doi:10.1016/j.patcog.2014.01.006.
[5] Y. Guo, Ü. Budak, A. Şengür, F. Smarandache, A retinal vessel detection approach based on shearlet transform and indeterminacy filtering on fundus images, Symmetry (Basel) 9 (2017) 235. doi:10.3390/sym9100235.
[6] M. Nergiz, M. Akın, Retinal vessel segmentation via structure tensor coloring and anisotropy enhancement, Symmetry (Basel) 9 (2017) 276. doi:10.3390/sym9110276.
[7] P. Bankhead, C.N. Scholfield, J.G. McGeown, T.M. Curtis, Fast retinal vessel detection and measurement using wavelets and edge location refinement, PLoS One 7 (2012) e32435. doi:10.1371/journal.pone.0032435.
[8] A. Şengür, Y. Guo, Ü. Budak, L. Vespa, A retinal vessel detection approach using convolution neural network, in: 2017 Int. Artif. Intell. Data Process. Symp., 2017, pp. 1–4.
[9] A. Dasgupta, S. Singh, A fully convolutional neural network based structured prediction approach towards the retinal vessel segmentation, in: 2017 IEEE 14th Int. Symp. Biomed. Imaging (ISBI 2017), IEEE, 2017, pp. 248–251. doi:10.1109/ISBI.2017.7950512.
[10] H. Fu, Y. Xu, S. Lin, D.W.K. Wong, J. Liu, DeepVessel: retinal vessel segmentation via deep learning and conditional random field, Springer International Publishing, Cham, 2016. doi:10.1007/978-3-319-46723-8.
[11] K.K. Maninis, J. Pont-Tuset, P. Arbeláez, L. Van Gool, Deep retinal image understanding, Springer International Publishing, Cham, 2016. doi:10.1007/978-3-319-46723-8.
[12] D. Maji, A. Santara, P. Mitra, D. Sheet, Ensemble of deep convolutional neural networks for learning to detect retinal vessels in fundus images, arXiv preprint arXiv:1603.04833 (2016). http://arxiv.org/abs/1603.04833.
[13] P. Liskowski, K. Krawiec, Segmenting retinal blood vessels with deep neural networks, IEEE Trans. Med. Imaging 35 (2016) 2369–2380. doi:10.1109/TMI.2016.2546227.
[14] A. Lahiri, A.G. Roy, D. Sheet, P.K. Biswas, Deep neural ensemble for retinal vessel segmentation in fundus images towards achieving label-free angiography, in: 2016 38th Annu. Int. Conf. IEEE Eng. Med. Biol. Soc., IEEE, 2016, pp. 1340–1343. doi:10.1109/EMBC.2016.7590955.
[15] M. Niemeijer, J. Staal, B. van Ginneken, M. Loog, M.D. Abramoff, Comparative study of retinal vessel segmentation methods on a new publicly available database, in: J.M. Fitzpatrick, M. Sonka (Eds.), Medical Imaging, 2004, pp. 648–657. doi:10.1117/12.535349.
[16] Q. Li, B. Feng, L. Xie, P. Liang, H. Zhang, T. Wang, A cross-modality learning approach for vessel segmentation in retinal images, IEEE Trans. Med. Imaging 35 (2016) 109–118. doi:10.1109/TMI.2015.2457891.
[17] G. Azzopardi, N. Strisciuglio, M. Vento, N. Petkov, Trainable COSFIRE filters for vessel delineation with application to retinal images, Med. Image Anal. 19 (2015) 46–57. doi:10.1016/j.media.2014.08.002.
[18] A. Osareh, B. Shadgar, Automatic blood vessel segmentation in color images of retina, Iran. J. Sci. Technol. 33 (2009) 191–206.
[19] S. Roychowdhury, D. Koozekanani, K. Parhi, Blood vessel segmentation of fundus images by major vessel extraction and sub-image classification, IEEE J. Biomed. Health Inform. 19 (2014). doi:10.1109/JBHI.2014.2335617.