A novel retinal vessel detection approach based on multiple deep convolution neural networks


Accepted Manuscript

A Novel Retinal Vessel Detection Approach Based on Multiple Deep Convolution Neural Networks

Yanhui Guo, Ümit Budak, Abdulkadir Şengür

PII: S0169-2607(18)30781-8
DOI: https://doi.org/10.1016/j.cmpb.2018.10.021
Reference: COMM 4809

To appear in: Computer Methods and Programs in Biomedicine

Received date: 23 May 2018
Revised date: 12 October 2018
Accepted date: 29 October 2018

Please cite this article as: Yanhui Guo, Ümit Budak, Abdulkadir Şengür, A Novel Retinal Vessel Detection Approach Based on Multiple Deep Convolution Neural Networks, Computer Methods and Programs in Biomedicine (2018), doi: https://doi.org/10.1016/j.cmpb.2018.10.021

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Highlights

- This study formulates the retinal vessel detection task as a classification problem and solves it using a multiple classifier framework based on deep convolutional neural networks.
- The MDCNN is trained using an incremental learning strategy to improve the networks' performance.
- The final classification results are obtained by voting on the results of the MDCNN.
- The MDCNN achieves better performance and significantly outperforms the state-of-the-art for automatic retinal vessel segmentation on the DRIVE and STARE datasets.


A Novel Retinal Vessel Detection Approach Based on Multiple Deep Convolution Neural Networks

Yanhui Guo¹, Ümit Budak², and Abdulkadir Şengür³

¹ Department of Computer Science, University of Illinois, Springfield, Illinois, USA
² Department of Electrical-Electronics Engineering, Bitlis Eren University, Bitlis, Turkey
³ Electrical and Electronics Engineering Department, Firat University, Elazig, Turkey

Abstract


Background and Objective: Computer aided detection (CAD) offers an efficient way to assist doctors in interpreting fundus images. In a CAD system, retinal vessel (RV) detection is a crucial step for identifying retinal disease regions. However, RV detection remains a challenging problem due to variations in the morphology of the vessels on noisy, low-contrast fundus images.

Methods: In this paper, we formulate the detection task as a classification problem and solve it using a multiple classifier framework based on deep convolutional neural networks. The multiple deep convolutional neural network (MDCNN) is constructed and trained on fundus images with a limited image quantity. The MDCNN is trained using an incremental learning strategy to improve the networks' performance. The final classification results are obtained by voting on the results of the MDCNN.

Results: For automatic retinal vessel segmentation on the DRIVE dataset, the MDCNN achieves 95.97% and 96.13% accuracy and 0.9726 and 0.9737 AUC (area under the receiver operating characteristic curve) scores on the training and testing sets, respectively. Another public dataset, STARE, is also used to evaluate the proposed network; the proposed MDCNN achieves 95.39% accuracy and a 0.9539 AUC score on STARE. We further compare our result with several state-of-the-art methods based on AUC values; the comparison shows that our proposal yields the third-best AUC value.

Conclusions: Our method yields better performance than most of the compared state-of-the-art methods. In addition, our proposal has no preprocessing stage: the input color fundus images are fed into the CNN directly.

Keywords: Retinal vessel segmentation; Multiple deep convolution neural network; Image segmentation.

I. Introduction

A fundus camera is a useful tool to present a detailed view of the back of the eye [1]. It enables the ophthalmologist to screen the retina and detect early signs of potential eye diseases such as arteriosclerosis, hypertension, cardiovascular disease, and diabetes [1]. Manual evaluation of fundus camera images necessitates highly skilled ophthalmologists. In addition, given the great number of images from various patients, manual evaluation becomes infeasible when considering the required time and consistency. Hence, computer aided detection (CAD) systems for retinal evaluation are in demand.

There have been many works on automatic evaluation of fundus images [2–11],[12–15]. The published works can be categorized into two classes, namely image processing and deep learning (DL) based methods. The image processing based methods cover a range of image processing and machine learning techniques [2–7]. In [2], the authors used 2-D Gabor wavelet filters and a linear classifier for retinal vessel detection. They determined the class label of each pixel (vessel or not) based on a feature vector comprising the pixel intensity and multi-scale Gabor wavelet coefficients. A morphology based retinal vessel segmentation algorithm was introduced by Dash et al. in [3]. The authors initially used contrast limited adaptive histogram equalization (CLAHE) for retinal image enhancement, and vessel segmentation was then achieved by geodesic operators. The obtained segmentation was refined with a post-processing stage. Another retinal vessel segmentation algorithm was introduced by Zhao et al. in [4], where level sets and region growing methods were combined into an efficient segmentation scheme. As in the previous method [3], image enhancement was applied based on CLAHE and anisotropic diffusion filtering. In [5], the authors proposed a method for retinal vessel segmentation based on the shearlet transform and indeterminacy filtering. The green channel of the retinal image was initially transformed into the neutrosophic domain, and then shearlet features were extracted. A neural network classifier was adopted to classify the retinal image pixels into vessel and non-vessel categories. The Frangi filter and the structure tensor were applied by Nergiz et al. in [6] for efficient retinal vessel segmentation. The obtained tensor field was converted to a 3-D space (energy, anisotropy, and orientation), the enhanced 3-D space was then converted to a 4-D tensor field, and Otsu thresholding was applied for the final segmentation. Bankhead et al. [7] proposed a simple approach for retinal vessel detection based on the wavelet transform, using a thresholding method on the obtained wavelet coefficients for retinal vessel extraction.

Convolutional neural networks (CNNs), one of the deep learning methods, have been widely used in the literature [8–15]. Sengur et al. formulated retinal vessel detection as a classification problem and applied a CNN [8]. The adopted CNN architecture has two convolution layers, two pooling layers, one drop-out layer, and one loss layer, respectively. Dasgupta et al. proposed a fully convolutional CNN and a structured-prediction approach to segment the blood vessels in retinal images [9]. The proposed CNN architecture has a joint loss function that can learn the class label dependencies of neighboring pixels. Fu et al. proposed a deep two-stage scheme for retinal vessel extraction in fundus images [10]. It initially applies a multi-scale and multi-level CNN with a side-output layer to learn a rich hierarchical representation; a conditional random field is then used to model the long-range interactions between pixels. Maninis et al. presented deep retinal image understanding for both vessel and optic disc segmentation [11]. Maji et al. introduced a framework of deep and ensemble learning for retinal vessel detection in fundus images [12], where an ensemble of various CNNs classified the retinal images into vessel and non-vessel regions. In [13], Liskowski et al. proposed a deep neural network based supervised detection algorithm for retinal vessel segmentation. It employed a set of pre-processing stages, namely contrast normalization, whitening, geometric transformations, and gamma corrections. Lahiri et al. proposed an ensemble of deep learning models for retinal vessel segmentation [14]; it performed unsupervised hierarchical feature learning using an ensemble of two levels of sparsely trained denoising stacked auto-encoders.

In this paper, we formulate the retinal vessel detection task as a classification problem and solve it using a multiple deep convolutional neural network (MDCNN) as a voted classifier. The proposed MDCNN is trained using an incremental strategy to achieve better performance and training speed than a traditional DCNN. The experimental results show that the MDCNN achieves better performance than most state-of-the-art methods for automatic retinal vessel segmentation on the DRIVE and STARE datasets.

The remainder of the paper is organized as follows. In the next section, we present the proposed MDCNN model using a novel learning scheme. Then the experimental results are discussed, and the conclusions are presented in the final section.

II. Proposed Method

2.1 Network Construction

In contrast to traditional classification approaches, a CNN is able to extract different features automatically by virtue of its multi-layer, feed-forward structure. In the proposed model, the input to the CNN is an ROI patch of size 64×64×1, and the output is the classification result for each pixel in the ROI. We use six types of layers to construct the CNN: convolution, rectified linear unit (ReLU), pooling, deconvolution, softmax, and pixel classification layers. The detailed network structure is shown in Table I.


Table I. Structure and configuration of each CNN model (I: input layer; C: convolution layer; ReLU: rectified linear unit layer; P: pooling layer; D: deconvolution layer; F: fully connected layer; S: softmax layer; PC: pixel classification layer)

Layer number | Layer type | Input size | Filter size | Pooling size | Stride size | Padding size | Crop size | Output size
1            | I          | 64×64×1    | -           | -            | -           | -            | -         | 64×64×1
2            | C          | -          | 3×3×64      | -            | 1           | 1            | -         | 64×64×1
3            | ReLU       | -          | -           | -            | -           | -            | -         | 64×64×1
4            | P          | -          | -           | 2×2          | 2           | 0            | -         | 32×32×1
5            | C          | -          | 3×3×64      | -            | 1           | 1            | -         | 32×32×1
6            | ReLU       | -          | -           | -            | -           | -            | -         | 32×32×1
7            | D          | -          | 4×4×64      | -            | 2           | -            | 1         | 64×64×1
8            | C          | -          | 1×1×2       | -            | 1           | -            | -         | 64×64×1
9            | softmax    | -          | -           | -            | -           | -            | -         | 64×64×1
10           | PC         | -          | -           | -            | -           | -            | -         | 64×64×1
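The spatial sizes in Table I can be checked with the standard convolution, pooling, and deconvolution size formulas. The sketch below (plain Python, not part of the original paper) reproduces the 64 → 32 → 64 progression, assuming the usual formulas out = ⌊(in + 2·pad − k)/stride⌋ + 1 for convolution/pooling and out = (in − 1)·stride − 2·crop + k for deconvolution.

```python
def conv_out(size, k, stride=1, pad=0):
    # standard convolution/pooling output-size formula
    return (size + 2 * pad - k) // stride + 1

def deconv_out(size, k, stride=1, crop=0):
    # transposed-convolution (deconvolution) output-size formula
    return (size - 1) * stride - 2 * crop + k

s = 64                                      # layer 1: 64x64 input patch
s = conv_out(s, k=3, stride=1, pad=1)       # layer 2: 3x3 conv   -> 64
s = conv_out(s, k=2, stride=2, pad=0)       # layer 4: 2x2 pool   -> 32
s = conv_out(s, k=3, stride=1, pad=1)       # layer 5: 3x3 conv   -> 32
s = deconv_out(s, k=4, stride=2, crop=1)    # layer 7: 4x4 deconv -> 64
s = conv_out(s, k=1, stride=1, pad=0)       # layer 8: 1x1 conv   -> 64
print(s)  # 64
```

The 4×4 deconvolution with stride 2 and crop 1 exactly inverts the 2×2 pooling, restoring the 64×64 resolution for per-pixel classification.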

In the convolutional layer, let x^{l-1}(m) be the mth input feature map at layer l-1, W^l(m,n) the weight of the filter that connects the mth feature map of the input layer to the nth feature map of the output layer, and b^l(n) a bias. The values x^l(n) in the lth convolutional layer are computed as:

x^l(n) = f( Σ_m ( x^{l-1}(m) * W^l(m,n) ) + b^l(n) )        (1)

f(x) = 1 / (1 + e^{-x})        (2)

where * is the convolution operation and f is a nonlinear sigmoid function. The weights of the filter W^l(m,n) are initialized randomly and then updated by a backpropagation algorithm.

The rectified linear unit (ReLU) layer performs a thresholding operation on each element of the input, where any value less than zero is set to zero. The ReLU is defined as:

f(x) = { x,  x ≥ 0
       { 0,  x < 0        (3)


A pooling layer is used to reduce the spatial size of the feature maps. It reduces the number of parameters in the network and helps avoid overfitting. The values x^l(n) in pooling layer l are computed as:

x^l(n) = g( x^{l-1}(n) )        (4)

where g(·) is a sampling function.

A deconvolution layer is used to up-sample the feature maps and has equations similar to those of the convolution layer. The fully connected layer is employed to flatten the feature maps and connect them to the output layer; it multiplies the input by a weight matrix and then adds a bias vector. A softmax layer applies a softmax function to the input from the fully connected layer, where the softmax function is defined as:

P(y = j | x) = e^{w_j^T x} / Σ_k e^{w_k^T x}        (5)

where y is the class label and w is the weight.

The pixel classification layer returns classification results for the pixels in the ROI.
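As a concrete illustration of Eqs. (3)–(5), the sketch below (NumPy, illustrative only; the actual network was trained with a deep learning toolbox) implements the ReLU thresholding, a 2×2 max-pooling sampling function g(·), and the softmax normalization.

```python
import numpy as np

def relu(x):
    # Eq. (3): values below zero are set to zero
    return np.maximum(x, 0)

def max_pool_2x2(x):
    # Eq. (4) with g(.) = max over non-overlapping 2x2 windows
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def softmax(z):
    # Eq. (5): normalized exponentials over the class scores
    e = np.exp(z - z.max())   # shift by the max for numerical stability
    return e / e.sum()

x = np.array([[1.0, -2.0], [-3.0, 4.0]])
print(relu(x))                   # negatives clipped to 0
print(max_pool_2x2(relu(x)))     # [[4.]]
print(softmax(np.array([2.0, 1.0])))
```

The softmax output sums to 1 and can be read directly as the per-pixel vessel/non-vessel probabilities consumed by the pixel classification layer.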

2.2 Multiple Deep Convolution Neural Networks

A multiple DCNN framework is constructed by cascading multiple networks with the same structure:

MDCNN = { DCNN_1, DCNN_2, …, DCNN_N }        (6)

where N is the total number of DCNNs in the framework. In our experiments, N is set to 5, determined using a trial-and-error method.

The MDCNN is trained using an incremental learning strategy to improve the networks' performance. The next DCNN is trained using the same samples as the previous one, augmented with the samples on which the previous DCNN performed poorly. In this way, the next DCNN can overcome the poor performance of the previous one:

S_{k+1} = S_k ∪ { E_k }        (7)

where S_k and S_{k+1} are the sample sets for the kth and (k+1)th DCNN, respectively, and E_k is the set of samples wrongly classified by the kth DCNN.
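The incremental strategy of Eq. (7) can be sketched as the loop below. This is a schematic only: the `train`/`predict` pair is a stand-in threshold classifier, not the authors' DCNN, and the data are synthetic; it merely shows how each successive model's training set is grown with the previous model's misclassified samples.

```python
import numpy as np

def train(samples, labels):
    # Stand-in for DCNN training: learn a simple threshold classifier.
    # (The real model is a deep CNN; this stub only illustrates the loop.)
    thr = samples[labels == 1].mean() if (labels == 1).any() else samples.mean()
    return lambda x: (x >= thr).astype(int)

N = 5                                    # number of DCNNs (chosen by trial and error)
rng = np.random.default_rng(0)
X = rng.normal(size=200)                 # toy 1-D "samples"
y = (X > 0.3).astype(int)                # toy labels

models, S_x, S_y = [], X.copy(), y.copy()
for k in range(N):
    model = train(S_x, S_y)              # train the kth "DCNN" on S_k
    models.append(model)
    wrong = model(S_x) != S_y            # E_k: samples misclassified by DCNN_k
    # Eq. (7): S_{k+1} = S_k U E_k (hard samples are repeated in the next set)
    S_x = np.concatenate([S_x, S_x[wrong]])
    S_y = np.concatenate([S_y, S_y[wrong]])
```

Because the misclassified samples are duplicated, each subsequent network sees the hard cases with greater weight, similar in spirit to boosting.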

2.3 Classification Using Voting Scheme

The final decision on pixel membership is determined using a voting scheme over the multiple DCNN results:

L(i,j) = argmax_m { V_m(i,j) }        (8)

δ(x,y) = { 1,  x = y
         { 0,  x ≠ y        (9)

V_m(i,j) = Σ_n δ( C_n(i,j), m )        (10)

where L(i,j) is the final classification label of the pixel at (i,j), V_m(i,j) is the vote count for category m, and C_n(i,j) is the classification result of the nth DCNN model.
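Eqs. (8)–(10) amount to per-pixel majority voting over the N label maps. A minimal NumPy sketch (with small hypothetical prediction maps in place of real DCNN outputs):

```python
import numpy as np

def vote(predictions, num_classes=2):
    # predictions: list of N label maps C_n(i, j) from the individual DCNNs
    stack = np.stack(predictions)                  # shape (N, H, W)
    # Eq. (10): V_m(i, j) = sum_n delta(C_n(i, j), m)
    votes = np.stack([(stack == m).sum(axis=0) for m in range(num_classes)])
    # Eq. (8): L(i, j) = argmax_m V_m(i, j)
    return votes.argmax(axis=0)

# three hypothetical 2x2 vessel/non-vessel maps from three DCNNs
p1 = np.array([[1, 0], [0, 1]])
p2 = np.array([[1, 1], [0, 0]])
p3 = np.array([[0, 1], [0, 1]])
print(vote([p1, p2, p3]))   # majority label per pixel: [[1 1] [0 1]]
```

With an odd number of networks (N = 5 here) and two classes, ties cannot occur, which is one practical reason to choose an odd ensemble size.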

III. Experimental Results

3.1 Dataset

The Digital Retinal Images for Vessel Extraction (DRIVE) dataset contains a total of 40 fundus images and has been divided into training and test sets [4]. The training and test sets contain an equal number of images (20 each). Each image was captured using 8 bits per color plane at 768 by 584 pixels. The field of view (FOV) of each image is circular with a diameter of approximately 540 pixels, and all images were cropped to the FOV. The Structured Analysis of the Retina (STARE) database consists of 20 color retinal images [4]. These retinal images were digitized to 8 bits per color channel at 700 by 605 pixels. These two datasets are used to analyze the performance of retinal blood vessel segmentation with respect to ground truth images, and they contain both normal and abnormal retinal images.

3.2 Experiment on retinal vessel detection

In this section, we train the proposed network using the DRIVE dataset. The learning rate is gradually decreased from 0.01 to 0.001; it is set as a piecewise function with a drop period of 5 epochs and a drop factor of 0.005. Five CNNs with the same structure are employed to construct the MDCNN model. All CNN networks are trained for 100 epochs with a batch size of 256.

All experiments are run on a server with two six-core Intel Xeon processors and 128 GB of memory. The server is equipped with two NVIDIA Tesla K40 GPUs, each with 12 GB of memory. It takes around 4 hours to train the proposed model. In the prediction stage, the input image is divided into 64×64×1 patches that are sent to the model to detect vessel pixels; prediction takes 114 seconds per image on average.

The detection performance is evaluated using the accuracy metric. In addition, the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC) are employed to measure the detection performance. The accuracy metric and the ROC curve are commonly used to evaluate the performance of different retinal vessel segmentation algorithms.
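For reference, the threshold-based measures reported below can be computed from a binary prediction mask and its ground truth as in this NumPy sketch (AUC additionally requires the continuous scores, so it is omitted; the masks here are tiny hypothetical examples):

```python
import numpy as np

def segmentation_metrics(pred, truth):
    # pred, truth: binary vessel masks (1 = vessel pixel)
    tp = np.sum((pred == 1) & (truth == 1))
    tn = np.sum((pred == 0) & (truth == 0))
    fp = np.sum((pred == 1) & (truth == 0))
    fn = np.sum((pred == 0) & (truth == 1))
    return {
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),           # true positive rate
        "specificity": tn / (tn + fp),           # true negative rate
        "dice":        2 * tp / (2 * tp + fp + fn),
    }

truth = np.array([1, 1, 0, 0])
pred  = np.array([1, 0, 0, 0])
m = segmentation_metrics(pred, truth)
print(m)  # accuracy 0.75, sensitivity 0.5, specificity 1.0, dice 2/3
```

Note that for retinal images the vessel class is a small fraction of the pixels, so a high accuracy can coexist with a modest sensitivity, as the results in Table II illustrate.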


Table II. Performance of the proposed network on different datasets.

Data set           | AUC      | Sensitivity | Specificity | Accuracy | Dice
DRIVE training set | 0.972601 | 0.681030    | 0.986051    | 0.959723 | 0.744831
DRIVE testing set  | 0.973742 | 0.704607    | 0.985956    | 0.961316 | 0.761357
STARE              | 0.953929 | 0.562878    | 0.986149    | 0.953963 | 0.650279

As seen in Table II, the accuracy of the proposed method is 96.13% and 95.39%, and the AUC is 0.9737 and 0.9539, on the testing set of DRIVE and on STARE, respectively. The ROC curves for the different sets are drawn in Figs. 1 to 3. Fig. 4 shows the visualization results of our proposed method.

Fig. 1. ROC curve on the training set of DRIVE.


Fig. 2. ROC curve on the testing set of DRIVE.

Fig. 3. ROC curve on STARE set.

[Fig. 4 panels: DRIVE test images 01_test, 02_test, 03_test, 04_test, and 14_test; columns (a), (b), (c).]

Fig. 4. Visualization of the detection results by our proposed method on samples randomly taken from the DRIVE test dataset: (a) original images, (b) corresponding ground truth, and (c) detection results.

We further compare our result with several state-of-the-art methods using AUC values, as shown in Table III. As seen in Table III, our proposal achieves the third-best AUC value on the DRIVE dataset. Dasgupta et al. [9] yield a better result than ours, with a difference of 0.0007, and the study by Qiaoliang et al. [16] achieved the second-best AUC value, with a difference of only 0.0001 from the proposed method.

Table III. Comparison with state-of-the-art methods on the DRIVE dataset

Method                   | AUC
Lahiri et al. [14]       | 0.9500
Maji et al. [12]         | 0.9470
Fu et al. [10]           | 0.9523
Soares et al. [2]        | 0.9614
Niemeijer et al. [15]    | 0.9294
Dasgupta et al. [9]      | 0.9744
Guo et al. [5]           | 0.9476
Sengur et al. [8]        | 0.9674
Azzopardi et al. [17]    | 0.9614
Osareh et al. [18]       | 0.965
Roychowdhury et al. [19] | 0.962
Qiaoliang et al. [16]    | 0.9738
Proposed method          | 0.9737

IV. Conclusion

A deep neural network is able to learn hierarchical feature representations from raw pixel data without any domain knowledge. It has tremendous potential in medical imaging, where knowledge-based features are hard to design and interpret. In this paper, we propose a multiple convolutional neural network method for retinal vessel detection. We demonstrate the performance of our proposed method on the DRIVE and STARE databases. The obtained results and related comparisons show the effectiveness of our proposal: our method yields the third-best AUC value among the compared state-of-the-art methods on the DRIVE dataset. It is worth mentioning that our proposal has no preprocessing stage, which means that the input color fundus images are fed into the CNN directly; therefore, suitable preprocessing steps may further improve our results in future work. In addition, the MDCNN model is trained using an incremental learning strategy to improve the networks' performance.

Conflict of Interest Statement

We declare that we have no financial or personal relationships with other people or organizations that could inappropriately influence our work, and that there is no professional or other personal interest of any nature in any product, service, and/or company that could be construed as influencing the position presented in, or the review of, the manuscript entitled "A Novel Retinal Vessel Detection Approach Based on Multiple Deep Convolution Neural Networks".

References

[1] J.J. Kanski, Clinical Ophthalmology: A Systematic Approach, London, U.K.: Butterworth-Heinemann, 1989.

[2] J.V.B. Soares, J.J.G. Leandro, R.M. Cesar, H.F. Jelinek, M.J. Cree, Retinal vessel segmentation using the 2-D Gabor wavelet and supervised classification, IEEE Trans. Med. Imaging 25 (2006) 1214–1222. doi:10.1109/TMI.2006.879967.

[3] J. Dash, N. Bhoi, Detection of retinal blood vessels from ophthalmoscope images using morphological approach, Electron. Lett. Comput. Vis. Image Anal. 16 (2017) 1–14. doi:10.5565/rev/elcvia.913.

[4] Y. Qian Zhao, X. Hong Wang, X. Fang Wang, F.Y. Shih, Retinal vessels segmentation based on level set and region growing, Pattern Recognit. 47 (2014) 2437–2446. doi:10.1016/j.patcog.2014.01.006.

[5] Y. Guo, Ü. Budak, A. Şengür, F. Smarandache, A retinal vessel detection approach based on shearlet transform and indeterminacy filtering on fundus images, Symmetry (Basel) 9 (2017) 235. doi:10.3390/sym9100235.

[6] M. Nergiz, M. Akın, Retinal vessel segmentation via structure tensor coloring and anisotropy enhancement, Symmetry (Basel) 9 (2017) 276. doi:10.3390/sym9110276.

[7] P. Bankhead, C.N. Scholfield, J.G. McGeown, T.M. Curtis, Fast retinal vessel detection and measurement using wavelets and edge location refinement, PLoS One 7 (2012) e32435. doi:10.1371/journal.pone.0032435.

[8] A. Şengür, Y. Guo, Ü. Budak, L. Vespa, A retinal vessel detection approach using convolution neural network, in: 2017 Int. Artif. Intell. Data Process. Symp., 2017, pp. 1–4.

[9] A. Dasgupta, S. Singh, A fully convolutional neural network based structured prediction approach towards the retinal vessel segmentation, in: 2017 IEEE 14th Int. Symp. Biomed. Imaging (ISBI 2017), IEEE, 2017, pp. 248–251. doi:10.1109/ISBI.2017.7950512.

[10] H. Fu, Y. Xu, S. Lin, D.W.K. Wong, J. Liu, DeepVessel: retinal vessel segmentation via deep learning and conditional random field, Springer International Publishing, Cham, 2016. doi:10.1007/978-3-319-46723-8.

[11] K.K. Maninis, J. Pont-Tuset, P. Arbeláez, L. Van Gool, Deep retinal image understanding, Springer International Publishing, Cham, 2016. doi:10.1007/978-3-319-46723-8.

[12] D. Maji, A. Santara, P. Mitra, D. Sheet, Ensemble of deep convolutional neural networks for learning to detect retinal vessels in fundus images, arXiv preprint arXiv:1603.04833 (2016). http://arxiv.org/abs/1603.04833.

[13] P. Liskowski, K. Krawiec, Segmenting retinal blood vessels with deep neural networks, IEEE Trans. Med. Imaging 35 (2016) 2369–2380. doi:10.1109/TMI.2016.2546227.

[14] A. Lahiri, A.G. Roy, D. Sheet, P.K. Biswas, Deep neural ensemble for retinal vessel segmentation in fundus images towards achieving label-free angiography, in: 2016 38th Annu. Int. Conf. IEEE Eng. Med. Biol. Soc., IEEE, 2016, pp. 1340–1343. doi:10.1109/EMBC.2016.7590955.

[15] M. Niemeijer, J. Staal, B. van Ginneken, M. Loog, M.D. Abramoff, Comparative study of retinal vessel segmentation methods on a new publicly available database, in: J.M. Fitzpatrick, M. Sonka (Eds.), Med. Imaging, 2004, pp. 648–657. doi:10.1117/12.535349.

[16] Q. Li, B. Feng, L. Xie, P. Liang, H. Zhang, T. Wang, A cross-modality learning approach for vessel segmentation in retinal images, IEEE Trans. Med. Imaging 35 (2016) 109–118. doi:10.1109/TMI.2015.2457891.

[17] G. Azzopardi, N. Strisciuglio, M. Vento, N. Petkov, Trainable COSFIRE filters for vessel delineation with application to retinal images, Med. Image Anal. 19 (2015) 46–57. doi:10.1016/j.media.2014.08.002.

[18] A. Osareh, B. Shadgar, Automatic blood vessel segmentation in color images of retina, Iran. J. Sci. Technol. 33 (2009) 191–206.

[19] S. Roychowdhury, D. Koozekanani, K. Parhi, Blood vessel segmentation of fundus images by major vessel extraction and sub-image classification, IEEE J. Biomed. Health Informatics 19 (2014). doi:10.1109/JBHI.2014.2335617.