BBE 422 1–13 biocybernetics and biomedical engineering xxx (2019) xxx–xxx
Available online at www.sciencedirect.com
ScienceDirect journal homepage: www.elsevier.com/locate/bbe
Multi-path convolutional neural network in fundus segmentation of blood vessels
Chun Tian, Tao Fang, Yingle Fan *, Wei Wu
Laboratory of Pattern Recognition and Image Processing, Hangzhou Dianzi University, Hangzhou 310018, China
article info

Article history:
Received 5 May 2019
Received in revised form 22 January 2020
Accepted 24 January 2020
Available online xxx

Keywords:
Image segmentation
Convolutional neural network
Dilated convolution
Visual perception
Feature fusion

abstract

There is a close correlation between retinal vascular status and physical diseases such as eye lesions. Retinal fundus images are an important basis for diagnosing diseases such as diabetes, glaucoma, hypertension, and coronary heart disease. Because the thickness of retinal blood vessels varies and the minimum diameter is only one or two pixels wide, obtaining accurate measurements is critical and challenging. In this paper, we propose a new method of retinal blood vessel segmentation based on a multi-path convolutional neural network, which can be used for computer-based clinical medical image analysis. First, a low-frequency image characterizing the overall features of the retinal blood vessel image and a high-frequency image characterizing the local detailed features are obtained using a Gaussian low-pass filter and a Gaussian high-pass filter, respectively. Then, a feature extraction path is constructed for the characteristics of the low- and high-frequency images, respectively. Finally, according to the responses of the low-frequency and high-frequency feature extraction paths, whole-vessel perception and local feature information are fused and encoded, and the final blood vessel segmentation map is obtained. The performance of this method is evaluated on the DRIVE and CHASE_DB1 databases. On DRIVE, the accuracy (Acc), sensitivity (SE), and specificity (SP) are 0.9580, 0.8639, and 0.9665, respectively; on CHASE_DB1, Acc, SE, and SP are 0.9601, 0.8778, and 0.9680, respectively. In addition, the proposed method effectively suppresses noise, ensures vessel continuity after segmentation, and provides a feasible new approach for intelligent visual perception of medical images.

© 2020 Nalecz Institute of Biocybernetics and Biomedical Engineering of the Polish Academy of Sciences. Published by Elsevier B.V. All rights reserved.
1. Introduction
Clinical studies have shown that retinal blood vessels are closely related to the health status of many organs throughout the body; e.g., diabetic fundus lesions caused by disorders of the vascular system are an important cause of blindness in humans. The retinal fundus is the only deep microvascular system that can be observed non-invasively. Structural changes in the blood vessels, such as diameter, tortuosity, and color, can reflect clinical and pathological features of hypertension, diabetes, and atherosclerosis; e.g., diabetic
* Corresponding author at: Laboratory of Pattern Recognition and Image Processing, Hangzhou Dianzi University, Hangzhou 310018, China. E-mail address:
[email protected] (Y. Fan). https://doi.org/10.1016/j.bbe.2020.01.011 0208-5216/© 2020 Nalecz Institute of Biocybernetics and Biomedical Engineering of the Polish Academy of Sciences. Published by Elsevier B.V. All rights reserved. Please cite this article in press as: Tian C, et al. Multi-path convolutional neural network in fundus segmentation of blood vessels. Biocybern Biomed Eng (2020), https://doi.org/10.1016/j.bbe.2020.01.011
retinopathy (DR) is the most common diabetic eye disease and is caused by damage to retinal blood vessels [1,2]. During retinal hyperplasia, abnormal neovascularization grows on the surface of the retina, leading to loss of vision. Early symptoms of diabetes are characterized by retinal vein dilation and slight curvature. If the condition worsens, the width of the retinal vein becomes uneven; in more serious cases, the vein takes on a beaded appearance and the vessel becomes severely curved. Retinal images thus provide valuable diagnostic information, and analyzing retinal blood vessels is meaningful for preventing and treating these diseases [3]. In the structure of the human eye, the retina lies in the inner layer of the eyewall, which is the area most sensitive to visual information. The main physiological structures in a retinal fundus image are the optic disk, the macula, and the blood vessels. The optic disk appears as a bright, approximately circular region in a normal fundus image; it is the starting region of the optic nerve and blood vessels. The macula appears as a dark area in the fundus image due to its enriched lutein, and the area has no vascular structure. The main blood vessels extend from the optic disk region into the entire inner eyeball and are distributed in a tree-like manner throughout the fundus image; this region has the thickest blood vessels and the highest vessel density. Because tiny blood vessels far from the optic disk also contribute to disease, accurate extraction of vessel features is needed to aid diagnosis and avoid medical errors. Therefore, in retinal blood vessel segmentation, measuring vessel width and extracting the features of fine blood vessels are crucial.
However, the distribution of blood vessels in fundus retinal images is dense and disordered, there are many small blood vessels, and lesion noise further complicates the task; as a result, traditional manual segmentation is enormous and tedious work that requires experts with extensive training and professional skill. To help ophthalmologists complete this complicated and tedious work and improve the diagnostic rate, computer-assisted automatic retinal vessel segmentation is increasingly valued and needed [4,5]. Building on previous work, this paper proposes a method based on a multi-path convolutional neural network that can automatically segment retinal blood vessels quickly and accurately, providing a new approach for subsequent medical image processing and analysis.
2. Review of related studies
The traditional approach to blood vessel segmentation of the fundus image is based on extracting the spatial characteristics of retinal vessels, with matching and recognition of the vessel state realized by machine vision. However, the ability to generalize feature matching is limited because of the richness and inconsistency of detail in fundus images. Since no single segmentation method is suitable for all anatomical regions or imaging modalities, the main purpose of this section is to review the relevant vascular segmentation algorithms so that the most appropriate method can be selected for a specific task. At present, retinal blood vessel
segmentation technology is divided into two categories—unsupervised methods and supervised methods [6,7]—according to whether the segmentation is compared with a standard (manually annotated) image. Unsupervised learning methods establish segmentation models from unlabeled image features and can be roughly divided into matched filter methods [8], vessel-tracking-based segmentation methods [9], and model-based segmentation methods [10]. The matched filter segmentation method convolves a constructed filter with the image to be segmented to achieve matching segmentation. The two-dimensional matched filtering algorithm proposed by Chaudhuri et al. [11] used a Gaussian curve to fit the gray-level profile of the vessel cross-section and generated matched filters in 12 directions to detect blood vessels. Roychowdhury et al. [12] proposed a fundus image segmentation method based on main-vessel extraction and sub-image classification. The method obtains a binary image through a high-pass filter and simultaneously reconstructs an enhanced image of the vascular region. The common region of the two is extracted as the main vessels; the remaining image pixels are classified using a Gaussian Mixture Model (GMM) classifier. This method is suitable for a variety of imaging techniques as long as the pixel intensity distribution is close to Gaussian. Retinal vessels were segmented by fuzzy c-means in Ref. [33]. To cope with uneven illumination and contrast, phase consistency was first computed, which preserves features with in-phase frequency components (such as edges) while suppressing others; accurate segmentation can therefore still be performed when intensity decreases or illumination varies across the image. Jiong et al.
[13] proposed a robust automatic retinal vessel segmentation method with a new filter based on directional fractional 3D rotation. This method obtains a binary segmentation by eigen-analysis and thresholding of the Hessian matrix. Exploiting the continuity of blood vessels, Can et al. [14] developed an adaptive tracking method to segment vessels. Later, Zhang et al. [15] proposed a vessel tracking method based on multi-scale line detection and Bayesian theory; they divided the pixels in the fundus image into four major categories: vessel branch points, vessel intersections, general vessel points, and non-vessel points. Ref. [34] presented a VLSI implementation of a retinal vessel segmentation system while exploring the parameters that affect the power consumption, accuracy, and performance of the system. The design implements an unsupervised vessel segmentation algorithm that uses matched filtering with signed integers to enhance the difference between the blood vessels and the rest of the retina. It accelerates the computation of the binary vessel-tree map through parallel processing and efficient resource sharing, achieving real-time performance and an efficient balance between performance, power consumption, and accuracy. The performance of supervised methods is usually better than that of unsupervised methods. Supervised methods generally classify pixels to achieve segmentation [16]. This approach often uses an extracted feature vector
to train a classifier that recognizes vascular and non-vascular pixels. Franklin et al. [17] proposed dividing image pixels into vascular or non-vascular points according to the size of the retinal blood vessels and then using a multi-layer perceptron neural network to identify and segment the retinal vessels. Marin et al. [18] proposed a new supervised method for retinal blood vessel segmentation in digital images. The method uses a neural network for pixel classification and computes, for each pixel, a feature vector based on gray level and local moment invariants. Li et al. [19] presented a new supervised method for vessel segmentation in retinal images, recasting segmentation as a problem of cross-modality data transformation from retinal image to vessel map. A wide and deep neural network with strong induction ability was proposed to model the transformation, together with an efficient training strategy that makes training easy and convenient. Memari et al. [20] proposed an automatic retinal vessel segmentation method combining matched filtering and an adaptive boosting classifier. The method first enhances the fundus image using morphology, applies histogram equalization to enhance contrast, then enhances the blood vessels with a combination of B-COSFIRE and Frangi matched filters, and finally computes pixel-based statistical features and extracts the vessel network with an AdaBoost classifier. The results were good on the DRIVE, STARE, and CHASE_DB1 data sets, but the average processing time per image was 8.2 minutes. In recent years, deep learning has become a major artificial intelligence research field; convolutional neural networks (CNNs) in particular have rapidly become the preferred tool for medical image analysis and have been widely used in image classification, target detection [21], segmentation, and registration [22].
HED (Holistically Nested Edge Detection) is a typical fully convolutional network [35] proposed by Xie et al. for natural image edge detection, where it performs very well. The segmentation results output by the HED network have good vascular structure without burrs, making it suitable for retinal vessel segmentation across different data sets. However, comparison of the final fused result with the manually annotated image shows that the vessels in the network's segmentation are generally thicker than in the manual annotation, and the very small vessels in the most central area are not easily segmented. Compared with a traditional neural network, a convolutional neural network is still a hierarchical network, but the functions and forms of the layers have changed; it mainly includes the input layer, convolutional layers, ReLU layers, pooling layers, and fully connected layers [23]. Zilly et al. [24] proposed a new CNN for fundus optic disk segmentation, which achieved better segmentation performance than existing methods on the public DRISHTI-GS dataset. Peng et al. [25] proposed a new U-Net-like network, CDNet, based on U-Net [26]. This method uses a new blocking operation to learn finer feature maps while keeping the resolution constant, and achieved good performance on the DRIVE database. Yan et al. [27] proposed a new segment-level loss that emphasizes the thickness consistency of thin vessels during training. By jointly adopting both the segment-
level and the pixel-wise losses, the importance of thick and thin vessels in the loss calculation is better balanced; as a result, more effective features can be learned for vessel segmentation without increasing the overall model complexity. In addition, a retinal vessel segmentation method based on both deep learning and traditional methods [28] has been proposed. This method fuses the vessel probability map obtained at the end of an FCN (Fully Convolutional Network) with the vessel probability map from HED (Holistically Nested Edge Detection), which captures superficial information, to obtain the required retinal vessel segmentation map. However, improving classification accuracy only by increasing the depth of the convolutional neural network may lead to overfitting, longer training time, and inaccurate segmentation results. In summary, although the above methods can segment the retinal blood vessels, optical design defects of the fundus camera often leave a large amount of noise in the collected fundus image, and if the image is not filtered before segmentation, this noise appears in the segmentation result. Moreover, the above methods are still not ideal for identifying small blood vessels, and their accuracy and sensitivity need to be improved. Therefore, this paper proposes a fundus image segmentation method based on a multi-path convolutional neural network.
3. Principle of the algorithm
In previous literature [26], U-Net improved on the FCN (Fully Convolutional Network) and used data augmentation to train on relatively small data sets, especially medical data. The network is mainly composed of two parts, a contracting path and an expanding path, and can be trained on images from small data sets while achieving good results. The U-Net network is characterized by the mutual mapping between the contracting network and the expanding network: during expansion, the missing boundary information is complemented by combining the mapped features of the contracting layers, thereby improving the accuracy of the predicted edge information. However, the U-Net network has a slight deficiency: its segmentation of fine retinal blood vessels is not distinct. In view of this shortcoming, this paper proposes the idea of frequency division and replaces standard convolution with dilated convolution, which expands the receptive field without introducing additional parameters. Different dilation rates yield different receptive fields; that is, multi-scale information is acquired, improving the network's ability to extract vascular feature information and thereby enabling more tiny blood vessels to be segmented. At the same time, to address the performance degradation of deep convolutional neural networks at extreme depths, residual connections are added to further improve the network structure. When the visual system transmits external information to the brain, it does not simply process specific visual information but processes all kinds of visual information in parallel. Similarly, the high-frequency and low-frequency information
in an image mean different things. The low-frequency information is the main information of the whole image; it forms the basic gray level of the image and determines its basic structure. The high-frequency information forms the edges and details of the image, further supplementing the low-frequency content. Therefore, in this paper, the traditional method of Gaussian matched filtering is used to enhance the small blood vessels of the retina. The low-frequency feature information acquired using a Gaussian low-pass filter and the high-frequency feature information acquired using a Gaussian high-pass filter are used as inputs to the multi-path convolutional neural network, and the low-frequency and high-frequency feature extraction paths are constructed accordingly. The low-frequency feature extraction path, mainly composed of convolution and sampling modules, is used to quickly extract general feature information from the retinal blood vessel images. The high-frequency feature extraction path comprises a coding region and a decoding region, also mainly composed of convolution and sampling modules; the feature maps in the coding region are stored and integrated into the corresponding decoding region for the purpose of accurately locating the retinal blood vessels. Finally, the low-frequency and high-frequency segmentation maps are extracted through the respective feature extraction paths constructed in the convolutional neural network, and the final segmentation map is obtained through image information fusion. The principle of the proposed method is shown in Fig. 1.
3.1. Global and local detail extraction
Considering that the low-frequency image mainly reflects the global contour information of the fundus image and has low-dimensional feature attributes, the final segmentation precision can be improved by extracting the low-frequency feature information of the image. The high-frequency image mainly reflects the detailed information of the image and further strengthens the image content based on the low-frequency information. Therefore, the low-frequency and high-frequency information of the fundus image is extracted by a Gaussian low-pass filter and a Gaussian high-pass filter, respectively; the overall contour information and local detail features are obtained and then processed further by the low-frequency and high-frequency feature paths. The original fundus image f(x, y) is convolved with an impulse response function h(x, y) to obtain the image g(x, y), as expressed in Eq. (1):
g(x, y) = f(x, y) * h(x, y)   (1)

The transfer functions of the Gaussian low-pass filter (GLPF) and the Gaussian high-pass filter (GHPF) are H_l(u, v) and H_h(u, v), respectively, as expressed in Eqs. (2) and (3):

H_l(u, v) = exp(−D²(u, v) / (2D₀²))   (2)

H_h(u, v) = 1 − exp(−D²(u, v) / (2D₀²))   (3)
where D(u, v) denotes the distance from (u, v) to the center point of the Fourier transform and D₀ denotes the cutoff frequency.
The low-frequency component G_l(u, v) and the high-frequency component G_h(u, v) can be obtained from the convolution relationship, as shown in Eq. (4):
G_i(u, v) = F(u, v) H_i(u, v),  i = l, h   (4)
Applying the inverse Fourier transform to the low-frequency and high-frequency components yields the filtered low-frequency image g_l(x, y) and high-frequency image g_h(x, y), as shown in Eq. (5):
g_i(x, y) = ℱ⁻¹(F(u, v) H_i(u, v)),  i = l, h   (5)
where ℱ⁻¹ represents the inverse Fourier transform.
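As a concrete illustration, the frequency split of Eqs. (2)–(5) can be sketched in NumPy. This is a minimal sketch, not the authors' implementation; the function name `gaussian_split` and its interface are illustrative.

```python
import numpy as np

def gaussian_split(img, d0):
    """Split a 2-D grayscale image into low- and high-frequency parts.

    img: 2-D array; d0: cutoff frequency D0 of the GLPF/GHPF.
    Returns (g_l, g_h), the filtered low- and high-frequency images.
    """
    rows, cols = img.shape
    F = np.fft.fftshift(np.fft.fft2(img))      # centered spectrum F(u, v)

    # D^2(u, v): squared distance of each frequency sample from the center.
    u = np.arange(rows) - rows / 2
    v = np.arange(cols) - cols / 2
    D2 = u[:, None] ** 2 + v[None, :] ** 2

    H_l = np.exp(-D2 / (2.0 * d0 ** 2))        # Eq. (2): GLPF
    H_h = 1.0 - H_l                            # Eq. (3): GHPF

    # Eq. (4): multiply in the frequency domain; Eq. (5): invert the transform.
    g_l = np.real(np.fft.ifft2(np.fft.ifftshift(F * H_l)))
    g_h = np.real(np.fft.ifft2(np.fft.ifftshift(F * H_h)))
    return g_l, g_h
```

Since H_l(u, v) + H_h(u, v) = 1 at every frequency, the two outputs sum back to the original image, which is a convenient sanity check.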
3.2. Low-frequency feature extraction path
Neural networks can simulate the human brain and process input data hierarchically, with each layer having its own feature extraction task. In addition, CNNs can simulate the process of information transfer between visual cortex cells, which corresponds to the removal of redundancy during the extraction of general features from visual information. Based on this characteristic, a quick detection path is constructed to directly extract the low-frequency feature information contained in the images through convolution and sampling. The convolutional layers extract the local features of the images. To increase the speed of perception, the convolutional layers do not output their results directly but compute via residual mapping, which reduces the number of network parameters and shortens the training time to a certain extent. The pooling layer is mainly used to reduce the dimension of the feature map and achieve secondary feature extraction. This path also restores the feature map to its original dimensions through up-sampling to achieve end-to-end segmentation. As shown in Fig. 2, this path comprises one convolutional layer and four blocks; the convolutional layer has 3 × 3 kernels, 32 in total. Each block comprises two convolutional blocks, one down-sampling module, and one up-sampling module. Each convolutional block is composed of two 3 × 3 convolutional layers with the same number of convolutional kernels. The numbers of convolutional kernels in the four blocks are 64, 128, 64, and 32, respectively. The down-sampling module performs max-pooling and the up-sampling module employs bilinear interpolation. A low-frequency feature map F_l is obtained by inputting the low-frequency image to this path.
3.3. High-frequency feature extraction path
When the visual system processes complex information, each kind of nerve cell in the visual system (except supporting cells such as neuroglia) has its own receptive field, and the larger the receptive field, the more useful information the nerve cell acquires [29]. Inspired by this,
Fig. 1 – Principle of the method for blood vessel segmentation in fundus images based on a multi-path convolutional neural network.
this paper pays more attention to detail in the high-frequency path, which represents local information and mainly captures finer details at different layers. This path is therefore composed of dilated convolutional blocks, which enlarge the receptive field, connect more information elements, and allow the information point in each direction to fully contribute to feature detection, thus acquiring more high-frequency feature information. Considering the efficiency of network training, a path composed of two convolutional layers and seven blocks is constructed, as shown in Fig. 3. Each convolutional layer consists of 32 kernels of size 3 × 3. The seven blocks make up the coding region and the decoding region. Each block in the coding region is composed of two dilated convolutional blocks and one down-sampling module, which gradually reduces the spatial dimension of the image through down-sampling and extracts the general features of blood vessel distribution in the fundus image. Each block in the decoding region is composed of two dilated convolutional blocks and one up-sampling module, which gradually restores the image's detail and spatial dimension through up-sampling for the purpose of accurately locating the blood vessels. The dilated convolutional blocks consist of two
Fig. 2 – Structure diagram of low-frequency feature extraction path.
Fig. 3 – Structure diagram of high-frequency feature extraction path.
3 × 3 dilated convolutional layers with the same number of dilated kernels. The down-sampling module performs max-pooling and the up-sampling module employs bilinear interpolation. The numbers of dilated kernels in the seven blocks are 64, 128, 256, 512, 256, 128, and 64, respectively. To achieve more detailed segmentation, the feature maps in the coding region of this path are stored and integrated into the corresponding decoding region, which improves the precision of the image segmentation network while reducing the operating time of the network during training and testing. A high-frequency feature map F_h is obtained by inputting the high-frequency image into this path.
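A minimal PyTorch sketch of the dilated blocks and of one encoder/decoder step with a stored skip connection follows. The dilation rate of 2, the helper names `encode`/`decode`, and channel-wise concatenation of the skip map are assumptions; the paper does not state these details.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DilatedConvBlock(nn.Module):
    """Two 3x3 dilated convolutions with the same number of kernels."""

    def __init__(self, in_ch, out_ch, dilation=2):
        super().__init__()
        pad = dilation  # keeps the spatial size for a 3x3 kernel
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, padding=pad, dilation=dilation)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=pad, dilation=dilation)

    def forward(self, x):
        return F.relu(self.conv2(F.relu(self.conv1(x))))

def encode(block_a, block_b, x):
    """Coding-region step: two dilated blocks, then max-pool down-sampling.
    The pre-pool map is returned as a skip connection for the decoder."""
    skip = block_b(block_a(x))
    return F.max_pool2d(skip, 2), skip

def decode(block_a, block_b, x, skip):
    """Decoding-region step: bilinear up-sampling, then integrate the
    stored encoder feature map before the two dilated blocks."""
    x = F.interpolate(x, scale_factor=2, mode='bilinear', align_corners=False)
    return block_b(block_a(torch.cat([x, skip], dim=1)))
```

A 3 × 3 kernel with dilation rate r spans r·(3 − 1) + 1 pixels per axis (5 × 5 for r = 2) while keeping the same nine weights, which is the receptive-field enlargement the text relies on.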
3.4. Multi-path fusion coding
The outputs of the high- and low-frequency paths are F_h and F_l, respectively. The output after the concat calculation is shown in Eq. (6):
F_i = sign(d − i) · F_h(relu(d − 1 − i)) + sign(i − d + 1) · F_l(relu(i − d))   (6)

where d represents the number of channels in the high-frequency feature map before fusion, i = 0, 1, 2, ..., d; F_h is the feature map output by the high-frequency feature extraction path; F_l is the feature map output by the low-frequency feature extraction path; sign is the sign function, which outputs 0 when the input is not positive and 1 when it is positive; relu is the ReLU activation function; and F_i represents the ith channel of the fused image.

The blood vessel segmentation effect of the multi-path convolutional neural network is shown in Fig. 4. Fig. 4(a) shows the output segmentation map of the low-frequency feature extraction path; Fig. 4(b) shows the overall output segmentation map of the high-frequency feature extraction path; Fig. 4(c) shows the fusion map of the low-frequency and high-frequency extraction paths; and Fig. 4(d) shows detailed views of the rectangles in Fig. 4(a)–(c). From Fig. 4(a), it can be observed that there are many missed vessels in areas 0 and 3, but the main contour is relatively clear. From Fig. 4(b), small blood vessels are detected in areas 1 and 4, but the main contour is relatively vague. Fusing Fig. 4(a) and (b) yields the combined overall perception and local response result shown in Fig. 4(c). It can be clearly seen from Fig. 4(d) that the feature extraction after fusion is better and the segmentation is more accurate.
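Because the sign gates select high-frequency channels for i < d and low-frequency channels for i ≥ d, the fusion in Eq. (6) reduces to channel concatenation. A small PyTorch sketch (function names illustrative) makes this explicit:

```python
import torch

def sign_gate(x):
    # The paper's sign function: 1 for positive inputs, 0 otherwise.
    return 1 if x > 0 else 0

def fuse(Fh, Fl):
    """Channel-wise fusion: the first d output channels come from the
    high-frequency map, the remaining d from the low-frequency map.
    Equivalent to torch.cat([Fh, Fl], dim=1) when both maps have d channels."""
    d = Fh.shape[1]
    channels = []
    for i in range(2 * d):
        if sign_gate(d - i):           # i < d: take a high-frequency channel
            channels.append(Fh[:, i])
        else:                          # i >= d: take a low-frequency channel
            channels.append(Fl[:, i - d])
    return torch.stack(channels, dim=1)
```

In practice the built-in `torch.cat` would be used; the explicit loop only mirrors the per-channel gating of Eq. (6).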
4. Network training
4.1. Experimental setup
First, the initial cutoff frequency of the Gaussian high-pass filter is set to 30 Hz according to the literature [30,31], and the cutoff frequency of the low-pass filter is set to 200 Hz. The cutoff frequencies are then adjusted up and down while visually inspecting the filtered images to obtain the optimal parameters. Based on 12 sets of experiments, the best low-frequency and high-frequency images are obtained when the cutoff frequencies of the GLPF and GHPF are 150 and 10, respectively. Second, the filtered low- and high-frequency images are input into the low-frequency and high-frequency feature extraction paths, respectively. The obtained feature maps F_l and F_h are fused (corresponding channels are combined) to obtain a 64-channel feature map, and a
Fig. 4 – The blood vessel segmentation effect map of the multi-path convolutional neural network. (a) Segmentation map output by low-frequency feature extraction path; (b) segmentation map output by high-frequency feature extraction path; (c) low-frequency and high-frequency extraction paths fusion map; (d) detail view of the rectangles in (a)–(c).
corresponding feature map F′ is generated using a 1 × 1 convolution with 32 kernels. F′ is converted to a single-channel feature map F″ using a 1 × 1 convolution with one kernel, and, after the ReLU activation function, the final blood vessel segmentation map corresponding to the original image f(x, y) is obtained. This paper uses a 1080 Ti graphics card for training and testing, and the software environment is PyTorch. During training, patches containing blood vessels are randomly cropped from all input images and labels and then scaled to 512 × 512. The initial learning rate is set to 0.01 and is reduced to 90% of its previous value after every 20 epochs. After 200 epochs, the network converges and training stops. The entire training time is about half an hour. The workflow of the multi-path convolutional neural network is shown in Fig. 5: the box on the left represents the training stage of the multi-path convolutional neural network, and the box on the right represents the output model after training.
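The GLPF/GHPF preprocessing described in this section can be sketched with standard frequency-domain Gaussian filters. The transfer functions below are the textbook forms H_lp(u, v) = exp(−D²/(2·D0²)) and H_hp(u, v) = 1 − exp(−D²/(2·D0²)); the function name and the toy image are ours, while the default cutoffs 150 and 10 follow the values chosen experimentally above.

```python
import numpy as np

def gaussian_filter_pair(img, d_low=150.0, d_high=10.0):
    """Split a grayscale image into low- and high-frequency components
    using Gaussian low-/high-pass filters in the frequency domain.

    d_low, d_high : cutoff frequencies of the GLPF and GHPF (the text
    settles on 150 and 10 after 12 sets of experiments).
    """
    h, w = img.shape
    F = np.fft.fftshift(np.fft.fft2(img))
    # Squared distance of each frequency sample from the spectrum centre
    u = np.arange(h) - h / 2.0
    v = np.arange(w) - w / 2.0
    D2 = u[:, None] ** 2 + v[None, :] ** 2
    H_lp = np.exp(-D2 / (2.0 * d_low ** 2))        # Gaussian low-pass
    H_hp = 1.0 - np.exp(-D2 / (2.0 * d_high ** 2)) # Gaussian high-pass
    low = np.real(np.fft.ifft2(np.fft.ifftshift(F * H_lp)))
    high = np.real(np.fft.ifft2(np.fft.ifftshift(F * H_hp)))
    return low, high

# Sanity check: a constant image has no high-frequency content,
# so `high` is ~0 everywhere and `low` reproduces the constant.
img = np.ones((32, 32))
low, high = gaussian_filter_pair(img)
```

The two outputs correspond to the inputs of the low- and high-frequency feature extraction paths, respectively.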
4.2. Loss function

In this paper, the stochastic gradient descent optimizer is used to optimize the parameters of the high- and low-frequency feature extraction paths. In binary classification tasks, cross-entropy is usually used as the loss function. However, retinal blood vessel segmentation is a binary classification task in which positive and negative sample points are extremely unbalanced: most pixels are background (non-vessel, negative) points, and only a small number are vessel (positive) points. For this reason, a weighted cross-entropy is used as the loss function in this paper. This loss function, denoted loss, is expressed by Eq. (7):
loss = −v · label · log(S(M)) − (1 − label) · log(1 − S(M))    (7)
where label is the sample's label, S(·) denotes the sigmoid function, M denotes the network's output, and the weight v is usually defined as the ratio of negative sample points to positive sample points. Assigning a larger weight to the vessel class improves the network's true positive rate, reduces the miss rate, and enables the network to converge more effectively.
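Eq. (7) can be sketched as follows; the clipping constant and function name are ours, and v is computed from the label map as the negative-to-positive ratio described above.

```python
import numpy as np

def weighted_bce(M, label, eps=1e-7):
    """Weighted cross-entropy of Eq. (7).

    M     : raw network outputs (logits).
    label : ground truth, 1 = vessel point, 0 = background point.
    """
    S = 1.0 / (1.0 + np.exp(-M))    # sigmoid S(M)
    S = np.clip(S, eps, 1.0 - eps)  # numerical safety (ours)
    n_pos = label.sum()
    n_neg = label.size - n_pos
    v = n_neg / max(n_pos, 1)       # negative-to-positive ratio
    loss = -(v * label * np.log(S) + (1 - label) * np.log(1 - S))
    return loss.mean()

# Toy example: 1 vessel point among 4 pixels, zero logits (S(M) = 0.5)
label = np.array([1.0, 0.0, 0.0, 0.0])
M = np.zeros(4)
# v = 3, so loss = (3*log 2 + 3*log 2) / 4 = 1.5 * log 2
L = weighted_bce(M, label)
```

Up-weighting the sparse vessel class this way is what counteracts the extreme class imbalance described above.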
Fig. 5 – Multi-path network workflow flowchart.
5. Results and discussion

5.1. Database
In this paper, two common databases, DRIVE and CHASE_DB1, are used as experimental subjects. The DRIVE database comes from a diabetic retinopathy screening project in the Netherlands. The screening population included 400 diabetic patients between the ages of 25 and 90. The researchers randomly selected 40 images as the entire dataset, 7 of which are lesion images and the rest normal images; the resolution of each image is 565 × 584. Among them, 20 are training images, 10 are validation images, and 10 are test images. The CHASE_DB1 image library is a set of retinal images from the Child Heart and Health Study in England (CHASE). Each image has a resolution of 999 × 960 pixels, and there are 28 images in total: 14 retinal images of the left eye and 14 of the right eye. The training set contains 20 images, the validation set 4 images, and the testing set 4 images. Two experts provided manual segmentations; this paper uses the first expert's manual annotation as the segmentation standard. Compared with the DRIVE database, CHASE_DB1 does not uniformly standardize the background brightness of all fundus images, and the contrast between the background and the vascular targets is relatively low, so it can test the applicability and robustness of a retinal vessel segmentation algorithm.
5.1.1. DRIVE database
Five fundus images with different characteristics are selected from the 20 images in the test set of the DRIVE database for comparative analysis of the corresponding experimental results. The first is a fundus image with high contrast, the second a fundus image with lesions, the third a fundus image with low contrast, the fourth a fundus image with complex blood vessels, and the fifth a fundus image similar to the fourth. To facilitate a horizontal comparison of the methods, the experimental results of a machine-learning-based retinal vessel segmentation method are also presented. As shown in Fig. 6, the first row contains the original images from the image library, the second row the standard maps of blood vessels manually segmented by the first expert, the third row the segmentation maps obtained using the method proposed in paper [12], and the fourth row the segmentation maps obtained using the method proposed in this paper.
5.1.2. CHASE_DB1 database
Fig. 7 shows four fundus images selected from the test set of the CHASE_DB1 database. The first row contains the original fundus images, the second row the manual segmentation maps (standard reference images), the third row the segmentation maps obtained using the algorithm proposed in paper [19], and the fourth row the segmentation maps obtained using the algorithm proposed in this paper. From the results in Figs. 6 and 7, it can be concluded that the methods of Refs. [12] and [19] retain some noise while extracting small blood vessels and lose the connectivity of some vessels, so their segmentation of small vessels is not ideal and contains more vessel breaks. Compared with these methods, the method proposed in this paper can effectively suppress noise and ensure the continuity of the segmented vessels while segmenting small blood vessels; it has a good generalization effect and can be applied to blood vessel segmentation under various conditions.
Fig. 6 – DRIVE dataset segmentation results. 1st row: original images in the library; 2nd row: standard maps obtained by manual segmentation by the first expert; 3rd row: segmentation maps of the method proposed in paper [12]; 4th row: segmentation maps of the method proposed in this paper.
5.2. Quantitative analysis

The segmentation results of the DRIVE and CHASE_DB1 validation sets are quantitatively evaluated. The first expert's manual segmentation maps are taken as the gold standard, and three performance indicators are used to measure the method. The accuracy (Acc) is the proportion of correctly classified pixels among all pixels of the image. The sensitivity (SE) is the ratio of correctly classified vessel points to all gold-standard vessel points. The specificity (SP) is the ratio of correctly classified background points to all gold-standard background points. In addition to the above indexes, the receiver operating characteristic (ROC) curve is also an important measure of the accuracy of blood vessel segmentation; it plots SE on the ordinate against SP on the abscissa. The area under the curve (AUC) is an important indicator of segmentation performance: the closer the AUC value is to 1, the better the
performance of the corresponding blood vessel segmentation method. Therefore, this paper also includes the AUC value among the evaluation indexes. To evaluate the proposed multi-path convolutional neural network, we conducted experiments on the DRIVE and CHASE_DB1 datasets and compared the results with current state-of-the-art methods. Subjective and objective comparison results of the different methods are shown in Table 1 and Figs. 8 and 9. For the DRIVE dataset, the proposed network achieves 0.9580, 0.8639, 0.9665, and 0.9560 for Acc, SE, SP, and AUC, respectively. Compared with other state-of-the-art methods, the proposed method achieves the best SE while maintaining a good Acc. The retinal vessel segmentation results are further compared quantitatively. As can be seen from Fig. 8, although most of the Acc values of Ref. [18] lie between 0.945 and 0.958, its box is longer and its stability is poor. In contrast, the minimum Acc value of our method is 0.9531, and most of the values lie between 0.955 and 0.963. Reference [17] divides
Fig. 7 – CHASE_DB1 dataset segmentation results. 1st row: original fundus images; 2nd row: manual segmentation maps (standard reference images); 3rd row: segmentation maps of the algorithm proposed in paper [19]; 4th row: segmentation maps of the algorithm proposed in this paper.
the image pixels into vascular and non-vascular points according to the size of the retinal blood vessels, and then uses a multilayer perceptron neural network to identify and segment the retinal vessels. This approach improves the performance indexes SE and SP to some extent, but its accuracy is slightly lower. Reference [25] proposed a U-net-based model called CDNet, which achieved good results on the DRIVE database, but its generalization performance was not verified. Reference [27] does not increase the complexity of the overall model and considers the balance between thick and thin blood vessels. Its maximum SP and AUC values are better than those of the algorithm in this paper, but its box is longer, and its Acc and SE are lower than those of our algorithm. In general, the boxes of our method's performance indexes are shorter, indicating better stability and robustness, and its SE is improved. Fig. 9 compares the segmentation performance indicators of the different methods on the CHASE_DB1 dataset: Fig. 9(a) shows the ROC curves and Fig. 9(b) the segmentation SE. It can be seen from Fig. 9(a) that the ROC curve of Ref. [27] is biased more toward the upper-left area of the coordinate system. As can be seen from Table 1,
Table 1 – Comparison of average performance of different methods on the DRIVE and CHASE_DB1 datasets.

| Dataset   | Methods                  | Year | Acc    | SE     | SP     | AUC    |
|-----------|--------------------------|------|--------|--------|--------|--------|
| DRIVE     | Marin et al. [18]        | 2011 | 0.9452 | 0.7067 | 0.9801 | 0.9588 |
|           | Franklin et al. [17]     | 2014 | 0.9408 | 0.7753 | 0.9311 | 0.7628 |
|           | Roychowdhury et al. [12] | 2014 | 0.9519 | 0.7249 | 0.9830 | 0.9620 |
|           | Li et al. [19]           | 2016 | 0.9527 | 0.7569 | 0.9816 | 0.9738 |
|           | Memari et al. [20]       | 2017 | 0.9722 | 0.8726 | 0.9884 | 0.9795 |
|           | Shao et al. [25]         | 2018 | 0.9629 | 0.8112 | 0.9843 | 0.9841 |
|           | Yan et al. [27]          | 2018 | 0.9542 | 0.7653 | 0.9818 | 0.9752 |
|           | Proposed                 | 2019 | 0.9580 | 0.8639 | 0.9665 | 0.9560 |
| CHASE_DB1 | Marin et al. [18]        | 2011 | –      | –      | –      | –      |
|           | Franklin et al. [17]     | 2014 | –      | –      | –      | –      |
|           | Roychowdhury et al. [12] | 2014 | 0.9530 | 0.7201 | 0.9824 | 0.9532 |
|           | Li et al. [19]           | 2016 | 0.9581 | 0.7507 | 0.9793 | 0.9716 |
|           | Memari et al. [20]       | 2017 | 0.9482 | 0.8192 | 0.9591 | 0.9482 |
|           | Shao et al. [25]         | 2018 | –      | –      | –      | –      |
|           | Yan et al. [27]          | 2018 | 0.9610 | 0.7633 | 0.9809 | 0.9718 |
|           | Proposed                 | 2019 | 0.9601 | 0.8778 | 0.9680 | 0.9577 |
Fig. 8 – Box plot statistics for different evaluation results of the evaluation indicators for the DRIVE dataset.
the AUC value of the algorithm in this paper is 0.9577, which is slightly lower than the 0.971 of Ref. [27]. However, as can be seen from Fig. 9(b), the sensitivity of the proposed algorithm is significantly better than that of the comparison methods. From Table 1, the sensitivities of Refs. [12], [19], and [27] are 0.7201, 0.7507, and 0.7633, respectively, while the sensitivity of the proposed method is 0.8778, significantly higher than the other comparison methods. This further demonstrates that the segmentation results of this method have clear advantages and that the overall segmentation effect is better.
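The indicators used throughout Section 5.2 can be computed from the confusion counts of a binary segmentation map against the gold standard. A minimal sketch (function name and toy maps are ours):

```python
import numpy as np

def segmentation_metrics(pred, gold):
    """Acc, SE and SP as defined in Section 5.2.

    pred, gold : binary maps, 1 = vessel point, 0 = background point.
    SE is computed over gold-standard vessel points, SP over
    gold-standard background points.
    """
    pred = pred.astype(bool)
    gold = gold.astype(bool)
    tp = np.sum(pred & gold)    # vessel points assigned correctly
    tn = np.sum(~pred & ~gold)  # background points assigned correctly
    fp = np.sum(pred & ~gold)
    fn = np.sum(~pred & gold)
    acc = (tp + tn) / (tp + tn + fp + fn)  # fraction of correct pixels
    se = tp / (tp + fn)                    # sensitivity
    sp = tn / (tn + fp)                    # specificity
    return acc, se, sp

# Toy example: 5 pixels; one vessel hit, one miss, one false alarm
gold = np.array([1, 1, 0, 0, 0])
pred = np.array([1, 0, 0, 0, 1])
acc, se, sp = segmentation_metrics(pred, gold)  # acc 0.6, se 0.5, sp 2/3
```

AUC additionally requires sweeping a threshold over the network's soft output rather than a single binary map, which is why the ROC curves in Fig. 9(a) are reported separately.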
Fig. 9 – Comparison of the performance of different retinal blood vessel segmentation algorithms. (a) ROC curves for retinal blood vessel segmentation in fundus images and (b) segmentation sensitivity for the CHASE_DB1 dataset.
6. Conclusion
Medical image processing based on computer-aided diagnosis techniques has been a focal area of research in recent years. Computerized image processing technologies have been used to analyze and process fundus images, segment retinal blood vessels in fundus images, and achieve the objective of computer-aided disease diagnosis, significantly reducing physicians' repeated work and misdiagnosis rates. A method for retinal blood vessel segmentation in fundus images based on a multi-path convolutional neural network is proposed in this paper. First, the original image is pre-processed: using a Gaussian low-pass filter and a Gaussian high-pass filter, a low-frequency image containing global feature information and a high-frequency image containing local feature information are obtained, respectively. Then the pre-processed images are input into the constructed multi-path convolutional neural network, and the final segmentation map is obtained. Taking the extraction of retinal images with complicated vessels or small lesions as an example, this paper simulates the neural coding mechanism of multi-path information processing in the primary visual cortex and segments the retinal blood vessels in different images using the constructed multi-path convolutional neural network. While keeping the training time reasonable, this method segments accurately and is superior to other blood vessel segmentation methods that retain some noise during segmentation or achieve less remarkable results. It shows application potential in biomedical image processing and provides a new idea for subsequent image processing and analysis. However, its Acc has no obvious advantage, and its SP is low, which may be caused by some background segmentation errors. In addition, the data set selected in this paper is limited
and its generalization cannot be guaranteed. Further research is needed in future work.
References
[1] Zhu CZ, Zou BJ, Zhao RC, et al. Retinal vessel segmentation in colour fundus images using extreme learning machine. Comput Med Imag Graph 2017;55:68–77.
[2] Jiang ZX, Juan Y, Sen A, et al. Fast, accurate and robust retinal vessel segmentation system. Biocybern Biomed Eng 2017;37(3):412–21.
[3] Xie S, Nie H. Retinal vascular image segmentation using genetic algorithm plus FCM clustering. Third International Conference on Intelligent System Design & Engineering Applications; 2013.
[4] Chakraborti T, Jha DK, Chowdhury AS, et al. A self-adaptive matched filter for retinal blood vessel detection. Mach Vis Appl 2015;26(1):55–68.
[5] Soomro TA, Khan TM, Khan MAU, et al. Impact of ICA-based image enhancement technique on retinal blood vessels segmentation. IEEE Access 2018;3524–30.
[6] Vlachos M, Dermatas E. Multi-scale retinal vessel segmentation using line tracking. Comput Med Imag Graph 2010;34(3):213–27.
[7] Fraz MM, Remagnino P, Hoppe A, et al. An ensemble classification-based approach applied to retinal blood vessel segmentation. IEEE Trans Biomed Eng 2012;59(9):2538–48.
[8] Al-Rawi M, Qutaishat M, Arrar M. An improved matched filter for blood vessel detection of digital retinal images. Comput Biol Med 2007;37(2):262–7.
[9] Yin Y, Adel M, Bourennane S. Retinal vessel segmentation using a probabilistic tracking method. Pattern Recogn 2012;45(4):1235–44.
[10] Kaba D, Salazar-Gonzalez AG, Li Y, et al. Segmentation of retinal blood vessels using Gaussian mixture models and expectation maximisation. Lecture Notes Comput Sci 2013;7798:105–12.
[11] Chaudhuri S, Chatterjee S, Katz N, et al. Detection of blood vessels in retinal images using two dimensional matched filters. IEEE Trans Med Imag 1989;8(3):263–9.
[12] Roychowdhury S, Koozekanani DD, Parhi KK. Blood vessel segmentation of fundus images by major vessel extraction and subimage classification. IEEE J Biomed Health Inform 2014;19(3):1118–28.
[13] Zhang J, Dashtbozorg B, Bekkers E, et al. Robust retinal vessel segmentation via locally adaptive derivative frames in orientation scores. IEEE Trans Med Imag 2016;35(12):2631–44.
[14] Can A, Shen H, Turner JN, et al. Rapid automated tracing and feature extraction from retinal fundus images using direct exploratory algorithms. IEEE Trans Inform Technol Biomed 1999;3(2):125–38.
[15] Zhang J, Li H, Nie Q, et al. A retinal vessel boundary tracking method based on Bayesian theory and multiscale line detection. Comput Med Imag Graph 2014;38(6):517–25.
[16] You XG, Peng MQM, Yuan Y, et al. Segmentation of retinal blood vessels using the radial projection and semi-supervised approach. Pattern Recogn 2011;44(10):2314–24.
[17] Franklin SW, Rajan SE. Computerized screening of diabetic retinopathy employing blood vessel segmentation in retinal images. Biocybern Biomed Eng 2014;34(2):117–24.
[18] Marin D, Aquino A, Bravo JM, et al. A new supervised method for blood vessel segmentation in retinal images by using gray-level and moment invariants-based features. IEEE Trans Med Imag 2011;30(1):146–58.
[19] Li Q, Feng B, Xie LP, et al. A cross-modality learning approach for vessel segmentation in retinal images. IEEE Trans Med Imag 2016;35(1):109–18.
[20] Memari N, Ramli AR, Saripan MIB, et al. Supervised retinal vessel segmentation from color fundus images based on matched filtering and AdaBoost classifier. PLOS ONE 2017;12(12):e0188939.
[21] Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation. IEEE Trans Pattern Anal Mach Intell 2014;39(4):640–51.
[22] Dai L, Fang R, Li HT, et al. Clinical report guided retinal microaneurysm detection with multi-sieving deep learning. IEEE Trans Med Imag 2018;37(5):1149–61.
[23] Ngo L, Han JH. Advanced deep learning for blood vessel segmentation in retinal fundus images. Fifth International Winter Conference on Brain–Computer Interface (BCI); 2017.
[24] Zilly JG, Buhmann JM, Mahapatra D. Boosting convolutional filters with entropy sampling for optic cup and disc image segmentation from fundus images. International Workshop on Machine Learning in Medical Imaging (MICCAI); 2015. pp. 136–43.
[25] Peng S, Zheng CX, Xu F, et al. Blood vessels segmentation by using CDNet. 2018 Third IEEE International Conference on Image, Vision and Computing; 2018. pp. 305–10.
[26] Ronneberger O, Fischer P, Brox T. U-Net: convolutional networks for biomedical image segmentation. International Conference on Medical Image Computing and Computer-Assisted Intervention; 2015. pp. 234–41.
[27] Yan Z, Yang X, Cheng KTT. Joint segment-level and pixel-wise losses for deep learning based retinal vessel segmentation. IEEE Trans Biomed Eng 2018;65(9):1912–23.
[28] Paul R, Hawkins SH, Hall LO, et al. Combining deep neural network and traditional image features to improve survival prediction accuracy for lung cancer patients from diagnostic CT. 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC); 2016. pp. 2570–5.
[29] Hofmanninger J, Langs G. Mapping visual features to semantic profiles for retrieval in medical imaging. IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2015. pp. 457–65.
[30] Li LJ, Chew ZJ. Printed circuit board based memristor in adaptive lowpass filter. Electron Lett 2012;48(25):1610–1.
[31] Yu YB, Yang NJ, Yang CY, et al. Memristor bridge-based low pass filter for image processing. J Syst Eng Electron 2019;30(3):448–55.
[32] Owen CG, Rudnicka AR, Mullen R, et al. Measuring retinal vessel tortuosity in 10-year-old children: validation of the computer-assisted image analysis of the retina (CAIAR) program. Invest Ophthalmol Vis Sci 2009;50(5):2004–10.
[33] Mapayi T, Tapamo JR, Viriri R. Retinal vessel segmentation: a comparative study of fuzzy C-means and sum entropy information on phase congruency. Int J Adv Robot Syst 2015;12(9):133.
[34] Koukounis D, Ttofis C, Papadopoulos A, et al. A high performance hardware architecture for portable, low-power retinal vessel segmentation. Integration, the VLSI Journal 2014;47(3):377–86.
[35] Xie S, Tu Z. Holistically-nested edge detection. Proceedings of the IEEE International Conference on Computer Vision; 2015. pp. 1395–403.