Highlights

• A multi-scale sparse coding based learning algorithm is proposed for effectively learning the individualized retinal background.
• A repeated learning strategy is proposed for improving the accuracy of the individualized retinal background.
• A feasible approach is developed for detecting both salient and weak retinal lesions.
Abnormality Detection in Retinal Image by Individualized Background Learning
Benzhi Chen^a, Lisheng Wang^a,*, Xiuying Wang^b, Jian Sun^c, Yijie Huang^a, Dagan Feng^b, Zongben Xu^c

^a Department of Automation, Shanghai Jiao Tong University, and the Key Laboratory of System Control and Information Processing, Ministry of Education, Shanghai 200240, P. R. China
^b School of Information Technologies, The University of Sydney, Sydney, Australia
^c School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an 710049, China

* Corresponding author: [email protected]
Abstract
Computer-aided lesion detection (CAD) techniques, which offer potential for the automatic early screening of retinal pathologies, are widely studied in retinal image analysis. While many CAD approaches based on lesion samples or lesion features can detect pre-defined lesion types well, it remains challenging to detect various abnormal regions (namely abnormalities) from retinal images. In this paper, we identify diverse abnormalities in a retinal test image by finely learning its individualized retinal background (IRB), on which retinal lesions superimpose. A total of 3150 normal retinal images are collected as priors for IRB learning. A preprocessing step is applied to all retinal images for spatial, scale and color normalization. Retinal blood vessels, which vary between individuals, are particularly suppressed from all images. A multi-scale sparse coding based learning (MSSCL) algorithm and a repeated learning strategy are proposed for finely learning the IRB. With the MSSCL algorithm, a background space is constructed by sparsely encoding the test image in a multi-scale manner using a dictionary learned from normal retinal images; this space contains more complete IRB information than any single-scale coding result. From the background space, the IRB can be well learned by low-rank approximation, and thus different salient lesions can be separated and detected. The MSSCL algorithm is then iteratively repeated on a modified test image in which the detected salient lesions are suppressed, so as to further improve the accuracy of the IRB and suppress lesions in it. Consequently, a high-accuracy IRB can be learned, and thus both salient lesions and weak lesions that have low contrasts with the background can be clearly separated. The effectiveness and contributions of the proposed method are validated by experiments over different clinical data sets and comparisons with state-of-the-art CAD methods.
Keywords: Retinal abnormality detection, Retinal lesion detection, Computer-aided detection, Dictionary learning, Background learning, Retinal image reading
1. Introduction
Retinal pathologies such as drusen, diabetic retinopathy and macular degeneration, whether primary or associated with other diseases, age-related or occurring in young children, are among the leading causes of visual loss. With the wide availability of retinal imaging scanners, early screening has become a universal approach to significantly reducing the risk of blindness [1].
To reduce the workload of medical experts and the costs incurred in routine screening, extensive efforts have been devoted over the past two decades to developing computer-aided lesion detection (CAD) techniques for more efficient and automated lesion detection from large quantities of medical images [2, 3]. Either trained with sample images or based on the image features of specific types of lesions, current CAD techniques are inherently designed to be disease-customized. While a disease-customized CAD may provide an effective computational solution for detecting a predefined type of lesion, it often lacks the capacity to detect different types of lesions from the same medical image. From a clinical-diagnosis perspective, this is a serious flaw, and CAD techniques capable of detecting various types of lesions are needed. In this paper, we develop an effective technique to detect various types of abnormal regions from retinal images.
Automated detection of a diverse range of abnormal regions from retinal images is generally challenging, for the major reasons outlined below. As illustrated in Fig. 1, (1) different types of retinal lesions might be contained in a single retinal image, and they often display substantial variability in shape, size, color, texture and location; (2) retinal blood vessel structures may exhibit significant differences across individuals, and there are also individual differences among healthy areas at similar locations in different images; (3) images with similar features, albeit in different regions, can receive different evaluations of abnormality; and (4) the number and types of retinal lesions in a retinal image are generally unknown in advance, and different retinal images often have different imaging qualities. These factors pose significant challenges for a computational approach to detect and differentiate the diverse abnormal regions in retinal images.

Figure 1: A retinal image may contain multiple types of lesions ((a)-(e)), and different retinal images may contain different types of lesions ((f)-(j)). Different lesions can exhibit diverse properties in shapes, sizes, colors, and positions.
Experienced ophthalmologists, however, can visually detect various abnormal regions from retinal images even when these regions have a variety of different imaging features. In particular, they can detect types of abnormalities never seen before. This human intelligence is built upon years of training and ample prior knowledge. During training, ophthalmologists observe a large number of normal retinal images and learn their characteristics. When reading a retinal image, they implicitly compare the case with their prior knowledge of similar normal cases, and accordingly make a judgement. Inspired by such observations, we collected 3150 normal retinal images (denoted by Ψ) to serve as prior knowledge of normal retinal images. We detect various retinal abnormalities based on such priors rather than on lesion samples or lesion features.
Even between normal retinal images, significant individual differences can exist, for instance in blood vessel structure. Additionally, different imaging conditions, image colors, image sizes and spatial positions of anatomical structures can contribute to large differences among retinal images, as shown in Fig. 2. To suppress bias caused by these factors, a preprocessing step is applied to all retinal images, normalizing image scales, image colors and the spatial positions of anatomical structures. In particular, retinal blood vessels, which usually show large individual variations across retinal images, are suppressed from all images, and their regions are smoothly filled based on the colors of the surrounding region. In such a case, a retinal test image with abnormal regions can be regarded as the superposition of two different parts: abnormal regions and the normal retinal background, as illustrated in Fig. 3. Therefore, for a given retinal test image, retinal abnormality detection can be converted into the following problem: how to compute its corresponding normal retinal background? Once we have the retinal background of a test image, various retinal abnormal regions can be detected and separated from the image. In this paper, for each given retinal test image, we learn its high-accuracy individualized retinal background (IRB), so as to detect from the test image both salient lesions and weak lesions that have low contrasts with the IRB.

Figure 2: Normal retinal images from different persons. They may have different colors and scales, and are usually not aligned spatially. Blood vessels in them may also have great individual differences.

Figure 3: After removing blood vessels, a retinal image with lesions (i.e., (b)) can be regarded as the superposition of lesions (i.e., (d)) on its retinal background (i.e., (c)). (a) the original retinal image. (b) the retinal image without blood vessels. (c) the retinal background. (d) lesion regions.
We develop a novel computational approach to learn the IRB from the retinal test image and Ψ, in which a multi-scale sparse coding based learning (MSSCL) algorithm and a repeated learning strategy are proposed for the fine learning of the IRB. The MSSCL algorithm consists of two steps. First, a dictionary is learned from the normal retinal images Ψ, and the test image is sparsely encoded in a multi-scale way. Here, any single-scale coding result is only a rough approximation of the IRB, either containing no lesions but having a low approximation accuracy, or having a high accuracy but containing many lesions. The multi-scale coding results, however, together construct a background space that contains more complete information for IRB learning. Second, the IRB is learned from the background space by low-rank approximation, and thus different salient lesions are separated and detected from the retinal test image. Some images among the multi-scale coding results, however, may encode some information of salient lesions, and these lesions will be reflected in the IRB learned by the MSSCL algorithm. Thus, the repeated learning strategy is further applied, by which the MSSCL algorithm is iteratively repeated on a modified test image in which the detected salient lesions are suppressed, so as to further improve the accuracy of the IRB and suppress lesions in it. Consequently, a high-accuracy IRB containing almost no lesions can be learned, and thus both salient lesions and weak lesions can be well separated from the retinal test image. The effectiveness and advantages of the proposed method are validated by extensive experiments and comparisons with state-of-the-art CAD methods, such as those in [4, 5, 6, 7, 8, 9].
The main contributions of this paper are as follows: (i) the MSSCL algorithm is proposed for effectively learning the IRB; (ii) a repeated learning strategy is proposed for improving the accuracy of the IRB; (iii) a feasible approach is developed for detecting both salient and weak retinal lesions.
2. Related works
The existing CAD techniques for retinal lesion detection can be mainly divided into feature-based and training-based methods. In the former, predefined types of lesions are detected from retinal images by handcrafted features [10, 11, 12, 13, 14]. For instance, peaks of directional cross-section profiles centered on local maximum pixels of the image, together with statistical measures of the size, height and shape of the peaks, are used as the feature set of a Bayesian classifier for detecting retinal microaneurysms in [10]. In [11], five handcrafted features, including holistic texture features and local retinal features, are specifically applied to the predefined types of lesions. In the latter, samples are collected for a specific type of retinal lesion, and the lesion features are then learned from these samples for detecting that type of lesion in retinal images [5, 15, 16]. These methods are essentially customized for a predefined type of lesion and thereby may lack the compatibility or capacity to detect other types of retinal lesions. In particular, they cannot detect retinal lesions of unseen types. Furthermore, different types of retinal lesions have different shapes, sizes, colors, textures and positions, so it is difficult to learn features common to all types of retinal lesions. In addition, it is difficult to collect sufficient training data with expert-labelled lesion regions for all possible lesion types. Thus, CAD techniques based on lesion samples or lesion features face challenges in detecting various abnormal regions from retinal images. This motivates researchers to develop new methods for retinal abnormality detection that do not rely on retinal lesion samples or lesion features.
More recently, multiple lesion detection based on normal images has been investigated in [4, 7, 8, 9, 17, 18, 19, 20, 21]. In [7, 17], an atlas or an average retinal model is computed from normal images, and abnormalities are detected by comparing each test image with the atlas or the average model. The fixed atlas or average model, however, usually cannot adapt to individual differences in different normal images, and hence weak lesions cannot be easily distinguished from normal individual differences. In [18, 21], global features are learned from normal medical images and then used to discriminate normal images from abnormal ones with specific types of lesions. However, such features are not suitable for detecting all types of lesions. In [19], a large database of normal chest radiographs is collected, and lung nodules are identified by subtracting a similar image found in the database from the target image. Since this method relies on a huge database of approximately 15000 normal chest radiographs, its performance may vary when working with a much smaller database. In [20], a hyper-volume is generated from a set of normal MRI brain images, and abnormalities are detected by mapping the image into the hyper-volume. While this method can detect multiple lesions from MRI brain images, its detection accuracy has a dice value of less than 0.67, which needs to be further improved. In [4], multiple sclerosis lesions are segmented from brain MRI images based on normal brain MRIs. In this method, all normal brain MRI images are divided into small patches (3×3×3) to form a training set, from which a dictionary is learned and used to reconstruct each patch of the test image. The regions with large reconstruction errors are then marked as lesions. This method, however, may face challenges on retinal images because many retinal abnormal regions can also be reconstructed with small errors. In [8], the retinal background is reconstructed by orthogonal basis vectors learned from normal retinal images rather than from their small patches, and abnormal regions are marked by the reconstruction residual. However, it is challenging for this method to detect both large salient retinal lesions and weak lesions. The method in [9] detects retinal abnormality in a weakly-supervised manner, but the learned retinal background is too smooth to contain normal individual differences. The current background learning methods thus face challenges in either distinguishing weak lesions from many false positives or losing salient lesions.
In comparison to the aforementioned methods based on lesion samples and lesion features, our proposed method is able to detect various abnormal regions, including unknown types, from retinal images more effectively. Compared with the methods for multiple lesion detection, the proposed method has the capacity to learn a highly accurate IRB that rarely contains any lesions from the test image. Our method is capable of detecting both salient and weak lesions from the retinal test image.

Figure 4: The flowchart of the preprocessing step. (a)-(b) input the reference image and a retinal image. (c) normalization in scale and spatial position. (d) normalization in color. (e) removing blood vessels and filling their regions with surrounding colors.
3. Preprocessing of retinal images
Retinal images in Ψ cannot be directly used for retinal background learning; a preprocessing step is first applied to all retinal images. Let F denote a retinal test image with various abnormal regions. For all retinal images in Ψ ∪ {F}, the imaging bias due to different imaging conditions is suppressed by normalizing the images in scale, color and spatial position; in addition, the blood vessels are removed from all images and their regions are smoothly filled with the surrounding colors. The flowchart of the preprocessing is illustrated in Fig. 4.
Spatial alignment - Spatial alignment enables us to compare similar anatomical regions across different retinal images. Each retinal image has two centers: the center of the optic disk and the center of the fovea [22][23], as illustrated in Fig. 3(a). The two centers can be automatically detected from retinal images by the U-net deep network [6]. In this paper, all retinal images in Ψ ∪ {F} are aligned spatially according to these two centers. Meanwhile, the image scales are also normalized.
Color normalization - Normalization of image colors and scales ensures that we can process retinal images taken under different imaging conditions. The colors of all retinal images are normalized by formula (1) [24], so that these images have color distributions similar to a reference retinal image with mean µ1 and variance σ1:

$$F_{new} = \frac{\sigma_1}{\sigma_2}\,(F_2 - \mu_2) + \mu_1 \qquad (1)$$

where F2 is a fundus image with mean µ2 and variance σ2, and F_new is its normalized image.
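As a concrete illustration, the following sketch applies formula (1) with NumPy. Applying it independently to each color channel is our assumption here; the color-transfer method cited as [24] may operate in a different color space.

```python
import numpy as np

def normalize_color(image, reference):
    """Match the mean and standard deviation of `image` to those of
    `reference`, per channel, following formula (1). Inputs are float
    arrays of shape (H, W, 3)."""
    out = np.empty_like(image, dtype=np.float64)
    for c in range(image.shape[-1]):
        mu2, sigma2 = image[..., c].mean(), image[..., c].std()
        mu1, sigma1 = reference[..., c].mean(), reference[..., c].std()
        out[..., c] = (sigma1 / sigma2) * (image[..., c] - mu2) + mu1
    return np.clip(out, 0.0, 255.0)
```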
Blood vessels removal - Experienced doctors know the anatomical structures in the image, so they can easily ignore individual differences of the same anatomical structure across persons. Blood vessel removal achieves a similar effect. Individual differences of blood vessels are suppressed by first detecting and removing blood vessels from all retinal images and then filling the blood vessel regions with neighboring values. Here, blood vessels are detected by the methods in [6][25], and blood vessel regions are filled by the method in [26].
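A minimal sketch of this step is given below, assuming a binary vessel map from a segmentation network; OpenCV's Telea inpainting is used here as a stand-in for the filling method of [26].

```python
import cv2
import numpy as np

def suppress_vessels(image, vessel_mask):
    """Remove detected vessels and fill their pixels smoothly from the
    surrounding retina. `vessel_mask` is a binary map from a vessel
    detector (e.g. a U-net); nonzero pixels are treated as vessels."""
    mask = (vessel_mask > 0).astype(np.uint8) * 255
    # Telea fast-marching inpainting with a 5-pixel radius.
    return cv2.inpaint(image, mask, 5, cv2.INPAINT_TELEA)
```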
To some extent, the preprocessing above mimics the normalization operations implicitly performed in the human visual system.
After the above preprocessing, the test image F becomes F_BL, which is the superposition of various abnormal regions (denoted by F_L) on the retinal background of F (denoted by F_B). F_B is exactly the IRB we aim to compute. Meanwhile, each normal retinal image in Ψ becomes its retinal background, so Ψ becomes the set Ψ_B that contains the retinal backgrounds of all normal images in Ψ. The F_BL of some abnormal retinal images are shown in Fig. 5(a)-(d), and some typical background images in Ψ_B are shown in Fig. 5(e)-(h). F_L will be separated from F_BL by learning the IRB F_B from F_BL based on the prior Ψ_B. For convenience of discussion, all images in Ψ_B ∪ {F_BL} are converted into gray-scale images by extracting the green channel of each color image; in retinal images, the green channel usually provides a better contrast between retinal lesions and background regions, as shown in [9].

Figure 5: Pre-processing results of some retinal images. (a)-(d) are F_BL of the four abnormal images in Figs. 1(a)-(d), respectively. (e)-(h) are background images of four different normal images in Ψ.
4. A computational approach for learning the IRB
The IRB F_B can be regarded as an image that satisfies the following conditions:

(i) F_BL = F_L ⊕ F_B, where ⊕ denotes the superposition of F_L onto F_B.

(ii) Ψ_B provides priors of F_B. In other words, F_B is similar to the images in Ψ_B, but has individual differences.

This means that the IRB actually includes two kinds of normal regions: the normal regions originally belonging to the test image, and the normal regions intended to replace the lesion regions of the test image. In this section, a computational approach is proposed for learning the IRB F_B (i.e., its two kinds of normal regions) from F_BL and the prior Ψ_B. In this approach, a multi-scale sparse coding based learning (MSSCL) algorithm is first proposed to learn the IRB. Then, a repeated learning strategy is applied to improve the accuracy of the IRB and to suppress lesions in it. As a result, a high-accuracy IRB can be learned. The flowchart of this computational approach is shown in Fig. 6.
Figure 6: The flowchart of the proposed computational approach. First, the training set Ψ_B and the test image F_BL are input. Second, a dictionary D is learned from Ψ_B, and a rough background space Θ_rough is constructed by sparsely coding F_BL with D in a multi-scale way. Third, the initial IRB F_U and the sparse image F_V (in which large salient lesions are highlighted) are learned from F_BL based on its low-rank decomposition over Θ_rough. Fourth, F_BL is modified as F_BL1 by suppressing the known large salient lesions, and F_BL1 is sparsely coded in a multi-scale way again; consequently, a refined background space Θ_Fine is constructed. Finally, a fine IRB F_P and the refined sparse image F_Q (in which various abnormal lesions are highlighted) are learned from F_BL based on its low-rank decomposition over Θ_Fine. Here, the second and third steps constitute the MSSCL algorithm, and the last two steps constitute the repeated learning strategy.
4.1. The MSSCL algorithm for learning the IRB
4.1.1. Sparse coding of the test image and its encoding properties
In this section, a dictionary is learned from the set Ψ_B, and the sparse coding properties of F_BL under this dictionary are discussed. All retinal backgrounds in Ψ_B are vectorized, normalized as unit column vectors, and concatenated to form a low-rank matrix M. The dictionary D = (d_1, d_2, ..., d_k) can be learned from M by solving the following matrix factorization problem [27][28]:

$$\min_{D \in C,\, \alpha_i \in \mathbb{R}^k} \ \frac{1}{n}\sum_{i=1}^{n} \left( \frac{1}{2}\|x_i - D\alpha_i\|_2^2 + \beta\|\alpha_i\|_1 \right) \qquad (2)$$
where d_j is an atom of the dictionary D, k is the number of atoms, C = {D ∈ R^{m×k} s.t. ||d_j||_2^2 ≤ 1}, β is a sparsity regularization parameter, α_i is a sparse coefficient vector, and x_i is a column vector of M. In this paper, a very small β and a large k (β = 0.001 and k = 300) are chosen so as to learn a dictionary with strong representation ability [29, 30].
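A minimal sketch of this dictionary learning step, using scikit-learn's online dictionary learner as a stand-in for the solvers cited as [27][28]; the matrix X below holds one vectorized normal background per row (the transpose of M), and the scaling of the sparsity weight may differ slightly from Eq. (2).

```python
from sklearn.decomposition import MiniBatchDictionaryLearning

def learn_dictionary(X, k=300, beta=0.001):
    """Learn k atoms from the normal backgrounds in X (one vectorized,
    vessel-suppressed image per row), approximately solving Eq. (2).
    Returns D with one atom d_j per row."""
    learner = MiniBatchDictionaryLearning(n_components=k, alpha=beta,
                                          random_state=0)
    learner.fit(X)
    return learner.components_
```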
After the dictionary D has been learned from Ψ_B, the sparse coding properties of F_BL under this dictionary are studied. Here, the sparse coding of F_BL refers to the sparse reconstruction of F_BL by a sparse combination of atoms in D. The sparse coding coefficients can be derived by optimizing [29]:

$$\min_{\alpha \in \mathbb{R}^k} \ \frac{1}{2}\|F_{BL} - D\alpha\|_2^2 + \lambda\|\alpha\|_1 \qquad (3)$$
where λ is the sparsity regularization parameter and α is the optimal sparse coding coefficient vector. Suppose that the optimal sparse coding coefficients are α = (w_1, w_2, w_3, ..., w_k); then F_BL can be sparsely coded (or approximated) by the following equation:

$$F_{BH} = \sum_{i=1}^{k} w_i \cdot d_i \qquad (4)$$

where d_i is the i-th atom of D, i = 1, ..., k.
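The following sketch computes one F_BH with scikit-learn's `sparse_encode`; note that its Lasso solvers scale the penalty internally, so the effective λ is not numerically identical to that of Eq. (3).

```python
from sklearn.decomposition import sparse_encode

def encode_background(f_bl, D, lam):
    """Sparse-code the vectorized test image f_bl over the dictionary
    D (one atom per row) at sparsity level `lam`, then return the
    reconstruction F_BH of Eq. (4)."""
    alpha = sparse_encode(f_bl[None, :], D,
                          algorithm='lasso_lars', alpha=lam)
    return (alpha @ D).ravel()  # F_BH = sum_i w_i * d_i
```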
The sparsity regularization parameter λ controls which contents of F_BL are encoded in F_BH, and its different settings yield different rough backgrounds F_BH of F_B. The following properties hold:

(i) When λ takes a large value, many image details, including lesion regions, are suppressed from F_BH, but F_BH has a low approximation accuracy to the IRB.

(ii) When λ takes a small value, F_BH has a high approximation accuracy to the IRB and contains many image details of F_BL, but many lesion regions will also appear in F_BH. In particular, large salient lesions can diffuse severely in F_BH.

(iii) As λ decreases, the normal regions originally belonging to F_BL are reconstructed in F_BH with gradually increasing accuracy, while lesion regions gradually appear in F_BH; as λ increases, lesion regions are gradually suppressed in F_BH, while the normal regions originally belonging to F_BL are reconstructed with gradually increasing error.

These encoding properties are demonstrated in Fig. 7. Thus, for any given λ, the corresponding F_BH cannot represent the IRB F_B well and is only a rough background. All the different rough backgrounds, i.e., the set Θ = {F_BH : λ ∈ (0, ∞)}, however, together provide complementary and complete information for IRB learning. For example, in Θ, the information for learning the two kinds of normal regions of the IRB is implicitly contained in different rough backgrounds (with small λ and with large λ, respectively). Therefore, Θ will be used as the background space for learning the IRB, from which the two kinds of normal regions of the IRB will be learned.

Figure 7: Sparse coding results of F_BL for different λ. (a) two different F_BL. (b)-(d) all lesions can be suppressed in F_BH when λ takes large values. (e)-(h) large salient lesions can gradually diffuse into large regions (in red circles) when λ takes small values.
4.1.2. IRB learning with the MSSCL algorithm
In this section, we introduce how the IRB is learned from the background space Θ. First, a discrete approximation of Θ (i.e., a discrete background space) is selected for the computation. For this purpose, we need to select a series of specific values ε_1 > ε_2 > ... > ε_m for λ so that the corresponding F_BH construct a suitable discrete background space. We note that, when the ε_i (i = 1, 2, ..., m) take different values, the corresponding discrete background spaces may represent different priors. For example:

(i) If most of the ε_i take large values, then the discrete background space will be formed by many low-accuracy rough backgrounds, from which the normal regions originally belonging to the test image cannot be learned with high accuracy.

(ii) If most of the ε_i take small values, then the discrete background space will be formed by many high-accuracy rough backgrounds in which salient lesions are not suppressed and diffuse widely. From these, the normal regions intended to replace lesion regions cannot be learned well, and the normal regions around salient lesions might be incorrectly detected as abnormal.
Thus, in this paper, we select discrete values for ε_i according to the following equation:

$$\varepsilon_i = \frac{1}{2^{\,i-2}}, \quad i = 1, 2, \ldots, m \qquad (5)$$
By selecting m (in this paper, m = 8) different values from equation (5) for λ, m different rough backgrounds (denoted by B_1, B_2, ..., B_m, respectively) are sparsely reconstructed from F_BL. The values selected in equation (5) have the following merits: (i) they ensure that B_1, B_2, ..., B_m are sparsely coded in a multi-scale way, providing a compact and efficient discrete representation of Θ; (ii) by suitably covering both large and small values, B_1, B_2, ..., B_m not only strike a good balance between suppressing lesion regions and learning high-accuracy normal regions, but also provide efficient information for learning the two kinds of normal regions of the IRB.
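For illustration, the dyadic schedule of equation (5) and the resulting multi-scale coding pass can be written as the following sketch, reusing the illustrative `encode_background` helper from above.

```python
def multiscale_backgrounds(f_bl, D, m=8):
    """Rough backgrounds B_1..B_m obtained by sweeping the sparsity
    parameter over the schedule of Eq. (5):
    eps_i = 1 / 2**(i - 2), i.e. 2, 1, 1/2, ..., 1/64 for m = 8."""
    return [encode_background(f_bl, D, 1.0 / 2 ** (i - 2))
            for i in range(1, m + 1)]
```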
Usually, intensity differences exist between the same normal regions in B_i and F_BL. Thus, we compute n (in this paper, n = 6) additional images, denoted by the set A = {A_1, A_2, ..., A_n}, by linearly interpolating between F_BL and a specific B_i with few lesions (heuristically, the B_i corresponding to λ = 1/4 is used). The n + m images Θ_rough = {B_i, A_j : i = 1, 2, ..., m, j = 1, 2, ..., n} are used to implicitly form a background space from which the IRB will be learned. The interpolation operation ensures that the intensity of normal regions changes smoothly across the image series formed by all B_i, A_j and F_BL [31].
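A sketch of this interpolation step; the evenly spaced blend weights are our assumption, since the paper does not state the exact spacing.

```python
import numpy as np

def interpolation_images(f_bl, b_ref, n=6):
    """The set A = {A_1..A_n}: linear blends between the test image
    f_bl and a rough background b_ref with few lesions (heuristically
    the one coded at lambda = 1/4)."""
    weights = np.linspace(0.0, 1.0, n + 2)[1:-1]  # n interior weights
    return [(1.0 - t) * f_bl + t * b_ref for t in weights]
```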
The m + n + 1 images B_i, A_j, F_BL are vectorized and concatenated to form a low-rank matrix M_L. While the salient lesions in F_BL correspond to sparse structures in M_L, the IRB can be approximately regarded as the low-rank structure in M_L. M_L can thus be modelled as the sum of two matrices, M_L = U + V, where U is the low-rank background component and V is the sparse component. U and V can be computed by solving the following optimization problem:

$$\arg\min_{U,V}\ \operatorname{rank}(U) + \alpha \cdot \|V\|_1, \quad \text{s.t.}\ M_L = U + V \qquad (6)$$

where α is the sparse regularization parameter. Robust principal component analysis (RPCA) can be used to compute U and V approximately from equation (6) [32, 33]. RPCA takes the normal regions with smooth transitions across the different images as the background components. As a result of this sparse decomposition, F_BL is divided into two images: the background component image F_U and the sparse image F_V.
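A compact sketch of one standard RPCA solver, principal component pursuit via the inexact augmented Lagrange multiplier method, which relaxes the rank in Eq. (6) to the nuclear norm. The paper cites [32, 33] for its solver; the default λ = 1/√max(m, n) below is the usual textbook choice, not a value from the paper.

```python
import numpy as np

def rpca(M, lam=None, tol=1e-7, max_iter=500):
    """Solve min ||U||_* + lam * ||V||_1  s.t.  M = U + V, the convex
    surrogate of Eq. (6), by inexact ALM. Returns (U, V)."""
    m, n = M.shape
    lam = lam or 1.0 / np.sqrt(max(m, n))
    mu = 0.25 * m * n / (np.abs(M).sum() + 1e-12)
    Y = np.zeros_like(M)
    U = np.zeros_like(M)
    V = np.zeros_like(M)
    for _ in range(max_iter):
        # Singular-value thresholding updates the low-rank part U.
        Uw, s, Vt = np.linalg.svd(M - V + Y / mu, full_matrices=False)
        U = (Uw * np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        # Soft thresholding (shrinkage) updates the sparse part V.
        R = M - U + Y / mu
        V = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0.0)
        Y += mu * (M - U - V)
        if np.linalg.norm(M - U - V) <= tol * np.linalg.norm(M):
            break
    return U, V
```

In the paper's construction, F_BL is one column of M_L, so F_U and F_V are read off as the corresponding columns of U and V.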
Essentially, F_U is learned from Θ_rough by low-rank approximation; namely, the two kinds of normal regions of the IRB are implicitly learned from their corresponding regions in Θ_rough by the low-rank computation. Thus, the normal regions in F_BL can be well reconstructed in F_U owing to their similarity to their corresponding regions in Θ_rough, and the normal regions intended to replace lesion regions are reconstructed in F_U as a specific average of their corresponding regions in Θ_rough. At the same time, salient lesions and many other lesions in F_BL are separated into and highlighted in F_V, and can be easily detected.

However, large salient lesions in F_BL can be encoded in, and diffuse into, some rough backgrounds B_i with small λ, as shown in Fig. 6. As a result, F_U can encode certain information of the salient lesions. Thus, some false positives, caused by the diffusion effects of salient lesions and by the individual differences of some normal regions, will exist in F_V, and weak lesions usually cannot be well distinguished from them in F_V. A further refinement of the set B_1, B_2, ..., B_m is therefore needed to exclude the encoded information of salient lesions.
4.2. Repeated learning strategy for improving the IRB
In this section, F_BL is modified into F_BL1 by suppressing the detected salient lesions, and the IRB is then learned again from F_BL1 by the MSSCL algorithm. With this repeated learning strategy, the accuracy of the IRB can be further improved and lesions can be well suppressed from it.

Salient lesion removal from F_BL – Salient lesions and some other lesion regions are first segmented from F_V by thresholding with a threshold value T, followed by hole-filling. Here, T can take any value in the range [12, 26], as shown in the heuristic study of Fig. 16(b); pixels whose values are larger than T are marked as salient lesions. For large regions, the blank areas are filled with the mean intensity value of their corresponding regions over all background images in Ψ_B. For small regions, the blank areas are filled with the intensity values around the regions using the algorithm in [34]. By such processing, a modified F_BL (i.e., F_BL1) without salient lesions is generated, as shown in Fig. 8(b).
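A minimal sketch of this removal step, assuming 2-D image arrays; the mean-background fill is applied to every masked pixel here, whereas the paper switches to local inpainting [34] for small regions.

```python
import numpy as np
from scipy import ndimage

def salient_lesion_mask(F_V, T=20):
    """Threshold the sparse image F_V at T (any value in [12, 26]
    works, per the robustness study) and fill interior holes so each
    salient lesion becomes one solid region."""
    return ndimage.binary_fill_holes(F_V > T)

def fill_salient_lesions(f_bl, mask, mean_background):
    """Build F_BL1: replace masked pixels with the pixelwise mean of
    the normal backgrounds in Psi_B (a simplification of the paper's
    large/small-region treatment)."""
    out = f_bl.copy()
    out[mask] = mean_background[mask]
    return out
```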
Repeated learning of the IRB – The rough backgrounds B_1, B_2, ..., B_m are updated by learning them from F_BL1, and the additional image set A = {A_1, A_2, ..., A_n} is correspondingly recalculated. The updated set B = {B_1, B_2, ..., B_m} now well excludes the encoded information of large salient lesions. The n + m images Θ_Fine = {B_i, A_j : i = 1, 2, ..., m, j = 1, 2, ..., n} implicitly form a refined background space, differing from Θ_rough, which is used as the discrete background space for learning a fine IRB from F_BL. All B_i, A_j and F_BL are vectorized and concatenated to form a low-rank matrix (denoted by M_BF). The various abnormal regions in F_BL, such as salient lesions, weak lesions and other abnormal regions, are all sparse structures in M_BF.

M_BF can be decomposed into a low-rank background component and a sparse component by RPCA. Accordingly, F_BL is now divided into two new images: the background component image F_P and the sparse image F_Q. F_P and F_Q are updated versions of F_U and F_V. Compared with F_U, F_P not only contains fewer lesion regions but also approximates the normal regions with higher accuracy. F_P is a fine IRB, and F_Q highlights the various abnormal regions in F_BL, as shown in Fig. 8(c).

Figure 8: (a) F_BL. (b) F_BL1, in which salient lesion regions are suppressed. (c) F_Q, in which various abnormal regions are well highlighted.
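Putting the pieces together, the overall procedure of Fig. 6 can be sketched as follows, reusing the illustrative helpers defined earlier. Reshaping between the vectorized form used by the coding and RPCA steps and the 2-D form used by the morphological steps is elided, and the index B[3] follows the schedule of equation (5), where ε_4 = 1/4.

```python
import numpy as np

def detect_abnormalities(f_bl, D, mean_background, T=20):
    """End-to-end sketch of the approach in Fig. 6: one MSSCL pass,
    salient-lesion suppression, then a second MSSCL pass."""
    # First pass: rough background space Theta_rough -> RPCA.
    B = multiscale_backgrounds(f_bl, D)
    A = interpolation_images(f_bl, B[3])          # B_4: lambda = 1/4
    M_L = np.stack(B + A + [f_bl], axis=1)        # images as columns
    U, V = rpca(M_L)
    F_V = V[:, -1]                                # sparse part of F_BL
    # Suppress the detected salient lesions and repeat the learning.
    mask = salient_lesion_mask(F_V, T)
    f_bl1 = fill_salient_lesions(f_bl, mask, mean_background)
    B = multiscale_backgrounds(f_bl1, D)
    A = interpolation_images(f_bl1, B[3])
    M_BF = np.stack(B + A + [f_bl], axis=1)       # F_BL itself is kept
    U, V = rpca(M_BF)
    return U[:, -1], V[:, -1]                     # fine IRB F_P and F_Q
```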
5. Results and Comparisons
5.1. Data Sets and Evaluation Methods
The proposed method is quantitatively evaluated on two different data sets. The first data set (denoted by H) consists of 190 retinal images, originally from an open data set on Kaggle, in which various lesions have been manually labeled. H contains 17420 lesion regions and more than 26 different types of lesions (such as choroidal old lesions, macular degeneration, drusen, crystalline retinopathy, acute retinal necrosis, vitreous degeneration, choroidal inflammation, retinal arterial macroaneurysms, choroidal pigment, choroidal pigmented mole, panretinal photocoagulation spots and unfamiliar lesions). The retinal images in H contain different types of lesions exhibiting diverse visual properties (as shown in Figs. 1, 11(a)-(r) and 12(a)-(c)); the number of lesions per image varies from 1 to 585, the size of lesions ranges from 2 to 1831 pixels, and both salient and weak lesions are present, as illustrated in Figs. 9(a)-(d). Hence, the retinal images in H are very representative and can be used to assess the generalization of the proposed method.

Figure 9: Statistical distributions of different properties of retinal lesions in the data set H. (a) Size of different lesions. (b) Number of lesions in different images. (c) Gray values of different lesions. (d) Contrasts of different lesions with regard to their surrounding regions.

The second data set (denoted by MM) comprises 110 manually labeled retinal images with multiple types of lesions, originally from the large open data set Messidor. MM contains more than 7500 lesion regions and six different types of lesions. However, most of the lesions in MM are diabetes-related and have small areas.

Let TP, FP, TN, FN represent the numbers of true positives, false positives, true negatives and false negatives, respectively. Then, sensitivity (SE), specificity (SP), precision (PR), recall (RC), true positive rate (TPR), false positive rate (FPR) and dice score (DSC) are computed as follows:

$$SE = \frac{TP}{TP+FN},\quad SP = \frac{TN}{TN+FP},\quad PR = \frac{TP}{TP+FP},\quad DSC = \frac{2\,TP}{2\,TP+FP+FN} \qquad (7)$$

$$TPR = RC = SE, \qquad FPR = 1 - SP$$
Because of partial volume effects in retinal images, the outer boundaries of many lesion regions are fuzzy and cannot be well determined; they might be missed in the detection or even in the manual labeling. To compensate for this, we compute TP, FP, TN, FN as in [35], where a detected pixel adjacent to a manually labeled pixel is considered a true positive. Thus, the manually labeled lesion regions are expanded spatially by a radius of two pixels, and the new pixels are still considered true positives; meanwhile, the identified abnormal regions are also expanded spatially by a radius of two pixels, and the new pixels are still considered abnormal. TP, FP, TN, FN are computed based on the expanded regions.
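A sketch of this tolerant evaluation with SciPy, assuming binary 2-D masks; the exact pixel-counting convention of [35] may differ in detail.

```python
import numpy as np
from scipy import ndimage

def tolerant_counts(pred, gt):
    """TP/FP/TN/FN with the two-pixel tolerance described above: both
    the detection and the manual labels are dilated by radius 2
    before the pixelwise comparison."""
    pred_d = ndimage.binary_dilation(pred, iterations=2)
    gt_d = ndimage.binary_dilation(gt, iterations=2)
    tp = np.sum(pred & gt_d)      # detections near a labeled pixel
    fp = np.sum(pred & ~gt_d)
    fn = np.sum(gt & ~pred_d)     # labels missed despite the tolerance
    tn = pred.size - tp - fp - fn # remainder, an approximation
    return tp, fp, tn, fn

def dice(tp, fp, fn):
    """DSC of Eq. (7)."""
    return 2.0 * tp / (2.0 * tp + fp + fn)
```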
Figure 10: (a)-(b) ROC curve and PR curve of the proposed method. They are computed based on retinal abnormality detection results in the data sets H and MM.
5.2. Results
In the experiments, the three parameters m, n and the threshold T take the following constant values: m = 8, n = 6 and T = 20. Section 6.3 will show that this selection is robust.
Figure 11: (a)-(r) different retinal images containing various lesions, where different lesions may have different shapes, sizes, colors, textures and positions. (a1)-(r1) retinal abnormality detection results from the different retinal images in (a)-(r) by the proposed method.
1) Quantitative evaluation against the ground truth: The proposed method is quantitatively evaluated on the data sets H and MM. More than 26 different types of lesions are identified from H, and 6 different types of lesions are identified from MM. Additionally, the ROC curve and the precision-recall (PR) curve for retinal abnormality detection are computed over H and MM, respectively, and shown in Fig. 10. Over the two data sets, the proposed method achieves high AUC values (the area under the ROC curve) of 0.9921 and 0.9971, and high MAP values (the area under the PR curve) of 0.8694 and 0.8688. The quantitative analysis in Fig. 10 shows that the proposed method is effective for detecting a variety of lesions from retinal images, even though the types and number of lesions in these images are unknown and different lesions exhibit diverse visual features. Dice scores are also computed over H and MM for evaluating retinal abnormality segmentation; they are 0.80 and 0.81, respectively. Retinal lesions usually have small areas, so small segmentation errors can lead to large variations in dice scores.
2) Some typical examples: The proposed method is applied to various retinal images to test its effectiveness and feasibility. Since abnormal regions in retinal images are complex, experimental results are provided for three different cases.

Different retinal images may contain different types of lesions, and different lesions usually have distinct visual features (e.g., shapes, sizes, colors, textures and positions), as shown in Figs. 1 and 11(a)-(r). With the proposed method, however, different types of lesions can be well detected from different retinal images, as illustrated in Figs. 11(a1)-(r1). In particular, even when retinal images contain a large number of small lesions (see Figs. 11(d)(g)(h)(m)(r)), these small lesions can also be well detected, as shown in Figs. 11(d1)(g1)(h1)(m1)(r1).

Some retinal images may contain both large and small lesions, as shown in Figs. 11(b)(f)(h)(i)(j)(n)(p)(r). With the proposed method, both can be well detected despite their significantly different sizes, as illustrated in Figs. 11(b1)(f1)(h1)(i1)(j1)(n1)(p1)(r1).

A retinal image may contain multiple types of lesions with different visual features, as shown in Figs. 12(a)-(c), and the types and number of lesions are usually unknown in advance. With the proposed method, different types of lesions can be well detected from a single retinal image even though these lesions have significantly different visual features, as illustrated in Figs. 12(a1)-(c1), where 6 different types of lesions are detected from each retinal image.
5.3. Comparisons
5.3.1. Quantitative comparisons with related methods
Based on the two data sets H and MM, the proposed method is quantitatively compared with two popular CAD methods for retinal lesion detection and four methods that can be applied to detect retinal abnormality. These six methods are: the U-net deep learning network in [6] (UNET2015) and the fast convolutional neural network in [5] (TMI2016), which are trained and then tested on each data set with 5-fold cross validation; the method in [4] (MICCAI2013), which reconstructs patches of F with a dictionary learned from small patches of Ψ and detects retinal lesions by reconstruction errors; the average model based method in [7] (CMIG2013), which detects retinal lesions by computing the residual between F_BL and the average model computed from Ψ_B; the method in [8] (NEUCOM2018), which reconstructs the retinal background by orthogonal basis vectors learned from Ψ_B by PCA analysis; and the method in [9] (TMI2019), which computes the retinal background by weakly supervised learning.

Figure 12: (a)-(c) three retinal images with 6 types of different lesions (dot haemorrhages, block haemorrhages, microaneurysms, dot exudates, block exudates, cotton wool spots). (a1)-(c1) retinal abnormality detection results from each retinal image by the proposed method.
The seven methods are evaluated quantitatively on the two data sets H and MM, and their ROC curves, precision-recall curves, AUC values and MAP values are shown in Fig. 13 and Fig. 14, respectively. Fig. 13 demonstrates that the proposed method significantly outperforms the other six methods when the seven methods are used to detect various lesions from the retinal images in H. Fig. 14 illustrates that, on the data set MM, the proposed method also performs better than the five methods in [4, 5, 6, 7, 8]. The proposed method and the method in [9] show their respective advantages on different sets; however, the proposed method has stable performance over the two different sets H and MM, and performs better than the TMI2019 method on the more complex data set, such as H.

Figure 13: (a)-(b) ROC curves and PR curves of the 7 different methods (note that MICCAI2013 [4] has 5 different cases with different patch sizes). They are generated based on detection results of various abnormal regions from the same data set H. (c) AUC and MAP values of the 7 different methods, computed from the ROC and PR curves in (a)-(b). The MAP values are sorted as follows: 0.8694 (our algorithm), 0.8394 (TMI2019 [9]), 0.8221 (UNET2015 [6]), 0.6519 (NEUCOM2018 [8]), 0.6216 (TMI2016 [5]), 0.4295 (CMIG2013 [7]), and 0.1264, 0.1267, 0.1358, 0.1277, 0.1365 (MICCAI2013 [4]), respectively.
The method in [7] actually provides the same retinal background for different retinal test images. In [4, 8], orthogonal basis vectors or a dictionary are learned from Ψ_B or from its small patches, and the encoded result of the test image, which is similar to F_BH, is approximately regarded as the IRB. In [9], both lesions and normal individual differences can be regarded as abnormal regions owing to their random characteristics, and the learned IRB may therefore be over-smoothed. In this paper, by contrast, the IRB is finely learned, so it can have high accuracy and contain almost no lesions.
5.3.2. Quantitative comparisons with background image learning methods
After the preprocessing in this paper, some background image learning methods developed for video image analysis or time-series image analysis might also be used for retinal background learning from Ψ_B. Thus, the proposed approach is further compared with three recently developed background image learning methods [36, 37, 38] (ICML2014 [36], AAAI2013 [37], ICCV2015 [38]). Based on the data set H, we quantitatively evaluate the four methods; their ROC curves, precision-recall curves, AUC values and MAP values are shown in Fig. 15. Fig. 15 shows that the proposed algorithm performs better in retinal background learning than the other three methods.

Figure 14: (a)-(b) ROC curves and PR curves of the 7 different methods (note that MICCAI2013 [4] has 5 different cases with different patch sizes). They are generated based on detection results of various abnormal regions from the same data set MM. (c) AUC and MAP values of the 7 different methods, computed from the ROC and PR curves in (a)-(b). The MAP values are sorted as follows: 0.9091 (TMI2019 [9]), 0.8688 (our algorithm), 0.8676 (UNET2015 [6]), 0.8459 (NEUCOM2018 [8]), 0.6947 (TMI2016 [5]), 0.4669 (CMIG2013 [7]), and 0.0831, 0.0683, 0.0786, 0.0727, 0.0855 (MICCAI2013 [4]), respectively.

Figure 15: (a)-(b) ROC curves and PR curves of the four different methods. They are generated based on detection results of various abnormal regions from the same data set H. The four MAP values are sorted as follows: 0.8694 (our algorithm), 0.8379 (ICML2014 [36]), 0.8068 (AAAI2013 [37]), 0.2827 (ICCV2015 [38]), respectively.

5.3.3. Quantitative comparisons of robustness over subsets of H

Table 1: Means and standard deviations of 10 different methods over different subsets of the data set H.

Method                 Mean AUC   Std AUC   Mean MAP   Std MAP
TMI2019 [9]            0.9912     0.0037    0.8429     0.0449
UNET2015 [6]           0.9874     0.0041    0.8237     0.0478
NEUCOM2018 [8]         0.9794     0.0039    0.6241     0.0383
TMI2016 [5]            0.9172     0.0185    0.6291     0.0796
CMIG2013 [7]           0.8626     0.0385    0.4348     0.0789
MICCAI2013 (4×4) [4]   0.8140     0.0112    0.1357     0.0300
ICML2014 [36]          0.9896     0.0039    0.8260     0.0508
AAAI2013 [37]          0.9867     0.0042    0.7977     0.0459
ICCV2015 [38]          0.9390     0.0085    0.2903     0.0216
Our algorithm          0.9918     0.0028    0.8711     0.0155
We also evaluate the detection ability and robustness of the above 10 different methods over different subsets of the data set H. A 5-fold cross validation is performed for the two deep learning methods, UNET2015 and TMI2016. For the other eight methods (the proposed approach, TMI2019, NEUCOM2018, CMIG2013, MICCAI2013, ICML2014, AAAI2013, and ICCV2015), the data set H is divided into five test subsets, and the AUC and MAP values on each test subset are calculated. The means and standard deviations of the AUC and MAP for the 10 methods over the five test subsets are shown in Table 1. The table shows that, compared with the other 9 methods, the proposed approach not only has a high mean MAP value but is also robust over different subsets of H.
6. Discussions
6.1. On the preprocessing
The preprocessing applied to all retinal images not only suppresses retinal imaging bias and the individual differences of blood vessels, but also enables comparison of similar anatomical regions across different retinal images. The importance of the preprocessing has been quantitatively evaluated on the data set H. Fig. 16(c) shows that, with the removal of blood vessels, the MAP value of the proposed approach for learning the IRB is enhanced from 0.6025 to 0.8694. Fig. 13(b) (MICCAI2013) and Fig. 16(a) (the different-λ cases) show that, by using the preprocessing in this paper and learning the dictionary from images in Ψ_B rather than from their small patches, the reconstruction-error based technique enhances its MAP value from 0.1358 to 0.7023. Hence, the preprocessing plays a necessary and important role in enhancing retinal abnormality detection.
Figure 16: (a) ROC and Precision-Recall (PR) curves in different cases. Our algorithm: by the MSSCL algorithm plus the repeated learning strategy; Single Series: by the MSSCL algorithm only; λ = i: by FBH with λ = i. (b) ROC and PR curves corresponding to different thresholds T . (c) ROC and PR curves in three different preprocessing cases: without color normalization (CN) and blood vessel suppression (BVS), with only CN, and with both CN and BVS.
6.2. On the MSSCL algorithm and the repeated learning strategy
F_BH, F_U and F_P can all be regarded as approximations of the IRB. In what follows, based on quantitative evaluations on the data set H, we show their differences and explain the necessity and importance of applying the MSSCL algorithm and the repeated learning strategy.

For any single λ, the corresponding F_BH represents only one rough background: it either has a low approximation accuracy, or has a high accuracy but contains many lesion regions. Thus, diverse retinal abnormalities cannot be well detected by the residual (or reconstruction error) |F_BH − F_BL| no matter what value is selected for λ. This is illustrated in Fig. 16(a), which shows that, for any value of λ (such as 2, 1, 1/2, 1/4, 1/8, 1/16, 1/32, 1/64), the precision-recall curves of the residuals |F_BH − F_BL| all have comparatively small MAP values, all below 0.7023.

The multi-scale sparse coding results of F_BL, i.e., Θ_rough, however, together provide complementary information for learning the IRB. They not only contain high-accuracy normal regions of F_BL, but also provide possible normal regions to replace lesion regions. Thus, compared with any single rough background F_BH, the IRB learned by the MSSCL algorithm has a much higher approximation accuracy and contains far fewer lesion regions. This is demonstrated in Fig. 16(a): by the MSSCL algorithm, the MAP value is enhanced to 0.8391, far better than that of any F_BH.

Salient lesions in F_BL diffuse into some rough backgrounds with small λ, and therefore disturb the IRB learned by the MSSCL algorithm. By using the repeated learning strategy, i.e., by repeatedly learning the IRB from the modified F_BL in which the known salient lesions have been suppressed, the effects of salient lesions can be well suppressed from the learned IRB, and its approximation accuracy can be further improved. In this case, more weak lesions can be detected and more weak false positives can be suppressed. This is demonstrated in Fig. 16(a): with the repeated learning strategy, the MAP value is further enhanced to 0.8694.
Figure 17: (a)-(b) ROC curves and PR curves of the proposed method, corresponding to different values of the parameter n.
6.3. On the robustness of parameters
495
In the proposed method, there are three important parameters: T , m, n. Based on the
496
set H, we quantitatively evaluate the effects of their different selections on the lesion detection 26
497
results. A single control variable method is used, in which only one of the three parameters T ,
498
m, n is changed, and the other two parameters remain fixed. Here, the fixed parameters will
499
select the constant values: m = 8, n = 6 and T = 20. In Fig. 16(b), Fig. 17 and Fig. 18,
500
the precision-recall curves corresponding to different thresholds (10, 12, 14, 16, 18, 20, 22, 24, 26),
501
different m (6, 7, 8, 9, 10) and different n (2, 4, 6, 8, 10) are shown, respectively. The heuristic
502
study in Fig. 16(b), Fig. 17 and Fig. 18 shows that, when a parameter is arbitrarily given
503
different values from a comparatively large range, the final AUC and MAP values remain nearly
504
unchanged at different values, showing the robustness of the three parameters. Therefore, we
505
fix T = 20, m = 8, n = 6 to segment lesions and other abnormal regions from all retinal images.
Figure 18: (a)-(b) ROC curves and PR curves of the proposed method, corresponding to different values of the parameter m.
7. Conclusion
Detecting a diverse range of abnormal regions from retinal images is a key step towards automatic screening of retinal diseases, yet it is a challenging task because (1) a retinal image may contain different lesions with a variety of shapes, sizes, colors and textural patterns; and (2) the number and types of retinal lesions in a retinal image are generally unknown in advance.

This paper addresses these challenges with a novel and effective method. Our method learns the individualized retinal background (IRB), i.e., the characteristics of normality, for the detection and separation of abnormal regions of different types and numbers. This is fundamentally different from current disease-specific CAD techniques, which detect a pre-defined type of lesion by learning the characteristics of that type of abnormality from lesion samples or lesion features.

The IRB is learned from normal retinal images by a novel multi-scale sparse coding based learning (MSSCL) algorithm applied to the retinal test image. From this multi-scale sparse encoding with different levels of sparse regularization, an adaptive dictionary is subsequently constructed to embrace different levels of background detail, forming a unified background space that contains more complete information for IRB learning. Low-rankness is inherently contained in this background space, and the IRB can be well learned from it by RPCA. The IRB learned by MSSCL can thus achieve an accuracy far higher than the usual approach of learning the IRB by directly sparsely encoding retinal images. Some of the multi-scale sparse coding results, however, can partially encode information of salient lesions. Therefore, MSSCL is repeatedly applied to the modified test image in which salient lesions are suppressed, so as to further refine the multi-scale sparse coding and the learned IRB.

The experiments in this paper demonstrate that it is feasible to detect diverse retinal abnormal regions from a retinal image through the refined learning of its IRB from normal retinal images. Our further research will continue to distinguish weak lesions from normal individual differences in retinal images, and to learn the patterns of the different types of detected retinal abnormal regions so as to provide a unified learning framework for the diagnosis of a variety of lesions. The main ideas in this paper might serve as a reference for, or even be directly effective in, detecting various abnormal regions in other medical images; thus, we will also explore the possibility of extending the proposed method to other medical images.
Acknowledgment
This work was supported in part by the NSFC of China (61375020, 61572317), the Shanghai Intelligent Medicine Project (2018ZHYL0217), and the SJTU Translational Medicine Cross Research Fund (ZH2018QNA05).
Benzhi Chen received the B.S. degree in Mechanical Manufacturing and Automation from Xiangfan University in 2010, the M.S. degree in Mechanical Engineering from Hunan University in 2013, and the Ph.D. degree in Image Processing and Pattern Recognition from Shanghai Jiao Tong University in 2018. His main interests include medical image analysis, computer vision, and machine learning.

Lisheng Wang received the M.S. degree in mathematics and the Ph.D. degree in electronic and information engineering from Xi'an Jiaotong University, China, in 1993 and 1999, respectively. In 2003, he joined the Department of Automation, Shanghai Jiao Tong University, China, where he is now a Professor. His research interests include the analysis and visualization of 3D biomedical images, computer-aided imaging diagnosis, and surgery planning.

Xiuying Wang received the Ph.D. degree in computer science from The University of Sydney, Camperdown, NSW, Australia. She is currently an Associate Professor and Associate Director of the Multimedia Lab, School of Information Technologies, The University of Sydney. Her research interests include biomedical data computing and visual analytics, and biomedical image registration, identification, clustering, and segmentation.

Jian Sun received the B.S. degree from the University of Electronic Science and Technology of China in 2003 and the Ph.D. degree in applied mathematics from Xi'an Jiaotong University in 2009. He was a visiting student at Microsoft Research Asia from November 2005 to March 2008, a postdoctoral researcher at the University of Central Florida from August 2009 to April 2010, and a postdoctoral researcher in the WILLOW project team of École Normale Supérieure de Paris and INRIA from September 2012 to August 2014. He is now a Professor in the School of Mathematics and Statistics, Xi'an Jiaotong University. His current research interests are image processing, medical image analysis, and machine learning.

Yijie Huang received the B.S. degree in Automation from Huazhong University of Science and Technology in 2015. He is now a Ph.D. candidate in Image Processing and Pattern Recognition at Shanghai Jiao Tong University. His main interests include deep learning and medical image analysis.

David Feng received the Ph.D. degree in computer science from the University of California, Los Angeles, CA, USA, in 1988. He is currently the Director (Research) of the Institute of Biomedical Engineering & Technology, and an Academic Director of the USYD-SJTU Joint Research Alliance. He has been the Head of the School of Information Technologies, Faculty of Engineering and Information Technologies, and an Associate Dean of the Faculty of Science, The University of Sydney, Camperdown, NSW, Australia. He has been a Chair Professor, Advisory Professor, Guest Professor, Adjunct Professor, or Chief Scientist at different world-renowned universities and institutes. He is the Founder and Director of the Biomedical & Multimedia Information Technology Research Group at The University of Sydney. Dr. Feng has served as Chair or Editor of various committees and key journals. He has been elected a Fellow of the ACS (Australia), HKIE (Hong Kong), IET (UK), IEEE (USA), and the Australian Academy of Technological Sciences and Engineering.

Zongben Xu received the M.S. and Ph.D. degrees in mathematics from Xi'an Jiaotong University, China, in 1981 and 1987, respectively. He has been with Xi'an Jiaotong University since 1982, where he was promoted to Professor in 1991. He is an Academician of the Chinese Academy of Sciences. His current research interests include intelligent information processing, computational vision, and machine learning.