
Highlights

• A multi-scale sparse coding based learning algorithm is proposed for effectively learning the individualized retinal background;
• A repeated learning strategy is proposed for improving the accuracy of the individualized retinal background;
• A feasible approach is developed for detecting both salient and weak retinal lesions.

Abnormality Detection in Retinal Image by Individualized Background Learning


Benzhi Chen^a, Lisheng Wang^{a,∗}, Xiuying Wang^b, Jian Sun^c, Yijie Huang^a, Dagan Feng^b, Zongben Xu^c

^a Department of Automation, Shanghai Jiao Tong University, and the Key Laboratory of System Control and Information Processing, Ministry of Education, Shanghai 200240, P. R. China
^b School of Information Technologies, The University of Sydney, Sydney, Australia
^c School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an 710049, China

∗ Corresponding author: [email protected]

Abstract

Computer-aided lesion detection (CAD) techniques, which provide potential for automatic early screening of retinal pathologies, are widely studied in retinal image analysis. While many CAD approaches based on lesion samples or lesion features can well detect pre-defined lesion types, it remains challenging to detect various abnormal regions (namely abnormalities) from retinal images. In this paper, we try to identify diverse abnormalities from a retinal test image by finely learning its individualized retinal background (IRB), on which retinal lesions superimpose. 3150 normal retinal images are collected as the priors for IRB learning. A preprocessing step is applied to all retinal images for spatial, scale and color normalization. Retinal blood vessels, which have individual variations in different images, are particularly suppressed from all images. A multi-scale sparse coding based learning (MSSCL) algorithm and a repeated learning strategy are proposed for finely learning the IRB. By the MSSCL algorithm, a background space is constructed by sparsely encoding the test image in a multi-scale manner using the dictionary learned from normal retinal images, which will contain more complete IRB information than any single-scale coding result. From the background space, the IRB can be well learned by low-rank approximation, and thus different salient lesions can be separated and detected. The MSSCL algorithm will be iteratively repeated on the modified test image in which the detected salient lesions are suppressed, so as to further improve the accuracy of the IRB and suppress lesions in the IRB. Consequently, a high-accuracy IRB can be learned, and thus both salient lesions and weak lesions that have low contrasts with the background can be clearly separated. The effectiveness and contributions of the proposed method are validated by experiments over different clinical data sets and comparisons with state-of-the-art CAD methods.

Keywords: Retinal abnormality detection, Retinal lesion detection, Computer-aided detection, Dictionary learning, Background learning, Retinal image reading

1. Introduction

Retinal pathologies such as drusen, diabetic retinopathy and macular degeneration, either primary or associated with other diseases, age-related or occurring in young children, are the main cause of visual loss. With the wide availability of retinal imaging scanners, early screening becomes a universal approach to significantly reduce the risk of blindness [1].

To reduce the workload of medical experts and the costs incurred in routine screening, extensive efforts have been devoted over the past two decades to developing computer-aided lesion detection (CAD) for more efficient and automated lesion detection from large quantities of medical images [2, 3]. Either trained with sample images or based on the image features of specific types of lesions, current CAD techniques are inherently designed to be disease-customized. While a disease-customized CAD may provide an effective computational solution for detecting a predefined type of lesion from medical images, it often lacks the capacity to detect different types of lesions from the same medical image. From a clinical-diagnosis perspective, this is a serious flaw, and CAD techniques for detecting various types of lesions are needed. In this paper, we develop an effective technique to detect various types of abnormal regions from retinal images.

Automated detection of a diverse range of abnormal regions from retinal images is generally a challenging research initiative, due to the major reasons we outline below. As illustrated in Fig. 1: (1) different types of retinal lesions might be contained in a single retinal image, and they often display substantial variability in terms of shape, size, color, texture and location; (2) retinal blood vessel structures may exhibit significant differences across individuals, and there are also individual differences among the healthy areas at similar locations in different images; (3) images with similar features, albeit in different regions, can have different evaluations of abnormality; and (4) the number and the type of retinal lesions in a retinal image are generally unknown in advance, and different retinal images often have different imaging qualities. These factors pose significant challenges for a computational approach to detect and differentiate the diverse abnormal regions in retinal images.

Figure 1: A retinal image may contain multiple types of lesions ((a)-(e)), and different retinal images may contain different types of lesions ((f)-(j)). Different lesions can exhibit diverse properties in shapes, sizes, colors, and positions.

Experienced ophthalmologists, however, can visually detect various abnormal regions from retinal images even if they have a variety of different imaging features. In particular, they can detect types of abnormalities never seen before. This human intelligence has been built upon years of training and ample prior knowledge. In the training procedure, they observe a large number of normal retinal images and learn their characteristics. When reading a retinal image, ophthalmologists implicitly compare the case with their prior knowledge of any similar normal cases, and accordingly make a judgement. Inspired by such observations, we collected 3150 normal retinal images (denoted by Ψ) to serve as the prior knowledge of normal retinal images. We will detect various retinal abnormalities from retinal images based on such priors rather than lesion samples or lesion features.

Even between normal retinal images, significant individual differences can exist, such as in blood vessel structure. Additionally, different imaging conditions, image colors, image sizes and spatial positions of anatomical structures can contribute to large differences among retinal images, as shown in Fig. 2. In order to suppress bias caused by these factors, a preprocessing step is applied to all retinal images, normalizing image scales, image colors and spatial positions of anatomical structures. In particular, retinal blood vessels, which usually have large individual variations across retinal images, are suppressed from all images, and their regions are smoothly filled based on the colors of the surrounding region. In such a case, a retinal test image with abnormal regions can be regarded as the superposition of two different parts: abnormal regions and the normal retinal background, as illustrated in Fig. 3. Therefore, for a given retinal test image, retinal abnormality detection can be converted into the following problem: how can its corresponding normal retinal background be computed? Once we have the retinal background of a retinal test image, various retinal abnormal regions can be detected and separated from the image. In this paper, for each given retinal test image, we learn its high-accuracy individualized retinal background (IRB), to detect from the test image both salient lesions and weak lesions that have low contrasts with the IRB.

Figure 2: Normal retinal images from different persons. They may have different colors and scales, and are usually not aligned spatially. Blood vessels in them may also have great individual differences.

Figure 3: After removing blood vessels, a retinal image with lesions (i.e. (b)) can be regarded as the superposition of lesions (i.e. (d)) on its retinal background (i.e. (c)). (a) the original retinal image. (b) the retinal image without blood vessels. (c) the retinal background. (d) lesion regions.

We develop a novel computational approach to learn the IRB from the retinal test image and Ψ, in which a multi-scale sparse coding based learning (MSSCL) algorithm and a repeated learning strategy are proposed for the fine learning of the IRB. The MSSCL algorithm consists of two steps. First, a dictionary is learned from the normal retinal images Ψ, and the test image is sparsely encoded in a multi-scale way. Here, any single-scale coding result is only a rough approximation of the IRB, either containing no lesions but having a low approximation accuracy, or having a high accuracy but containing many lesions. The multi-scale coded results, however, together construct a background space that contains more complete information for IRB learning. Second, the IRB is learned from the background space by low-rank approximation, and thus different salient lesions are separated and detected from the retinal test image. Some images among the multi-scale coded results, however, may encode some information of salient lesions, and these lesions will be reflected in the IRB learned by the MSSCL algorithm. Thus, the repeated learning strategy is further applied, by which the MSSCL algorithm is iteratively repeated on the modified test image in which the detected salient lesions are suppressed, so as to further improve the accuracy of the IRB and suppress lesions in the IRB. Consequently, a high-accuracy IRB containing almost no lesions can be learned, and thus both salient lesions and weak lesions can be well separated from the retinal test image. The effectiveness and advantages of the proposed method are validated by extensive experiments and comparisons with state-of-the-art CAD methods, such as those in [4, 5, 6, 7, 8, 9].

The main contributions of this paper are as follows: (i) the MSSCL algorithm is proposed for effectively learning the IRB; (ii) a repeated learning strategy is proposed for improving the accuracy of the IRB; (iii) a feasible approach is developed for detecting both salient and weak retinal lesions.

2. Related works

116

The existing CAD techniques for retinal lesion detection can be mainly divided into either

117

feature-based or training-based methods. For the former, predefined types of lesions are de-

118

tected from retinal images by handcrafted features [10, 11, 12, 13, 14]. For instance, peaks of

119

directional cross-section profiles centered on local maximum pixels of the image, the statistical

120

measures of size, and height and shape of the peaks, are used as the feature set of a Bayesian

121

classifier for detecting retinal microaneurysms in [10]. In [11], five handcrafted features in-

122

cluding holistic texture features and the local retinal features are specifically applied to the

123

predefined types of lesions. For the latter, samples are often collected for the specific type of

6

124

retinal lesions, and then the lesion features are learned from these samples for detecting the

125

specific type of lesions from retinal images [5, 15, 16]. These methods are basically customized

126

for a predefined type of lesion and thereby may lack the compatibility or capacity for detecting

127

other types of retinal lesions. In particular, they cannot detect the retinal lesions of unseen

128

type. Furthermore, different types of retinal lesions have different shapes, sizes, colors, textures

129

and positions. Thus, it is difficult to learn the common features of all types of retinal lesions. In

130

addition, it is difficult to collect sufficient training data with expert-labelled lesion regions for

131

all possible lesion types. Thus, the CAD techniques based on lesion samples or lesion features

132

are facing challenges for detecting various abnormal regions from retinal images. This motivates

133

researchers to develop new methods for retinal abnormality detection, without utilizing retinal

134

lesion samples or lesion features.

135

More recently, multiple lesion detection based on normal images has been investigated in [4, 7, 8, 9, 17, 18, 19, 20, 21]. In [7, 17], an atlas or an average retinal model is computed from normal images, and abnormalities are detected by comparing each test image with the atlas or the average model. The fixed atlas or average model, however, usually cannot adapt to individual differences among normal images, and hence weak lesions cannot be easily distinguished from normal individual differences. In [18, 21], global features are learned from normal medical images and then used to discriminate normal images from abnormal ones with specific types of lesions. However, such features are not suitable for detecting all types of lesions. In [19], a large database of normal chest radiographs is collected, and lung nodules are identified by subtracting a similar image found in the database from the target image. Since this method is based on a huge database of approximately 15000 normal chest radiographs, its performance may degrade when working with a much smaller database. In [20], a hyper-volume is generated from a set of normal MRI brain images, and abnormalities are detected by mapping the image into the hyper-volume. While this method can detect multiple lesions from MRI brain images, its detection accuracy (a Dice value of less than 0.67) needs to be further improved. In [4], multiple sclerosis lesions are segmented from brain MRI images based on normal brain MRIs. In this method, all normal brain MRI images are divided into small patches (3×3×3) to form a training set, from which a dictionary is learned and used to reconstruct each patch of the test image. The regions with large reconstruction errors are then marked as lesions. This method, however, may face challenges for retinal images because many retinal abnormal regions can also be reconstructed with small errors. In [8], the retinal background is reconstructed by orthogonal basis vectors learned from normal retinal images rather than from their small patches, and abnormal regions are marked by the reconstruction residual. However, it is challenging for this method to detect both large salient retinal lesions and weak lesions. The method in [9] detects retinal abnormality in a weakly-supervised manner, but the learned retinal background is too smooth to contain normal individual differences. Current background learning methods thus face challenges in either distinguishing weak lesions from many false positives or losing salient lesions.

In comparison to the aforementioned methods based on lesion samples and lesion features, our proposed method is able to more effectively detect various abnormal regions, including unknown types, from retinal images. Compared with the methods for multiple lesion detection, the proposed method has the capacity to learn from the test image a highly accurate IRB that rarely contains any lesions. Our method is thus capable of detecting both salient and weak lesions from the retinal test image.

Figure 4: The flowchart of the preprocessing step. (a)-(b) input the reference image and a retinal image. (c) normalization in scale and spatial position. (d) normalization in color. (e) removing blood vessels and filling their regions with surrounding colors.

3. Preprocessing of retinal images

170

Retinal images in Ψ can not be directly used for retinal background learning. A preprocess-

171

ing will be first applied to all retinal images. Let F denote a retinal test image with various

172

abnormal regions. For all retinal images in Ψ ∪ {F }, their imaging bias due to different imaging

173

conditions will be suppressed by normalizing them in scales, colors and spatial positions, and 8

174

the blood vessels will be removed from all the images and their regions are smoothly filled-up

175

by around colors. The flowchart of the preprocessing is illustrated in Fig. 4.

176

Spatial alignment - The spatial alignment enables us to compare similar anatomical regions in all different retinal images. Each retinal image has two centers: the center of the optic disk and the center of the fovea [22][23], as illustrated in Fig. 3(a). The two centers can be automatically detected from retinal images by the U-net deep network [6]. In this paper, all retinal images in Ψ ∪ {F} are aligned spatially according to these two centers. In the meanwhile, the image scales are also normalized.

Color normalization - The normalization in image colors and scales ensures that we can process retinal images taken under different imaging conditions. The colors of all retinal images are normalized by the following formula (1) [24], so that these images have color distributions similar to a reference retinal image with mean µ1 and variance σ1:

$$F_{new} = \frac{\sigma_1}{\sigma_2}\,(F_2 - \mu_2) + \mu_1 \qquad (1)$$

where F_2 is a fundus image with mean µ2 and variance σ2, and F_{new} is its normalized image.
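For concreteness, formula (1) can be applied per color channel as in the following NumPy sketch; following the color-transfer convention of [24], σ is treated here as a standard deviation, and mu_ref and sigma_ref are assumed to be precomputed from the chosen reference image.

```python
import numpy as np

def normalize_color(img, mu_ref, sigma_ref):
    """Match each channel of `img` to the reference mean/std via formula (1).

    A minimal sketch: statistics are taken over the whole image; masking
    out the dark fundus border is omitted for simplicity.
    """
    img = img.astype(np.float64)
    out = np.empty_like(img)
    for c in range(img.shape[2]):
        mu2, sigma2 = img[..., c].mean(), img[..., c].std()
        out[..., c] = (sigma_ref[c] / sigma2) * (img[..., c] - mu2) + mu_ref[c]
    return np.clip(out, 0, 255).astype(np.uint8)
```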

Blood vessel removal - Experienced doctors know the anatomical structures in the image, so they can easily ignore individual differences of the same anatomical structure across persons. Blood vessel removal achieves a similar effect. Individual differences of blood vessels are suppressed by first detecting and removing blood vessels from all retinal images and then filling the blood vessel regions with neighboring values. Here, blood vessels are detected by the methods in [6][25], and blood vessel regions are filled by the method in [26].

To some extent, the preprocessing above mimics the normalization operations implicitly performed by the human visual system.
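A minimal sketch of this step is given below; the vessel mask is assumed to come from a segmentation network such as U-net [6][25], and OpenCV's generic inpainting is used only as a stand-in for the filling method of [26].

```python
import cv2
import numpy as np

def remove_vessels(img_bgr, vessel_mask):
    """Suppress vessels by filling their pixels from the surrounding region.

    `vessel_mask` is a binary uint8 map (1 = vessel); Telea inpainting
    propagates the neighboring colors into the masked area.
    """
    # Dilate slightly so vessel borders are fully covered before filling.
    mask = cv2.dilate(vessel_mask, np.ones((3, 3), np.uint8), iterations=1)
    return cv2.inpaint(img_bgr, mask * 255, inpaintRadius=5,
                       flags=cv2.INPAINT_TELEA)
```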

After the above preprocessing, the test image F becomes FBL, which is the superposition of various abnormal regions (denoted by FL) on the retinal background of F (denoted by FB). FB is exactly the IRB we try to compute. Meanwhile, each normal retinal image in Ψ becomes its retinal background; therefore, Ψ becomes the set ΨB that includes the retinal backgrounds of all normal images in Ψ. FBL of some abnormal retinal images are shown in Fig. 5(a)-(d), and some typical background images in ΨB are shown in Fig. 5(e)-(h). FL will be separated from FBL by learning the IRB FB from FBL based on the prior ΨB. For the convenience of discussion, all images in ΨB ∪ {FBL} are converted into gray-scale images by extracting the green channel of each color image. In retinal images, the green channel usually provides a better contrast between retinal lesions and background regions, as shown in [9].

Figure 5: Pre-processing results of some retinal images. (a)-(d) are FBL of four abnormal images in Figs. 1(a)-(d), respectively. (e)-(h) are background images of four different normal images in Ψ.

4. A computational approach for learning the IRB

The IRB FB can be regarded as an image that satisfies the following conditions:

(i) FBL = FL ⊕ FB, where ⊕ denotes the superposition of FL onto FB.

(ii) ΨB provides priors of FB; in other words, FB is similar to the images in ΨB, but has individual differences.

This means that the IRB actually includes two kinds of normal regions: the normal regions originally belonging to the test image, and the normal regions intended to replace the lesion regions of the test image. In this section, a computational approach is proposed for learning the IRB FB (i.e., its two kinds of normal regions) from FBL and the prior ΨB. In the approach, a multi-scale sparse coding based learning (MSSCL) algorithm is first proposed to learn the IRB. Then, a repeated learning strategy is applied to improve the accuracy of the IRB and to suppress lesions in the IRB. As a result, a high-accuracy IRB can be learned. The flowchart of this computational approach is shown in Fig. 6.

Figure 6: The flowchart of the proposed computational approach. First, the training set ΨB and the test image FBL are input. Second, a dictionary D is learned from ΨB, and a rough background space Θrough is constructed by sparsely coding FBL with D in a multi-scale way. Third, the initial IRB FU and the sparse image FV (in which large salient lesions are highlighted) are learned from FBL based on its low-rank decomposition over Θrough. Fourth, FBL is modified as FBL1 by suppressing known large salient lesions from it, and FBL1 is sparsely coded in a multi-scale way again; consequently, a refined background space ΘFine is constructed. Finally, a fine IRB FP and the refined sparse image FQ (in which various abnormal lesions are highlighted) are learned from FBL based on its low-rank decomposition over ΘFine. Here, the second and third steps constitute the MSSCL algorithm, and the last two steps constitute the repeated learning strategy.

4.1. The MSSCL algorithm for learning the IRB

4.1.1. Sparse coding of the test image and its encoding properties

In this section, a dictionary is learned from the set ΨB, and the sparse coding properties of FBL under this dictionary are discussed. All retinal backgrounds in ΨB are vectorized, normalized as unit column vectors, and concatenated together to form a low-rank matrix M. The dictionary D = (d1, d2, ..., dk) can be learned from M by solving the following matrix factorization problem [27][28]:

$$\min_{D \in \mathcal{C},\ \alpha_i \in \mathbb{R}^k}\ \frac{1}{n}\sum_{i=1}^{n}\left(\frac{1}{2}\left\|x_i - D\alpha_i\right\|_2^2 + \beta\left\|\alpha_i\right\|_1\right) \qquad (2)$$

where dj is an atom in the dictionary D, k is the number of atoms, C = {D ∈ R^{m×k} s.t. ||dj||₂² ≤ 1}, β denotes a sparse regularization parameter, αi is a sparse coefficient vector, and xi is a column vector of M. In this paper, a very small β and a large k (β = 0.001 and k = 300) are set so as to learn a dictionary with a strong representation ability [29, 30].
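As a sketch, scikit-learn's dictionary learner optimizes an objective of the same form as equation (2), with alpha playing the role of β; the array name backgrounds is an assumption for the stack of vectorized images from ΨB.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

def learn_dictionary(backgrounds, k=300, beta=0.001):
    """Learn D from the matrix M of vectorized normal backgrounds (Eq. 2).

    `backgrounds`: array of shape (n_images, n_pixels); each row is one
    vectorized image from Psi_B, normalized to unit length as in the paper.
    """
    M = backgrounds / np.linalg.norm(backgrounds, axis=1, keepdims=True)
    learner = MiniBatchDictionaryLearning(n_components=k, alpha=beta,
                                          transform_algorithm="lasso_lars")
    learner.fit(M)
    return learner.components_  # shape (k, n_pixels): one atom per row
```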

After the dictionary D has been learned from ΨB, the sparse coding properties of FBL under this dictionary are studied. Here, the sparse coding of FBL refers to the sparse reconstruction of FBL by a sparse combination of atoms in D. The sparse coding coefficients can be derived by optimizing [29]:

$$\min_{\alpha \in \mathbb{R}^k}\ \frac{1}{2}\left\|F_{BL} - D\alpha\right\|_2^2 + \lambda\left\|\alpha\right\|_1 \qquad (3)$$

where λ is the sparsity regularization parameter and α is the optimal sparse coding coefficient vector. Suppose that the optimal sparse coding coefficients are α = (w1, w2, w3, ..., wk); then FBL can be sparsely coded (or approximated) by the following equation:

$$F_{BH} = \sum_{i=1}^{k} w_i \cdot d_i \qquad (4)$$

where di is the i-th atom in D, i = 1, ..., k.
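Equations (3)-(4) can then be sketched as follows; note that scikit-learn scales the l1 penalty internally, so the value passed below corresponds to the λ of equation (3) only up to a constant factor.

```python
from sklearn.decomposition import sparse_encode

def rough_background(f_bl, D, lam):
    """Encode the vectorized test image at sparsity level `lam` (Eqs. 3-4).

    `f_bl`: vectorized FBL, shape (n_pixels,); `D`: dictionary with one atom
    per row, as returned by `learn_dictionary` above.
    """
    w = sparse_encode(f_bl.reshape(1, -1), D,
                      algorithm="lasso_lars", alpha=lam)
    return (w @ D).ravel()  # FBH: the rough background at this scale
```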

The sparsity regularization parameter λ controls the encoded contents of FBL in FBH, and its different settings determine variant rough backgrounds FBH of FB. The following properties hold:

(i) When λ takes a large value, many image details, including lesion regions, are suppressed from FBH, but FBH has a low approximation accuracy to the IRB.

(ii) When λ takes a small value, FBH has a high approximation accuracy to the IRB and contains many image details of FBL, but many lesion regions will also appear in FBH. In particular, large salient lesions can diffuse severely in FBH.

(iii) As λ decreases, the normal regions originally belonging to FBL are reconstructed in FBH with gradually increasing accuracy, while lesion regions gradually appear in FBH; as λ increases, lesion regions are gradually suppressed in FBH, while the normal regions originally belonging to FBL are reconstructed in FBH with gradually increasing error.

These encoding properties are demonstrated in Fig. 7. Thus, for any given λ, the corresponding FBH cannot well represent the IRB FB and is only a rough background. All the different rough backgrounds, i.e., the set Θ = {FBH : λ ∈ (0, ∞)}, however, together provide complementary and complete information for IRB learning. For example, in Θ, the information for learning the two kinds of normal regions of the IRB is implicitly contained in different rough backgrounds (with small λ and with large λ, respectively). Therefore, Θ will be used as the background space for learning the IRB, from which the two kinds of normal regions of the IRB will be learned.

Figure 7: Sparse coding results of FBL by different λ. (a) two different FBL. (b)-(d) all lesions can be suppressed in FBH when λ takes large values. (e)-(h) large salient lesions can gradually diffuse into large regions (in red circles) when λ takes small values.

4.1.2. IRB learning with the MSSCL algorithm

In this section, we introduce how the IRB is learned from the background space Θ. First, a discrete approximation of Θ (i.e., a discrete background space) is selected for the computation. For this purpose, we need to select a series of specific values ε1 > ε2 > ... > εm for λ so that the corresponding FBH construct a suitable discrete background space. We note that, when the εi (i = 1, 2, ..., m) take different values, the corresponding discrete background spaces might represent different priors. For example:

(i) If most of the εi, i = 1, 2, ..., m, take large values, then the discrete background space will be formed by many low-accuracy rough backgrounds, from which the normal regions originally belonging to the test image cannot be learned with high accuracy.

(ii) If most of the εi, i = 1, 2, ..., m, take small values, then the discrete background space will be formed by many high-accuracy rough backgrounds in which salient lesions cannot be suppressed and will have large diffusions. From such a space, the normal regions intended to replace lesion regions cannot be well learned, and the normal regions around salient lesions might be incorrectly detected as abnormal ones.

Thus, in this paper, we select discrete values for εi according to the following equation:

$$\varepsilon_i = \frac{1}{2^{\,i-2}},\quad i = 1, 2, ..., m \qquad (5)$$

By selecting m (in this paper, m = 8) different values from equation (5) for λ, m different rough backgrounds (denoted by B1, B2, ..., Bm, respectively) are sparsely reconstructed from FBL. The values selected in equation (5) have the following merits: (i) they ensure that B1, B2, ..., Bm are sparsely coded in a multi-scale way, providing a compact and efficient discrete representation of Θ; (ii) by suitably covering both large and small values, B1, B2, ..., Bm not only strike a good balance between suppressing lesion regions and learning high-accuracy normal regions, but also provide efficient information for learning the two kinds of normal regions of the IRB.

Usually, intensity differences exist between the same normal regions in Bi and FBL. Thus, we additionally compute n (in this paper, n = 6) images, denoted by the set A = {A1, A2, ..., An}, by linearly interpolating between FBL and a specific Bi with few lesions (heuristically, the Bi corresponding to λ = 1/4 is applied). The n + m images Θrough = {Bi, Aj : i = 1, 2, ..., m, j = 1, 2, ..., n} are used to implicitly form a background space from which the IRB will be learned. The interpolation operation ensures that the intensity of normal regions changes smoothly across the image series formed by all Bi, Aj and FBL [31].
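A sketch of the construction of Θrough, reusing the rough_background helper sketched above, could look as follows; the interpolation anchor is the Bi coded at λ = 1/4, as described in the text.

```python
import numpy as np

def build_rough_space(f_bl, D, m=8, n=6):
    """Construct Theta_rough = {B_i, A_j} (Eq. 5 plus linear interpolation)."""
    lams = [1.0 / 2 ** (i - 2) for i in range(1, m + 1)]  # 2, 1, 1/2, ...
    B = [rough_background(f_bl, D, lam) for lam in lams]
    anchor = B[lams.index(0.25)]        # the few-lesion rough background
    # Each A_j slides from the anchor toward FBL in n equal interior steps.
    A = [(1 - t) * anchor + t * f_bl
         for t in np.linspace(0.0, 1.0, n + 2)[1:-1]]
    return B + A
```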

The m + n + 1 images Bi, Aj, FBL are vectorized and concatenated together to form a low-rank matrix ML. While salient lesions in FBL correspond to sparse structures in ML, the IRB can be approximately regarded as the low-rank structure in ML. ML can be modelled as the sum of two matrices: ML = U + V, where U is the low-rank background component and V is the sparse component. U and V can be computed by solving the following optimization problem:

$$\arg\min_{U,V}\ \operatorname{rank}(U) + \alpha\,\|V\|_1, \quad \text{s.t.}\ M_L = U + V \qquad (6)$$

where α is the sparse regularization parameter. Robust principal component analysis (RPCA) can be used to approximately compute U and V from equation (6) [32, 33]. RPCA takes the normal regions with smooth transitions across the different images as the background components. As a result of this sparse decomposition, FBL is divided into two images: the background component image FU and the sparse image FV.
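Equation (6) is typically solved by relaxing rank(U) to the nuclear norm; the following is a minimal inexact-ALM sketch in the spirit of [32], not an exact reproduction of the authors' solver. Rows of M are the vectorized images of Θrough plus FBL, and FU and FV are recovered from the rows of U and V corresponding to FBL.

```python
import numpy as np

def shrink(X, tau):
    """Soft-thresholding (proximal operator of the l1 norm)."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def rpca(M, alpha=None, tol=1e-7, max_iter=500):
    """Low-rank + sparse split of Eq. (6), M ≈ U + V, via inexact ALM.

    `alpha` defaults to the common 1/sqrt(max(m, n)) weighting when unset.
    """
    m, n = M.shape
    if alpha is None:
        alpha = 1.0 / np.sqrt(max(m, n))
    norm_M = np.linalg.norm(M)
    mu = 1.25 / np.linalg.norm(M, 2)   # spectral norm sets the step size
    rho = 1.5
    U = np.zeros_like(M); V = np.zeros_like(M); Y = np.zeros_like(M)
    for _ in range(max_iter):
        # Singular value thresholding updates the low-rank part U.
        Uw, s, Vt = np.linalg.svd(M - V + Y / mu, full_matrices=False)
        U = Uw @ np.diag(shrink(s, 1.0 / mu)) @ Vt
        # Entrywise soft-thresholding updates the sparse part V.
        V = shrink(M - U + Y / mu, alpha / mu)
        R = M - U - V
        Y += mu * R
        mu *= rho
        if np.linalg.norm(R) / norm_M < tol:
            break
    return U, V
```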

Essentially, FU is learned from Θrough by low-rank approximation; namely, the two kinds of normal regions of the IRB are implicitly learned from their corresponding regions in Θrough by the low-rank computation. Thus, normal regions in FBL can be well reconstructed in FU due to their similarity to their corresponding regions in Θrough, and the normal regions intended to replace lesion regions are reconstructed in FU by a specific average of their corresponding regions in Θrough. At the same time, salient lesions and many other lesions in FBL are separated into, and highlighted in, FV and can be easily detected.

However, large salient lesions in FBL can be encoded in, and diffuse into, some rough backgrounds Bi with small λ, as shown in Fig. 6. As a result, FU can encode certain information of salient lesions. Thus, some false positives, which are due to the diffusion effects of salient lesions and to individual differences of some normal regions, will exist in FV. Usually, weak lesions cannot be well distinguished from these false positives in FV. Thus, a further refinement of the set B1, B2, ..., Bm is needed, so as to exclude the encoded information of salient lesions.

4.2. Repeated learning strategy for improving the IRB

In this section, FBL is modified into FBL1 by suppressing the detected salient lesions, and the IRB is then repeatedly learned from FBL1 by the MSSCL algorithm. By this repeated learning strategy, the accuracy of the IRB can be further improved and lesions can also be well suppressed from it.

Salient lesion removal from FBL – Salient lesions and some other lesion regions are first segmented from FV by a threshold operation with threshold value T, followed by hole-filling. Here, T can take any value in the range [12, 26], as shown in the heuristic study of Fig. 16(b); pixels whose values are larger than T are marked as salient lesions. For large regions, the blanked areas are filled with the mean intensity value of their corresponding regions over all background images in ΨB. For small regions, the blanked areas are filled with the intensity values around the regions using the algorithm in [34]. By this processing, a modified FBL (i.e., FBL1) without salient lesions is generated, as shown in Fig. 8(b).
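A sketch of this suppression step is given below; the size cutoff small_area and the local-mean refill for small regions are simplifying assumptions standing in for the algorithm of [34].

```python
import numpy as np
from scipy import ndimage

def suppress_salient(f_bl, f_v, mean_bg, T=20, small_area=50):
    """Build FBL1 by blanking thresholded FV regions and refilling them.

    `f_bl`, `f_v`, `mean_bg` are 2-D arrays of identical shape; `mean_bg`
    is the pixelwise mean over Psi_B used to refill large regions.
    """
    mask = ndimage.binary_fill_holes(f_v > T)          # threshold + hole-fill
    labels, num = ndimage.label(mask)
    f_bl1 = f_bl.astype(np.float64).copy()
    # Crude surrounding-intensity estimate for small regions; note it still
    # averages over the lesion pixels themselves, unlike the method of [34].
    local_mean = ndimage.uniform_filter(f_bl1, size=15)
    for r in range(1, num + 1):
        region = labels == r
        fill = mean_bg if region.sum() > small_area else local_mean
        f_bl1[region] = fill[region]
    return f_bl1
```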

Repeated learning of the IRB – The rough backgrounds B1, B2, ..., Bm are updated by learning them from FBL1, and the additional image set A = {A1, A2, ..., An} is correspondingly recalculated. The updated set B = {B1, B2, ..., Bm} now excludes the encoded information of large salient lesions. The n + m images ΘFine = {Bi, Aj : i = 1, 2, ..., m, j = 1, 2, ..., n} implicitly form a refined background space differing from Θrough, which is used as the discrete background space for learning a fine IRB from FBL. All Bi, Aj and FBL are vectorized and concatenated to form a low-rank matrix (denoted by MBF). Various abnormal regions in FBL, such as salient lesions, weak lesions and other abnormal regions, are all sparse structures in MBF.

Figure 8: (a) FBL. (b) FBL1, in which salient lesion regions are suppressed. (c) FQ, in which various abnormal regions are well highlighted.

MBF can be decomposed into a low-rank background component and a sparse component by RPCA. Accordingly, FBL is now divided into two new images: the background component image FP and the sparse image FQ. FP and FQ are updated versions of FU and FV. Compared with FU, FP not only contains fewer lesion regions but also approximates normal regions with a higher accuracy. FP is a fine IRB, and FQ highlights the various abnormal regions in FBL, as shown in Fig. 8(c).
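Putting the pieces together, the overall procedure can be sketched as follows, reusing the helper sketches above; this is an illustrative outline of the data flow rather than the authors' implementation.

```python
import numpy as np

def detect_abnormalities(f_bl_img, D, mean_bg, T=20):
    """End-to-end sketch: one MSSCL pass, salient-lesion suppression, and a
    second pass over the refined space (the repeated learning strategy).

    `f_bl_img` is the 2-D preprocessed test image FBL and `mean_bg` the
    pixelwise mean background over Psi_B.
    """
    f_bl = f_bl_img.ravel().astype(np.float64)
    theta = build_rough_space(f_bl, D)                   # Theta_rough
    _, V = rpca(np.vstack(theta + [f_bl]))
    f_v = V[-1].reshape(f_bl_img.shape)                  # sparse part of FBL
    f_bl1 = suppress_salient(f_bl_img, f_v, mean_bg, T)  # build FBL1
    theta_fine = build_rough_space(f_bl1.ravel(), D)     # Theta_Fine
    U2, V2 = rpca(np.vstack(theta_fine + [f_bl]))
    f_p = U2[-1].reshape(f_bl_img.shape)                 # fine IRB FP
    f_q = V2[-1].reshape(f_bl_img.shape)                 # highlighted lesions FQ
    return f_p, f_q
```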

5. Results and Comparisons

5.1. Data set and Evaluation Methods

The proposed method is quantitatively evaluated using two different data sets. The first data set (denoted by H) consists of 190 retinal images with various manually labeled lesions; the 190 retinal images originally come from an open data set in Kaggle. H contains 17420 lesion regions and more than 26 different types of lesions (such as choroidal old lesions, macular degeneration, drusen, crystalline retinopathy, acute retinal necrosis, vitreous degeneration, choroidal inflammation, retinal arterial macroaneurysms, choroidal pigment, choroidal pigmented mole, panretinal photocoagulation spots and unfamiliar lesions). While retinal images in H contain different types of lesions exhibiting diverse visual properties (as shown in Figs. 1, 11(a)-(r), 12(a)-(c)), the number of lesions in different retinal images varies from 1 to 585, the size of lesions ranges from 2 pixels to 1831 pixels, and both salient lesions and weak lesions are contained, as illustrated in Figs. 9(a)(b)(c)(d), respectively. Hence, the retinal images in H are very representative, and they can be used to justify the generalization of the proposed method.

Figure 9: Statistical distributions of different properties of retinal lesions in the data set H. (a) Size of different lesions. (b) Number of lesions in different images. (c) Gray values of different lesions. (d) Contrasts of different lesions with regard to their surrounding regions.

The second data set (denoted by MM) comprises manual labels of 110 retinal images with multiple types of lesions; the 110 retinal images originally come from the large open data set Messidor. The data set MM contains more than 7500 lesion regions and six different types of lesions. However, most of the lesions in the MM data set are diabetes-related and have small areas.

Let TP, FP, TN, FN represent true positives, false positives, true negatives and false negatives, respectively. Then sensitivity (SE), specificity (SP), precision (PR), recall (RC), true positive rate (TPR), false positive rate (FPR) and Dice score (DSC) are computed as follows:

$$SE = \frac{TP}{TP + FN},\quad SP = \frac{TN}{TN + FP},\quad PR = \frac{TP}{TP + FP},\quad DSC = \frac{2TP}{2TP + FP + FN} \qquad (7)$$

$$TPR = RC = SE,\quad FPR = 1 - SP$$

Because of partial volume effects in retinal images, the outer boundaries of many lesion regions are fuzzy and cannot be well determined; they might be missed in the detection or even in the manual labeling. To compensate for this defect, we compute TP, FP, TN, FN as in [35], where a detected pixel next to a manually labeled pixel is considered a true positive. Thus, manually labeled lesion regions are expanded spatially by a radius of two pixels, and the new pixels are still considered true positives. Meanwhile, the identified abnormal regions are also expanded spatially by a radius of two pixels, and the new pixels are still considered abnormal. TP, FP, TN, FN are computed based on the expanded regions.
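A sketch of this tolerance-aware counting with SciPy is given below; from the returned counts, the measures of equation (7) follow directly.

```python
import numpy as np
from scipy import ndimage

def tolerant_counts(pred, gt, radius=2):
    """TP/FP/TN/FN with the two-pixel boundary tolerance of [35].

    `pred` and `gt` are boolean masks; both are dilated by `radius` before
    counting, matching the expansion described above (one reasonable
    reading of that rule).
    """
    ball = ndimage.generate_binary_structure(2, 1)
    gt_d = ndimage.binary_dilation(gt, ball, iterations=radius)
    pred_d = ndimage.binary_dilation(pred, ball, iterations=radius)
    tp = np.sum(pred & gt_d)     # detections adjacent to a labeled pixel
    fp = np.sum(pred & ~gt_d)
    fn = np.sum(gt & ~pred_d)    # labels missed even after the tolerance
    tn = np.sum(~pred_d & ~gt_d)
    return tp, fp, tn, fn        # e.g. DSC = 2*tp / (2*tp + fp + fn)
```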

Figure 10: (a)-(b) ROC curve and PR curve of the proposed method. They are computed based on retinal abnormality detection results in the data sets H and MM.

5.2. Results

In the experiments, the three parameters m, n and the threshold T take the following constant values: m = 8, n = 6 and T = 20. Section 6.3 will show that the method is robust to their selection.

Figure 11: (a)-(r) different retinal images containing various lesions, where different lesions may have different shapes, sizes, colors, textures and positions. (a1)-(r1) retinal abnormality detection results from the different retinal images in (a)-(r) by the proposed method.

1) Quantitative evaluation by the ground truth: The proposed method is quantitatively evaluated on the data sets H and MM. More than 26 different types of lesions are identified from H, and 6 different types of lesions are identified from MM. Additionally, the ROC curve and precision-recall (PR) curve for retinal abnormality detection are computed over H and MM, respectively, and shown in Fig. 10. Fig. 10 shows that, over both data sets, the proposed method achieves high AUC (the area under the ROC curve) values of 0.9921 and 0.9971, and high MAP (the area under the PR curve) values of 0.8694 and 0.8688. The quantitative analysis in Fig. 10 shows that the proposed method is effective for detecting a variety of lesions from retinal images, even if the types and the number of lesions in these images are unknown and different lesions exhibit diverse visual features. Dice scores are also computed over H and MM for evaluating retinal abnormality segmentation; they are 0.80 and 0.81, respectively. Retinal lesions usually have small areas, and thus small segmentation errors can lead to large variations in Dice scores.

2) Some typical examples: The proposed method is applied to various retinal images to test its effectiveness and feasibility. Since abnormal regions in retinal images are complex, experimental results are presented for three different cases.

Different retinal images may contain different types of lesions, and different lesions usually have distinct visual features (e.g., shapes, sizes, colors, textures and positions), as shown in Figs. 1 and 11(a)-(r). By the proposed method, however, different types of lesions can be well detected from different retinal images, as illustrated in Figs. 11(a1)-(r1). In particular, even if some retinal images contain a large number of small lesions (see Figs. 11(d)(g)(h)(m)(r)), these small lesions can also be well detected, as shown in Figs. 11(d1)(g1)(h1)(m1)(r1).

Some retinal images may contain both large and small lesions, as shown in Figs. 11(b)(f)(h)(i)(j)(n)(p)(r). By the proposed method, both can be well detected although they have significantly different sizes, as illustrated in Figs. 11(b1)(f1)(h1)(i1)(j1)(n1)(p1)(r1).

A retinal image may contain multiple types of lesions with different visual features, as shown in Figs. 12(a)-(c), while the types and the number of lesions are usually unknown in advance. By the proposed method, different types of lesions can be well detected from a retinal image even though these lesions have significantly different visual features, as illustrated in Figs. 12(a1)-(c1), where 6 different types of lesions are detected from each retinal image.

5.3. Comparisons

5.3.1. Quantitative comparisons with related methods

On the two data sets H and MM, the proposed method is quantitatively compared with two popular CAD methods for retinal lesion detection and four methods that can be applied to retinal abnormality detection. These six methods are: the U-net deep learning network in [6] (UNET2015) and the fast convolutional neural network in [5] (TMI2016), which are trained and then tested over each data set with 5-fold cross validation; the method in [4] (MICCAI2013), which reconstructs patches of F by a dictionary learned from small patches of Ψ and detects retinal lesions by reconstruction errors; the average-model based method in [7] (CMIG2013), which detects retinal lesions by computing the residual between FBL and the average model computed from ΨB; the method in [8] (NEUCOM2018), which reconstructs the retinal background by orthogonal basis vectors learned from ΨB by PCA analysis; and the method in [9] (TMI2019), which computes the retinal background by weakly supervised learning.

Figure 12: (a)-(c) three retinal images with 6 types of different lesions (dot haemorrhages, block haemorrhages, microaneurysms, dot exudates, block exudates, cotton wool spots). (a1)-(c1) retinal abnormality detection results from each retinal image by the proposed method.

The seven methods are evaluated quantitatively on the two data sets H and MM, and their ROC curves, precision-recall curves, AUC values and MAP values are shown in Fig. 13 and Fig. 14, respectively. Fig. 13 demonstrates that the proposed method significantly outperforms the other six methods when the seven methods are used to detect various lesions from the retinal images in H. Fig. 14 illustrates that, over the data set MM, the proposed method also performs better than the five methods in [4, 5, 6, 7, 8]. The proposed method and the method in [9] show their respective advantages on different sets. However, the proposed method has stable performance over the two different sets H and MM, and performs better than the TMI2019 method on the more complex data set H.

Figure 13: (a)-(b) ROC curves and PR curves of the 7 different methods (note, MICCAI2013 [4] has 5 different cases with different patch sizes), generated from detection results of various abnormal regions on the data set H. (c) AUC and MAP values of the 7 different methods, computed from the ROC and PR curves in (a)-(b). The MAP values are sorted as follows: 0.8694 (our algorithm), 0.8394 (TMI2019 [9]), 0.8221 (UNET2015 [6]), 0.6519 (NEUCOM2018 [8]), 0.6216 (TMI2016 [5]), 0.4295 (CMIG2013 [7]), and 0.1264, 0.1267, 0.1358, 0.1277, 0.1365 (MICCAI2013 [4]), respectively.

The method in [7] actually provides the same retinal background for all retinal test images. In [4, 8], orthogonal basis vectors or a dictionary are learned from ΨB or from its small patches, and the encoded result of the test image, which is similar to FBH, is approximately regarded as the IRB. In [9], both lesions and normal individual differences can be regarded as abnormal regions due to their random characteristics, and thus the learned IRB might be over-smoothed. In this paper, however, the IRB is finely learned, so it can have a high accuracy and contain almost no lesions.

5.3.2. Quantitative comparisons with background image learning methods

After the preprocessing in this paper, some background image learning methods developed for video image analysis or time-series image analysis might also be used for retinal background learning from ΨB. Thus, the proposed approach is further compared with three recently developed background image learning methods [36, 37, 38] (ICML2014 [36], AAAI2013 [37], ICCV2015 [38]). Based on the data set H, we quantitatively evaluate the four methods; their ROC curves, precision-recall curves, AUC values and MAP values are shown in Fig. 15. Fig. 15 shows that the proposed algorithm performs better in retinal background learning than the other three methods.

Figure 14: (a)-(b) ROC curves and PR curves of the 7 different methods (note, MICCAI2013 [4] has 5 different cases with different patch sizes), generated from detection results of various abnormal regions on the data set MM. (c) AUC and MAP values of the different methods, computed from the ROC and PR curves in (a)-(b). The MAP values are sorted as follows: 0.9091 (TMI2019 [9]), 0.8688 (our algorithm), 0.8676 (UNET2015 [6]), 0.8459 (NEUCOM2018 [8]), 0.6947 (TMI2016 [5]), 0.4669 (CMIG2013 [7]), and 0.0831, 0.0683, 0.0786, 0.0727, 0.0855 (MICCAI2013 [4]), respectively.

Table 1: Means and standard deviations of 10 different methods over different subsets of the data set H.

Method                  Mean AUC   Std AUC   Mean MAP   Std MAP
TMI2019 [9]             0.9912     0.0037    0.8429     0.0449
UNET2015 [6]            0.9874     0.0041    0.8237     0.0478
NEUCOM2018 [8]          0.9794     0.0039    0.6241     0.0383
TMI2016 [5]             0.9172     0.0185    0.6291     0.0796
CMIG2013 [7]            0.8626     0.0385    0.4348     0.0789
MICCAI2013(4*4) [4]     0.8140     0.0112    0.1357     0.0300
ICML2014 [36]           0.9896     0.0039    0.8260     0.0508
AAAI2013 [37]           0.9867     0.0042    0.7977     0.0459
ICCV2015 [38]           0.9390     0.0085    0.2903     0.0216
Our algorithm           0.9918     0.0028    0.8711     0.0155

5.3.3. Quantitative comparisons of robustness over subsets of H

Figure 15: (a)-(b) ROC curves and PR curves of the four different methods, generated from detection results of various abnormal regions on the data set H. The four MAP values are sorted as follows: 0.8694 (our algorithm), 0.8379 (ICML2014 [36]), 0.8068 (AAAI2013 [37]), 0.2827 (ICCV2015 [38]), respectively.

We also evaluate the detection abilities and robustness of the above 10 methods over different subsets of the data set H. A 5-fold cross validation is performed for the two deep learning methods, UNET2015 and TMI2016. For the other eight methods (the proposed approach, TMI2019, NEUCOM2018, CMIG2013, MICCAI2013, ICML2014, AAAI2013, and ICCV2015), the data set H is divided into five test subsets, and the AUC and MAP values on each test subset are calculated. Consequently, the means and standard deviations of the AUC and MAP for the 10 methods over the five test subsets can be computed; they are shown in Table 1. Table 1 shows that, compared with the other 9 methods, the proposed approach not only has a high mean MAP value but is also robust over different subsets of H.

6. Discussions

6.1. On the preprocessing

The preprocessing applied to all retinal images not only suppresses retinal imaging bias and individual differences of blood vessels, but also enables the comparison of similar anatomical regions across different retinal images. The importance of the preprocessing has been quantitatively evaluated on the data set H. Fig. 16(c) shows that, with the removal of blood vessels, the MAP value of the proposed approach for learning the IRB is enhanced from 0.6025 to 0.8694. Fig. 13(b) (MICCAI2013) and Fig. 16(a) (the different λ cases) show that, by using the preprocessing in this paper and learning the dictionary from whole images in ΨB rather than from their small patches, the reconstruction-error based technique can enhance its MAP value from 0.1358 to 0.7023. Hence, the preprocessing plays a necessary and important role in enhancing retinal abnormality detection.

Figure 16: (a) ROC and Precision-Recall (PR) curves in different cases. Our algorithm: the MSSCL algorithm plus the repeated learning strategy; Single Series: the MSSCL algorithm only; λ = i: FBH with λ = i. (b) ROC and PR curves corresponding to different thresholds T. (c) ROC and PR curves in three different preprocessing cases: without color normalization (CN) and blood vessel suppression (BVS), with only CN, and with both CN and BVS.

6.2. On the MSSCL algorithm and the repeated learning strategy

FBH, FU and FP can all be regarded as approximations of the IRB. In what follows, based on quantitative evaluations on the data set H, we show their differences and explain the necessity and importance of applying the MSSCL algorithm and the repeated learning strategy.

For any single λ, the corresponding FBH represents only one rough background: it either has a low approximation accuracy, or has a high accuracy but contains many lesion regions. Thus, diverse retinal abnormalities cannot be well detected by the residual (or reconstruction error) |FBH − FBL|, no matter what value is selected for λ. This is illustrated in Fig. 16(a), which shows that, no matter what value is selected for λ (such as 2, 1, 1/2, 1/4, 1/8, 1/16, 1/32, 1/64), the corresponding precision-recall curves of the residuals |FBH − FBL| all have comparatively small MAP values, all smaller than 0.7023.

The multi-scale sparse coding results of FBL, i.e., Θrough, however, together provide complementary information for learning the IRB. They not only contain high-accuracy normal regions of FBL, but also provide possible normal regions to replace lesion regions. Thus, compared with any single rough background FBH, the IRB learned by the MSSCL algorithm has a much higher approximation accuracy and contains far fewer lesion regions. This is demonstrated in Fig. 16(a), which shows that, by the MSSCL algorithm, the MAP value is enhanced to 0.8391, far better than that obtained with any single FBH.

Salient lesions in FBL will diffuse into some rough backgrounds with small λ, and therefore disturb the IRB learned by the MSSCL algorithm. By using the repeated learning strategy, i.e., by repeatedly learning the IRB from the modified FBL in which the known salient lesions have been suppressed, the effects of salient lesions can be well suppressed from the learned IRB, and the approximation accuracy of the learned IRB can be further improved. In this case, more weak lesions can be detected, and more weak false positives can be suppressed. This is demonstrated in Fig. 16(a): by using the repeated learning strategy, the MAP value is further enhanced to 0.8694.

Figure 17: (a)-(b) ROC curves and PR curves of the proposed method, corresponding to different values of the parameter n.

6.3. On the robustness of parameters

There are three important parameters in the proposed method: T, m and n. Based on the data set H, we quantitatively evaluate the effects of their different selections on the lesion detection results. A single-control-variable scheme is used, in which only one of the three parameters T, m, n is changed while the other two remain fixed at the constant values m = 8, n = 6 and T = 20. Fig. 16(b), Fig. 17 and Fig. 18 show the precision-recall curves corresponding to different thresholds T (10, 12, 14, 16, 18, 20, 22, 24, 26), different n (2, 4, 6, 8, 10) and different m (6, 7, 8, 9, 10), respectively. The heuristic study in Fig. 16(b), Fig. 17 and Fig. 18 shows that, when a parameter is arbitrarily given different values from a comparatively large range, the final AUC and MAP values remain nearly unchanged, demonstrating the robustness of the three parameters. Therefore, we fix T = 20, m = 8, n = 6 to segment lesions and other abnormal regions from all retinal images.

Figure 18: (a)-(b) ROC curves and PR curves of the proposed method, corresponding to different values of the parameter m.

7. Conclusion

Detecting a diverse range of abnormal regions from retinal images is a key step towards the automatic screening of retinal diseases, yet it is a challenging task because (1) a retinal image may contain different lesions with a variety of shapes, sizes, colors, and textural patterns; and (2) the number and the type of retinal lesions in a retinal image are generally unknown in advance.

This paper addresses these challenges with a novel and effective method. Our method learns the individualized retinal background (IRB), i.e., the characteristics of normality, for the detection and separation of abnormal regions of different types and numbers. This is fundamentally different from current disease-specific CAD techniques, which detect a pre-defined type of lesion by learning the characteristics of that type of abnormality from lesion samples or lesion features.

The IRB is learned from normal retinal images by a novel multi-scale sparse coding based learning (MSSCL) of the retinal test image. From this multi-scale sparse encoding with different levels of sparse regularization, an adaptive dictionary is constructed to embrace different levels of background detail, forming a unified background space that contains more complete information for IRB learning. Low-rankness is inherently contained in this background space, and the IRB can be well learned from it by RPCA. The IRB learned by MSSCL can thus achieve an accuracy far higher than the usual approach of directly encoding retinal images sparsely. Some of the multi-scale sparse coding results, however, can partially encode information of salient lesions. Therefore, MSSCL is repeatedly applied to the modified test image in which salient lesions are suppressed, so as to further refine the multi-scale sparse coding and the learned IRB.

The experiments in this paper demonstrate that it is feasible to detect diverse retinal abnormal regions from a retinal image through the refined learning of its IRB from normal retinal images. Our further research will continue to distinguish weak lesions from normal individual differences in retinal images, and to learn the patterns of the different types of detected retinal abnormal regions, so as to provide a unified learning framework for the diagnosis of a variety of lesions. The main ideas in this paper might serve as a reference, or may even be directly effective, for detecting various abnormal regions in other medical images. Thus, we will also explore the possibility of extending the proposed method to other medical images.

Acknowledgment

This work was supported in part by NSFC of China (61375020, 61572317), Shanghai Intelligent Medicine Project (2018ZHYL0217), and SJTU Translational Medicine Cross Research Fund (ZH2018QNA05).

References

[1] G. Quellec, M. Lamard, A. Erginay, A. Chabouis, P. Massin, B. Cochener, G. Cazuguel, Automatic detection of referral patients due to retinal pathologies through data mining, Medical Image Analysis 29 (2016) 47–64.

[2] M. R. K. Mookiah, U. R. Acharya, C. K. Chua, C. M. Lim, E. Ng, A. Laude, Computer-aided diagnosis of diabetic retinopathy: A review, Computers in Biology and Medicine 43 (12) (2013) 2136–2155.

[3] U. Schmidt-Erfurth, A. Sadeghipour, B. S. Gerendas, S. M. Waldstein, H. Bogunovic, Artificial intelligence in retina, Progress in Retinal and Eye Research 67 (2018) 1–29.

[4] N. Weiss, D. Rueckert, A. Rao, Multiple sclerosis lesion segmentation using dictionary learning and sparse coding, in: International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, 2013, pp. 735–742.

[5] M. J. van Grinsven, B. van Ginneken, C. B. Hoyng, T. Theelen, C. I. Sánchez, Fast convolutional neural network training using selective data sampling: Application to hemorrhage detection in color fundus images, IEEE Transactions on Medical Imaging 35 (5) (2016) 1273–1284.

[6] O. Ronneberger, P. Fischer, T. Brox, U-net: convolutional networks for biomedical image segmentation, in: International Conference on Medical Image Computing and Computer-Assisted Intervention, Springer, Munich, Germany, 2015, pp. 234–241.

[7] S. Ali, D. Sidibé, K. M. Adal, L. Giancardo, E. Chaum, T. P. Karnowski, F. Mériaudeau, Statistical atlas based exudate segmentation, Computerized Medical Imaging and Graphics 37 (5) (2013) 358–368.

[8] B. Chen, L. Wang, J. Sun, H. Chen, Y. Fu, S. Lan, Y. Huang, Z. Xu, Diverse lesion detection from retinal images by subspace learning over normal samples, Neurocomputing 297 (2018) 59–70.

[9] R. Wang, B. Chen, D. Meng, L. Wang, Weakly-supervised lesion detection from fundus images, IEEE Transactions on Medical Imaging 38 (6) (2019) 1501–1512.

[10] I. Lazar, A. Hajdu, Retinal microaneurysm detection through local rotating cross-section profile analysis, IEEE Transactions on Medical Imaging 32 (2) (2013) 400–407.

[11] M. U. Akram, S. Khalid, S. A. Khan, Identification and classification of microaneurysms for early detection of diabetic retinopathy, Pattern Recognition 46 (1) (2013) 107–116.

[12] D. Sidibé, I. Sadek, F. Mériaudeau, Discrimination of retinal images containing bright lesions using sparse coded features and SVM, Computers in Biology and Medicine 62 (2015) 175–184.

[13] S. S. Kar, S. P. Maity, Automatic detection of retinal lesions for screening of diabetic retinopathy, IEEE Transactions on Biomedical Engineering 65 (3) (2018) 608–618.

[14] L. B. Frazao, N. Theera-Umpon, S. Auephanwiriyakul, Diagnosis of diabetic retinopathy based on holistic texture and local retinal features, Information Sciences 475 (2019) 44–66.

[15] V. Gulshan, L. Peng, M. Coram, M. C. Stumpe, D. Wu, A. Narayanaswamy, S. Venugopalan, K. Widner, T. Madams, J. Cuadros, Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs, JAMA 316 (22) (2016) 2402–2411.

[16] G. Quellec, K. Charrière, Y. Boudi, B. Cochener, M. Lamard, Deep image mining for diabetic retinopathy screening, Medical Image Analysis 39 (2017) 178–193.

[17] M. D. Abràmoff, M. K. Garvin, M. Sonka, Retinal imaging and image analysis, IEEE Reviews in Biomedical Engineering 3 (2010) 169–208.

[18] K. S. Deepak, J. Sivaswamy, Automatic assessment of macular edema from color retinal images, IEEE Transactions on Medical Imaging 31 (3) (2012) 766–776.

[19] T. Aoki, Y. Yamashita, K. Yamamoto, Y. Korogi, Usefulness of computerized method for lung nodule detection on digital chest radiographs using similar subtraction images from different patients, European Journal of Radiology 81 (5) (2012) 1062–1067.

[20] G. Erus, E. I. Zacharaki, C. Davatzikos, Individualized statistical learning from medical image databases: application to identification of brain lesions, Medical Image Analysis 18 (3) (2014) 542–554.

[21] K. S. Deepak, N. V. K. Medathati, J. Sivaswamy, Detection and discrimination of disease-related abnormalities based on learning normal cases, Pattern Recognition 45 (10) (2012) 3707–3716.

[22] B. Dai, X. Wu, W. Bu, Optic disc segmentation based on variational model with multiple energies, Pattern Recognition 64 (2017) 226–235.

[23] R. Kamble, M. Kokare, G. Deshmukh, F. A. Hussin, F. Mériaudeau, Localization of optic disc and fovea in retinal images using intensity based line scanning analysis, Computers in Biology and Medicine 87 (1) (2017) 382–396.

[24] R. J. Radke, S. Andra, O. Al-Kofahi, B. Roysam, Image change detection algorithms: a systematic survey, IEEE Transactions on Image Processing 14 (3) (2005) 294–307.

[25] X. Wang, X. Jiang, J. Ren, Blood vessel segmentation from fundus image by a cascade classification framework, Pattern Recognition 88 (2019) 331–341.

[26] A. Telea, An image inpainting technique based on the fast marching method, Journal of Graphics Tools 9 (1) (2003) 25–36.

[27] J. Mairal, F. Bach, J. Ponce, G. Sapiro, Online learning for matrix factorization and sparse coding, Journal of Machine Learning Research 11 (1) (2010) 19–60.

[28] Y. Yuan, X. Feng, X. Lu, Structured dictionary learning for abnormal event detection in crowded scenes, Pattern Recognition 73 (2018) 99–110.

[29] M. Elad, Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing, Springer, New York, 2010.

[30] Z. Lu, L. Wang, Noise-robust semi-supervised learning via fast sparse coding, Pattern Recognition 48 (2) (2015) 605–612.

[31] Y. Fu, C. Wang, Y. Wang, B. Chen, Q. Peng, L. Wang, Automatic detection of longitudinal changes for retinal fundus images based on low-rank decomposition, Journal of Medical Imaging and Health Informatics 8 (2) (2018) 284–294.

[32] E. J. Candès, X. Li, Y. Ma, J. Wright, Robust principal component analysis?, Journal of the ACM 58 (3) (2011) 11–48.

[33] H. Yong, D. Meng, W. Zuo, L. Zhang, Robust online matrix factorization for dynamic background subtraction, IEEE Transactions on Pattern Analysis and Machine Intelligence 40 (7) (2017) 1726–1740.

[34] Y. W. Wen, R. H. Chan, A. M. Yip, A primal-dual method for total-variation-based wavelet domain inpainting, IEEE Transactions on Image Processing 21 (1) (2011) 106–114.

[35] T. Walter, J.-C. Klein, P. Massin, A. Erginay, A contribution of image processing to the diagnosis of diabetic retinopathy-detection of exudates in color fundus images of the human retina, IEEE Transactions on Medical Imaging 21 (10) (2002) 1236–1243.

[36] Q. Zhao, D. Meng, Z. Xu, W. Zuo, L. Zhang, Robust principal component analysis with complex noise, in: International Conference on Machine Learning, Beijing, China, 2014, pp. 55–63.

[37] D. Meng, Z. Xu, L. Zhang, J. Zhao, A cyclic weighted median method for l1 low-rank matrix factorization with missing entries, in: AAAI, AAAI Press, Bellevue, Washington, 2013, pp. 704–710.

[38] X. Cao, Y. Chen, Q. Zhao, D. Meng, Y. Wang, D. Wang, Z. Xu, Low-rank matrix factorization under general mixture noise distributions, in: IEEE International Conference on Computer Vision, IEEE, Santiago, Chile, 2015, pp. 1493–1501.

Benzhi Chen received the B.S. degree in Mechanical Manufacturing and Automation from Xiangfan University in 2010, the M.S. degree in Mechanical Engineering from Hunan University in 2013, and the Ph.D. degree in Image Processing and Pattern Recognition from Shanghai Jiao Tong University in 2018. His main interests include medical image analysis, computer vision and machine learning.

Lisheng Wang received the M.S. degree in mathematics and the Ph.D. degree in electronic and information engineering from Xi'an Jiaotong University, China, in 1993 and 1999, respectively. In 2003, he joined the Department of Automation, Shanghai Jiao Tong University, China, where he is now a Professor. His research interests include the analysis and visualization of 3D biomedical images, computer-aided imaging diagnosis, and surgery planning.

Xiuying Wang received the Ph.D. degree in computer science from The University of Sydney, Camperdown, NSW, Australia. She is currently an Associate Professor and Associate Director of the Multimedia Lab, School of Information Technologies, The University of Sydney. Her research interests include biomedical data computing and visual analytics, and biomedical image registration, identification, clustering and segmentation.

Jian Sun received the B.S. degree from the University of Electronic Science and Technology of China in 2003 and the Ph.D. degree in applied mathematics from Xi'an Jiaotong University in 2009. He worked as a visiting student at Microsoft Research Asia from November 2005 to March 2008, as a postdoctoral researcher at the University of Central Florida from August 2009 to April 2010, and as a postdoctoral researcher in the Willow project team of École Normale Supérieure de Paris and INRIA from September 2012 to August 2014. He now serves as a Professor in the School of Mathematics and Statistics of Xi'an Jiaotong University. His current research interests are image processing, medical image analysis and machine learning.

Yijie Huang received the B.S. degree in Automation from Huazhong University of Science and Technology in 2015. He is now a Ph.D. candidate in Image Processing and Pattern Recognition at Shanghai Jiao Tong University. His main interests include deep learning and medical image analysis.

David Feng received the Ph.D. degree in computer science from the University of California, Los Angeles, CA, USA, in 1988. He is currently the Director (Research) of the Institute of Biomedical Engineering & Technology, and an Academic Director of the USYD-SJTU Joint Research Alliance. He has been the Head of the School of Information Technologies, Faculty of Engineering and Information Technologies, and an Associate Dean of the Faculty of Science, University of Sydney, Camperdown, NSW, Australia. He has been a Chair Professor, Advisory Professor, Guest Professor, Adjunct Professor, or Chief Scientist at different world-renowned universities and institutes. He is the Founder and Director of the Biomedical & Multimedia Information Technology Research Group at the University of Sydney. Dr. Feng has served as Chair or Editor of various committees and key journals. He has been elected a Fellow of ACS (Australia), HKIE (Hong Kong), IET (UK), IEEE (USA), and the Australian Academy of Technological Sciences and Engineering.

Zongben Xu received the M.S. degree and Ph.D. degree in mathematics from Xi'an Jiaotong University, China, in 1981 and 1987, respectively. He has been with Xi'an Jiaotong University since 1982, and was promoted to Professor in 1991. He is an Academician of the Chinese Academy of Sciences. His current research interests include intelligent information processing, computational vision and machine learning.