Computer-aided gastrointestinal hemorrhage detection in wireless capsule endoscopy videos


Accepted Manuscript

Title: Computer-Aided Gastrointestinal Hemorrhage Detection in Wireless Capsule Endoscopy Videos
Author: Ahnaf Rashik Hassan, Mohammad Ariful Haque
PII: S0169-2607(15)00232-1
DOI: http://dx.doi.org/doi:10.1016/j.cmpb.2015.09.005
Reference: COMM 3974
To appear in: Computer Methods and Programs in Biomedicine
Received date: 5-5-2015
Revised date: 27-8-2015
Accepted date: 1-9-2015

Please cite this article as: Ahnaf Rashik Hassan, Mohammad Ariful Haque, Computer-Aided Gastrointestinal Hemorrhage Detection in Wireless Capsule Endoscopy Videos, Computer Methods and Programs in Biomedicine (2015), http://dx.doi.org/10.1016/j.cmpb.2015.09.005 This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


Ahnaf Rashik Hassan, Mohammad Ariful Haque∗


Department of Electrical and Electronic Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh.

∗ Corresponding Author: Mohammad Ariful Haque, Phone: +8801864977369, Email: [email protected]

Preprint submitted to Computer Methods and Programs in Biomedicine, July 22, 2015


Abstract

Background and objective: Wireless Capsule Endoscopy (WCE) can image the portions of the human gastrointestinal tract that were previously unreachable for conventional endoscopy examinations. A major drawback of this technology is that a large volume of data must be analyzed to detect a disease, which can be time-consuming and burdensome for clinicians. Consequently, there is a pressing need for computer-aided disease detection schemes to assist clinicians. In this paper, we propose a real-time, computationally efficient and effective computerized bleeding detection technique applicable to WCE technology.

Methods: The development of our proposed technique is based on the observation that characteristic patterns appear in the frequency spectrum of the WCE frames due to the presence of a bleeding region. Exploiting these discriminating patterns, we develop a texture-feature-descriptor-based algorithm that operates on the Normalized Gray Level Co-occurrence Matrix (NGLCM) of the magnitude spectrum of the images. A new local texture descriptor called difference average that operates on the NGLCM is also proposed. We also perform statistical validation of the proposed scheme.

Results: The proposed algorithm was evaluated using a publicly available WCE database. The training set consisted of 600 bleeding and 600 non-bleeding frames. This set was used to train the SVM classifier. The test set consisted of 860 bleeding and 860 non-bleeding images selected from the rest of the extracted images. The accuracy, sensitivity and specificity obtained from our method are 99.19%, 99.41% and 98.95% respectively, which are significantly higher than those of state-of-the-art methods. In addition, the low computational cost of our method makes it suitable for real-time implementation.

Conclusion: This work proposes a bleeding detection algorithm that employs textural features from the magnitude spectrum of the WCE images. Experimental outcomes backed by statistical validation show that the proposed algorithm is superior to the existing ones in terms of accuracy, sensitivity, specificity and computational cost.


Keywords: Wireless Capsule Endoscopy (WCE), Bleeding Detection, Support Vector Machine, Normalized Gray Level Co-occurrence Matrix.


1. Introduction

The classical endoscopic procedure has enabled clinicians to investigate the human gastrointestinal (GI) tract. Despite being efficacious for the upper part (duodenum, stomach and food pipe) and the lower part (colon and terminal ileum) of the GI tract, traditional endoscopy fails to examine the small intestine. The human small intestine is about 8 meters long, and conventional endoscopy such as colonoscopy or esophagogastroduodenoscopy cannot image it satisfactorily. To overcome the limitations of traditional endoscopy, G. Iddan et al. [1] pioneered the invention of wireless capsule endoscopy (WCE). The WCE system consists of a pill-shaped capsule. The capsule has a built-in video camera, video signal transmitter, light-emitting diode and a battery. It is swallowed by the patient and is propelled forward by peristalsis of the human GI tract. It records images as it moves along the GI tract and transmits them at the same time using radio frequency. It transmits over the course of about 8 hours until its battery runs out. Due to its promising performance for the visualization of the human GI tract, the U.S. Food and Drug Administration (FDA) approved it in 2001 [2].

1.1. Problem Description

Manual classification of bleeding and non-bleeding endoscopic video frames has a number of limitations. The limited power supply of the capsule results in low-resolution (576 × 576) endoscopic video frames. The video frame rate is also low (2 frames/second). Besides, about 60,000 images have to be inspected per examination. This takes an experienced clinician about 2 hours, which may not be practical in most clinical scenarios. Since the evaluation process is time-consuming and a large volume of images has to be inspected, bleeding detection becomes more subject to human error. So computer-aided detection (CAD) of bleeding frames can make this monumental task easier for clinicians.

Given Imaging Ltd. [3] designed a software tool called Suspected Blood Indicator (SBI) for automatic detection of bleeding frames. But SBI demonstrates poor sensitivity and specificity and often fails to detect any kind of bleeding other than that of the small intestine [4]. The software designed by Given Imaging Ltd. allows the physician to view two consecutive frames at the same time. But due to the low frame rate, two consecutive frames may not contain the area of interest. Consequently, the clinician has to toggle between images, making the evaluation process even more onerous and time-consuming. All the aforementioned problems of manual screening can be eliminated by the use of CAD.


1.2. Related Work

The previous works on GI hemorrhage detection can roughly be classified as color based, texture based, and combined color and texture based methods. Color based methods [5] [6] [7] [8] basically exploit the ratios of the intensity values of the images in the RGB or HSI domain. Texture based approaches attempt to utilize the textural content of bleeding and non-bleeding images to perform classification [9] [10] [11]. It has been reported that the combination of color and texture descriptors exhibits good performance in terms of accuracy [12]. Again, depending on the region of operation, the CAD bleeding and tumor detection literature can be categorized into three groups: whole image based [12] [13] [14] [15], pixel based [5] [6] [16] and patch based methods [9] [17], as in [7]. Whole image based methods are fast but often fail to detect small bleeding regions. Pixel based methods have to operate on each pixel of the image to generate the feature vectors. As a result, they are computationally very expensive. It can be expected that patch based methods will achieve good accuracy while keeping the computational cost low. However, patch based methods show high sensitivity but low specificity and accuracy. Besides, informative patches need to be identified manually by a clinician, which hinders the idea of making the whole process automatic.

B. Li and Q. Meng [9] put forward a chrominance moment and Uniform Local Binary Pattern (ULBP) based solution to bleeding detection. Yanan Fu et al. [7] came up with a super-pixel and red ratio based solution that was promising in terms of accuracy. But it was reported that this method has a high computational cost and fails to detect images with poor illumination and minor angiodysplasia regions whose hue is similar to normal tissue. Hwang et al. [16] utilized the Expectation Maximization clustering algorithm for CAD of bleeding frames. Some prior works [10] [11] employed MPEG-7 based visual descriptors to identify medical events such as blood, ulcer and Crohn’s disease lesions. Pan et al. [5] formed a 6-D feature vector using R, G, B, H, S, I values and used a probabilistic neural network (PNN) as classifier. Liu et al. [6] proposed Raw, Ratio and Histogram feature vectors, which are basically the intensity values of the image pixels, and used a support vector machine (SVM) to detect GI bleeding. Hegenbart et al. [18] utilized scale invariant wavelet based texture features to detect Celiac disease in endoscopic videos. Using MPEG-7 based visual descriptors, Bayesian and SVM classifiers, Cunha et al. [19] segmented the GI tract into four major topographic areas and performed image classification. For a more comprehensive review of computer-aided decision support systems for WCE videos, [20] can be consulted.

1.3. Our Method

In this work, we aim to draw inferences (bleeding or non-bleeding) on the spatial domain of an image by extracting features in the frequency domain. Fig. 1 depicts the steps of the proposed scheme. At first, we compute the Discrete Fourier Transform (DFT) of the endoscopic video frames. Afterwards, we take the log transform of the magnitude spectrum of the frames. A Normalized Gray Level Co-occurrence Matrix (NGLCM) is then constructed from each log transformed magnitude spectrum, and the selected features are computed from it. The features are then fed into a support vector machine classifier to perform classification of the frames.

Figure 1: Block diagram of the proposed framework.

There are significant distinctions between the proposed approach and previous studies on bleeding detection in the literature. It also has some advantages. Both are described as follows.

• To the best of the authors’ knowledge, none of the existing works in the literature on GI bleeding detection attempts to solve the problem in the frequency domain. However, spectral texture descriptors have been used for other image classification problems such as texture classification [21] [22], remote sensing image classification [23], tumor recognition in colonoscopy images [24] etc.

• Most of the state-of-the-art works use either all [5] [7] [8] or any two [6] of the R, G and B channels. An advantage of this method lies in the fact that it uses only one channel; any one of the three channels can be used.

• In this work we propose ‘difference average’, a new feature that can be computed from the NGLCM. The experimental results of this feature are promising.

• Our algorithm shows promising performance in correctly classifying frames with tiny bleeding regions.

• The proposed scheme requires fewer features than many of the existing methods such as [15] [17].

• Lastly, we conduct our experiments on a large data-set. This supports the reliability and effectiveness of our algorithm in practical scenarios, because an algorithm that only works well on a small data-set cannot be assumed to work in real-world implementations.

One major concern of this approach may be the computational cost associated with the computation of the DFT of the images. But with the development and implementation of fast Fourier transform (FFT) algorithms, one can expect that this approach will not be computationally costly. Besides, the method is non-iterative. These expectations are later supported by experimental results that show the proposed method is indeed computationally inexpensive. The proposed algorithm outperforms the state-of-the-art methods implemented on the same data-set in accuracy, sensitivity and specificity.

Computer-aided GI bleeding detection is a machine learning problem and has three basic steps: feature generation or extraction, feature selection and classification. Hence, the rest of the article is organized as follows. Section 2 expounds the feature generation part of our algorithm, intuitively describes the reason behind the choice of the proposed feature extraction scheme, provides the mathematical formulation of the selected features and introduces the new feature we propose in this work. In Section 3, we show through hypothesis testing that the differences of the chosen features between the two classes are statistically significant. Section 4 describes the classifier we choose in the proposed method. Section 5 provides the details of the experiments conducted to demonstrate the efficacy of our algorithm, presents the experimental results and explains their significance. Finally, Section 6 presents how this work can be extended further and Section 7 concludes the article.

2. Feature Extraction in the Fourier Domain

We begin this section with a brief review of spectral estimation using the DFT. The reason for choosing the DFT based texture descriptor is then expounded. We then discuss the construction of the NGLCM and provide the mathematical expressions of the features of our algorithm.

2.1. Spectral Estimation using the DFT

At first we compute the Fourier spectrum of each of the endoscopic video frames. The 2-D Discrete Fourier Transform of a WCE image f(x, y) of size M × N can be expressed as:

$$F(p, q) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, e^{-j 2\pi \left( \frac{px}{M} + \frac{qy}{N} \right)} \tag{1}$$

where p = 0, 1, 2, ..., M − 1 and q = 0, 1, 2, ..., N − 1. The frequency spectrum of an image can be obtained from the absolute value of its DFT.

The frequency spectrum of an image is a measure of its frequency distribution, which can generate patterns depending on the content of the image in the spatial domain. In general, high frequency components indicate sharp transitions of intensities and low frequency components indicate intensities with a slow rate of change. In bleeding images, there will be sharp transitions of intensities from bleeding regions to their neighboring non-bleeding regions. These transitions will be absent in the non-bleeding frames. This is precisely why the magnitude spectrum can be useful for this particular application.

Furthermore, a log transformation is applied with a view to reducing computational cost. An important trait of the log transformation is that it compresses the dynamic range of images with large variations in pixel intensities [25]. The Fourier spectrum of an image has a very large dynamic range; magnitude spectrum values may range from 0 to as large as 10^7 or even higher. The log transformation ensures that the maximum intensity values are not too high, so that we can have a GLCM of manageable size. For instance, if the highest value in the Fourier spectrum were 10^7 and no log transformation were applied, the size of each GLCM would be 10^7 × 10^7, making the algorithm computationally very expensive.

Figure 2: Left two columns: typical non-bleeding WCE frames (top) and the corresponding magnitude spectra (bottom). Right two columns: typical bleeding WCE frames (top) and the corresponding magnitude spectra (bottom). Note the non-horizontal and non-vertical lines appearing in the bleeding spectra. The proposed method aims to quantify these characteristics to generate discriminating features for the supervised learning algorithm.
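As a rough illustration of the spectral estimation step of Section 2.1, the sketch below computes a log-compressed magnitude spectrum with NumPy. The function name `log_magnitude_spectrum`, the use of log(1 + |F|), the centering of the zero frequency and the quantization to 256 gray levels are assumptions made for illustration; the text does not specify these details, and the random frame only stands in for one color channel of a WCE image.

```python
import numpy as np

def log_magnitude_spectrum(channel, levels=256):
    """Log-compressed magnitude spectrum, quantized to `levels` gray levels."""
    f = np.fft.fft2(np.asarray(channel, dtype=np.float64))  # 2-D DFT, Eq. (1)
    magnitude = np.abs(np.fft.fftshift(f))                  # centre the zero frequency
    log_mag = np.log1p(magnitude)                           # log(1 + |F|) compresses the range
    peak = log_mag.max()
    if peak == 0:                                           # all-zero input: nothing to scale
        return np.zeros(log_mag.shape, dtype=np.int64)
    # Assumed quantization: map [0, peak] linearly onto the integers 0..levels-1.
    return np.round(log_mag / peak * (levels - 1)).astype(np.int64)

# Hypothetical stand-in for one channel of a frame (random data, not WCE data).
rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(64, 64))
spectrum = log_magnitude_spectrum(frame)
raw_peak = np.abs(np.fft.fft2(frame)).max()                 # un-compressed peak magnitude
```

With 256 levels the co-occurrence matrix built from `spectrum` is 256 × 256, whereas the raw peak magnitude of even this small frame is far larger than 255, which is the size problem the log transformation is meant to avoid.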

2.2. Why DFT-Based Texture Descriptor

It is known that in signal classification, if the transform is suitably chosen, transform domain features may exhibit more useful classification-related information than the original signal. Keeping that in mind, this work seeks to forge a connection between the spatial and the frequency domain of an image to generate meaningful feature descriptors and classify bleeding and non-bleeding frames. It is observed from numerous visual inspections that the frequency spectrum of bleeding images tends to show straight lines near or along the diagonal directions. However, these lines are absent in the non-bleeding frequency spectrum. This observation is illustrated in Fig. 2. The right two columns of Fig. 2 show typical bleeding WCE frames, where spectral lines are generated in the ±45° directions of the magnitude spectrum. No such patterns appear in non-bleeding frames, as we can see from the left two columns of Fig. 2. This phenomenon can be exploited to generate features for the machine learning algorithm.

To capture the above mentioned visual observations mathematically, the most appropriate strategy is to use texture feature descriptors. Texture is essential for both human visual perception and image analysis [26]. However, texture does not have any widely agreed definition. Portilla and Simoncelli [27] provide a general definition with which many researchers agree:

“Loosely speaking, texture images are spatially homogeneous and consist of repeated elements, often subject to some randomization in their location, size, color, orientation etc.”

Since texture features are widely used to detect various texture patterns in images, they are the most appropriate descriptors in the proposed frequency domain based framework. Texture measures are used to quantify the observed visual differences between bleeding and non-bleeding spectra, and these measures are utilized to train the classifier.

2.3. Local vs. Global Texture Descriptor

After computing the magnitude spectrum of the endoscopic frames, we extract textural features from them. In our work, we employ both global and local feature descriptors. Global texture features are simple and widely used in the image classification literature to measure the overall textural content of the image. Global features are computed considering the image as a whole, whereas local feature descriptors operate on a small region or a few pixels. Since in the GI hemorrhage detection problem the bleeding regions may be small and localized, local feature descriptors are very promising for achieving classification results with high accuracy. Global features are measures of the distribution of intensities, but they carry no information regarding the relative position of pixels with respect to each other [25]. Considering these pros and cons, both global and local feature descriptors are utilized to devise the algorithm. Entropy of the magnitude spectrum is used as the global feature descriptor in this study. Contrast, Sum Entropy, Sum Variance, Difference Variance and Difference Average, which operate on the NGLCM, are used to capture local textural information. In other words, our feature set consists of one global feature and five local features.


2.4. NGLCM and Haralick Features

In order to capture local textural information from the spectrum of the images, we employ features from the NGLCM of the magnitude spectrum of the WCE frames. The Gray Level Co-occurrence Matrix (GLCM) is an L × L matrix computed from the input image, where L is the number of gray levels of the image. Fig. 3 illustrates the construction of the GLCM from an image. If two consecutive pixels of the input image have pixel values i and j, then the (i, j)th element of the GLCM is incremented. This operation is done for every pair of pixels in the image. In this way, the GLCM is formed. The position operator P governs how these pixels are related to each other. The effect of P on the detection performance will be discussed later in this paper.

Figure 3: Construction of the Gray Level Co-occurrence Matrix. (a) Different choices of the position operator P. (b) An image. (c) GLCM of the image. For 0°, the values of the blue-marked pixel and its immediate right neighbor are noted, and the first row and third column element of the GLCM is incremented. In other words, the value 3 in the first row and third column of the GLCM indicates that pixel values 1 and 3 occur consecutively three times in the whole image. Image courtesy of [28].

In this work, the performance of the Normalized Gray Level Co-occurrence Matrix on the frequency spectrum of WCE images is inspected. The NGLCM can be constructed from the GLCM using the following relation:

$$N(i, j) = \frac{G(i, j)}{R} \tag{2}$$

where G(i, j) is the (i, j)th element of the GLCM and R is the total number of pixel pairs in the GLCM. In essence, the NGLCM maps the image to a matrix that indicates the probability of occurrence of two consecutive pixel values. This implies that the NGLCM carries local textural information of the image to be extracted by the features. Besides, NGLCM based texture features have been widely used in various applications for texture analysis and image classification. These two factors motivated the use of the NGLCM in the proposed method.

Various statistical measures such as mean, moment, entropy etc. are used as global texture descriptors [25] to measure the overall textural content of the image. These features operate on the entire image. After conducting repeated experiments, it has been found that the entropy of the frequency spectrum demonstrates good performance as a global texture descriptor. Entropy of the frequency spectrum (En) is defined as:

$$En = -\sum_{i=0}^{L-1} H(z_i) \log_2 [H(z_i)] \tag{3}$$

where H(z_i) is the normalized histogram and L is the number of gray levels of the frequency spectrum. It measures the randomness of the pixel values of the magnitude spectrum.
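The construction of the NGLCM in Eq. (2) and the global entropy in Eq. (3) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the 0-based gray levels (the text indexes them from 1), the non-symmetric counting and the default offset of one pixel to the right (the 0° position operator) are all assumptions.

```python
import numpy as np

def nglcm(image, levels, offset=(0, 1)):
    """Normalized GLCM of an integer image with values in [0, levels)."""
    img = np.asarray(image)
    dr, dc = offset                                   # position operator: (row step, column step)
    rows, cols = img.shape
    # Reference pixels and their neighbours at the given offset, without wrap-around.
    r0, r1 = max(0, -dr), min(rows, rows - dr)
    c0, c1 = max(0, -dc), min(cols, cols - dc)
    ref = img[r0:r1, c0:c1]
    nbr = img[r0 + dr:r1 + dr, c0 + dc:c1 + dc]
    glcm = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(glcm, (ref.ravel(), nbr.ravel()), 1)    # G(i, j): count each (i, j) pair
    return glcm / glcm.sum()                          # Eq. (2): N = G / R

def entropy(image, levels):
    """Entropy of the normalized histogram, Eq. (3)."""
    hist = np.bincount(np.asarray(image).ravel(), minlength=levels).astype(np.float64)
    h = hist / hist.sum()
    h = h[h > 0]                                      # treat 0 * log2(0) as 0
    return float(-(h * np.log2(h)).sum())

# Tiny made-up image with three gray levels (0, 1, 2).
image = np.array([[0, 0, 1],
                  [1, 2, 2],
                  [0, 1, 2]])
N = nglcm(image, levels=3)
En = entropy(image, levels=3)
```

In this toy image there are six horizontal pixel pairs, two of which are (0, 1), so `N[0, 1]` is 1/3; each gray level occurs three times, so the entropy equals log2(3).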

Haralick et al. [29] first proposed 14 features to perform GLCM based texture analysis. Although not all of these features are widely used, all of them were examined, and it has been found that four of them yield good algorithmic performance. These features are:

1. Contrast (Con):

$$Con = \sum_{i} \sum_{j} |i - j|^2 N(i, j) \tag{4}$$

Con measures the contrast in gray levels among neighboring pixels. Its value ranges from zero (for a constant image) to (L − 1)^2.

2. Sum Entropy (SE):

$$SE = -\sum_{i=2}^{2L} P_{x+y}(i) \log[P_{x+y}(i)] \tag{5}$$

3. Sum Variance (SV):

$$SV = \sum_{i=2}^{2L} (i - SE)^2 P_{x+y}(i) \tag{6}$$

SV is a measure of variability of the elements of the NGLCM with respect to SE.

4. Difference Variance (DV):

$$DV = \mathrm{Variance}[P_{x-y}] \tag{7}$$

where P_{x+y} and P_{x-y} are defined as follows:

$$P_{x+y}(k) = \sum_{i=1}^{L} \sum_{j=1}^{L} N(i, j), \quad i + j = k, \quad k = 2, 3, 4, \ldots, 2L \tag{8}$$

$$P_{x-y}(k) = \sum_{i=1}^{L} \sum_{j=1}^{L} N(i, j), \quad |i - j| = k, \quad k = 0, 1, 2, \ldots, L - 1 \tag{9}$$
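The four Haralick features of Eqs. (4)-(7) can be computed from an NGLCM roughly as sketched below. The function names are hypothetical, the sum entropy uses the sign convention of Haralick et al. so that it is non-negative, and Eq. (7) is read here as the variance of the gray-level difference k under the distribution P_{x-y}; the text does not spell out that last point, so it is an assumption.

```python
import numpy as np

def sum_and_difference_distributions(n):
    """P_{x+y} and P_{x-y} of Eqs. (8)-(9) for an L x L NGLCM `n`."""
    levels = n.shape[0]
    i, j = np.indices(n.shape) + 1                    # 1-based gray levels, as in the text
    p_sum = np.bincount((i + j).ravel(), weights=n.ravel(), minlength=2 * levels + 1)
    p_diff = np.bincount(np.abs(i - j).ravel(), weights=n.ravel(), minlength=levels)
    return p_sum, p_diff                              # p_sum[k] for k = 0..2L; entries k < 2 stay 0

def haralick_subset(n):
    """Contrast, sum entropy, sum variance and difference variance of an NGLCM."""
    n = np.asarray(n, dtype=np.float64)
    i, j = np.indices(n.shape) + 1
    p_sum, p_diff = sum_and_difference_distributions(n)
    k_sum = np.arange(p_sum.size)
    k_diff = np.arange(p_diff.size)
    contrast = float(((i - j) ** 2 * n).sum())                          # Eq. (4)
    nz = p_sum > 0                                                      # skip 0 * log(0) terms
    sum_entropy = float(-(p_sum[nz] * np.log(p_sum[nz])).sum())         # Eq. (5)
    sum_variance = float(((k_sum - sum_entropy) ** 2 * p_sum).sum())    # Eq. (6)
    mean_diff = (k_diff * p_diff).sum()
    diff_variance = float(((k_diff - mean_diff) ** 2 * p_diff).sum())   # Eq. (7), assumed reading
    return {"Con": contrast, "SE": sum_entropy, "SV": sum_variance, "DV": diff_variance}

# Made-up 3 x 3 NGLCM whose entries sum to 1.
example = np.array([[1 / 6, 1 / 3, 0.0],
                    [0.0,   0.0,   1 / 3],
                    [0.0,   0.0,   1 / 6]])
features = haralick_subset(example)
p_sum, p_diff = sum_and_difference_distributions(example)
```

For this example two thirds of the probability mass lies one step off the diagonal, so the contrast is 2/3 and the difference variance is 2/9.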

2.5. Proposed New Feature Descriptor

Here, a new local textural feature called Difference Average (DA), which operates on the NGLCM, is proposed. DA is expressed as

$$DA = \sum_{i=0}^{L-1} i\, P_{x-y}(i). \tag{10}$$

It is evident that DA is the mean of the distribution P_{x-y}(k). It expresses the mean value of the pixel differences throughout the entire NGLCM, considering the pixel difference to be a random variable. It was envisioned that this feature would give an idea about the expected pixel difference value of the NGLCM. This information can be valuable in the context of texture classification and its applications such as computer-aided diagnosis. Experimental results show that the proposed feature takes discriminating values for bleeding and non-bleeding frames, which indicates that DA can be used in texture classification and other similar applications where the rest of the GLCM based texture features are used.
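A short sketch of Eq. (10) is given below. It computes DA from the difference distribution and also as the NGLCM-weighted mean of |i − j|, which is the same quantity written without the intermediate histogram. The function name and the example matrix are hypothetical.

```python
import numpy as np

def difference_average(n):
    """DA of Eq. (10), computed two equivalent ways from an L x L NGLCM."""
    n = np.asarray(n, dtype=np.float64)
    i, j = np.indices(n.shape)
    diff = np.abs(i - j)                               # |i - j| is the same for 0- or 1-based levels
    p_diff = np.bincount(diff.ravel(), weights=n.ravel(), minlength=n.shape[0])
    via_histogram = float((np.arange(p_diff.size) * p_diff).sum())   # sum_k k * P_{x-y}(k)
    direct = float((diff * n).sum())                   # sum_i sum_j |i - j| * N(i, j)
    return via_histogram, direct

# Made-up 3 x 3 NGLCM whose entries sum to 1.
example = np.array([[1 / 6, 1 / 3, 0.0],
                    [0.0,   0.0,   1 / 3],
                    [0.0,   0.0,   1 / 6]])
da, da_direct = difference_average(example)
```

Here one third of the pixel pairs have equal gray levels and two thirds differ by one level, so both expressions give DA = 2/3.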

3. Feature Selection

Here we explicate how we choose effective features to perform the classification in the proposed framework.

Statistical Validation on Training Dataset

At this point we are faced with two questions. Firstly, how do we choose a set of features from the fourteen Haralick features that have the best discriminatory capability? Secondly, how do we make sure that the discriminatory capability of the selected set of features is statistically significant? Statistical hypothesis testing is the solution to both of the above problems.

Although most of the previous papers demonstrate good levels of accuracy, they omit the feature selection stage. In other words, they do not provide any statistical justification for their feature generation stage. A method that uses a set of selected features without testing for statistical significance can have serious repercussions. Firstly, it does not say whether the method is actually robust and invariant to the data-set. Secondly, it remains unknown whether the discriminatory capability of the features is statistically significant or not. Therefore, statistical hypothesis testing in the context of any signal classification problem is of paramount importance to find out whether the extracted features are informative enough.


Table 1: Mean, Standard Deviation and p-values of the Extracted Features

Channel  Feature  Bleeding               Non-Bleeding           p-values
                  Mean        SD         Mean        SD         (α = .05)
R        En       .000236     .000299    .000335     .000412    .0452
R        Con      10825.3     1476.04    11326.12    1534.78    .008
R        SE       .000086     .000004    .000058     .000005    .008
R        SV       92.01       2.692      92.54       2.23       .039
R        DV       .28         .17        .25         .11        .045
R        DA       .0012       .00034     .00115      .00024     .037
G        En       .000264     .000344    .000345     .000417    .012
G        Con      11511.34    1657.94    12070.42    1556.02    .004
G        SE       .000088     .000007    .000059     .000005    .007
G        SV       92.003      2.78       92.61       2.47       .039
G        DV       .3083       .193       .264        .163       .041
G        DA       .001241     .000363    .0011       .000301    .021
B        En       .000241     .000324    .000369     .000429    .008
B        Con      12962.26    2035.25    13748.1     1780.12    .009
B        SE       .000088     .000008    .000062     .000009    .008
B        SV       92.38       3.41       93.05       3.14       .019
B        DV       .32677      .21775     .27886      .2028      .0424
B        DA       .00127      .00041     .00116      .000379    .0384

To assess whether the values of the features in the two classes differ significantly, we perform a one-way analysis of variance (ANOVA). The test is carried out in MATLAB’s Statistics Toolbox at the 95% confidence level. Hence, a difference is statistically significant if p < α (= 0.05). Any feature having a p-value greater than α was discarded. In this way, four out of the fourteen Haralick features are chosen. Table 1 gives the mean, standard deviation (SD) and p-values of the features for bleeding and non-bleeding frames in the R, G and B channels. The global feature En, the four Haralick features and our proposed feature pass the test, as we can see from Table 1. Mean and standard deviation give a rough idea about the feature values of the population, but they do not tell us much about the separability of the descriptors between the two classes. The experimental validation of the efficacy of the selected features will be provided in the experimental results section.
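To make the selection criterion concrete, the sketch below computes the F statistic of a one-way ANOVA for one feature using only the Python standard library. It is an illustration rather than the MATLAB procedure used in the paper: the function name and the sample values are made up, and the p-value, which would be the upper tail of the F distribution with the returned degrees of freedom, is left to a statistics library.

```python
from statistics import fmean

def one_way_anova_f(*groups):
    """F statistic of a one-way ANOVA together with its degrees of freedom."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean(x for g in groups for x in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the samples around their own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical values of a single feature in the two classes (not data from the paper).
bleeding = [1.0, 2.0, 3.0]
non_bleeding = [2.0, 3.0, 4.0]
f_stat, df_between, df_within = one_way_anova_f(bleeding, non_bleeding)
```

A feature would be kept when the p-value associated with `f_stat` falls below α = 0.05 and discarded otherwise.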


4. Support Vector Machine Classifier

Support Vector Machine (SVM) has gained popularity in the last decade due to its widespread application in handwritten digit recognition [30]. SVM [31] is a supervised machine learning algorithm that maps the data into a higher dimensional feature space and finds a separating hyperplane with maximal margin there. The reason for finding a maximum margin hyperplane is that, for a binary classification problem, the data in both classes will have more room on each side of the hyperplane, so the chance of misclassification is reduced. For a set of N labeled training instances $T_r = \{(x_i, y_i) \mid i = 1, 2, \ldots, N\}$, where $x_i \in \mathbb{R}^n$ and $y_i \in \{-1, 1\}$, an unknown test instance $x$ is classified by:

\[ f(x) = \operatorname{sign}\left( \sum_{i=1}^{N} \alpha_i y_i K(x_i, x) + w \right) \tag{11} \]

where $K(x_i, x)$ is the kernel function, $w$ is the bias and $\alpha_i$ is the Lagrange multiplier.
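To make the decision rule of Eq. (11) concrete, here is a minimal Python sketch that evaluates it for an already trained model. The support vectors, multipliers, bias, the +1/-1 class coding and the choice to map a score of exactly zero to +1 are all hypothetical assumptions made for illustration. In the paper the trained model comes from LIBSVM.

```python
def linear_kernel(u, v):
    """Plain dot product, i.e. the linear SVM case."""
    return sum(ui * vi for ui, vi in zip(u, v))


def svm_decide(x, support_vectors, labels, alphas, bias, kernel=linear_kernel):
    """Evaluate f(x) = sign(sum_i alpha_i * y_i * K(x_i, x) + w) as in Eq. (11)."""
    score = sum(a * y * kernel(sv, x)
                for a, y, sv in zip(alphas, labels, support_vectors))
    score += bias
    return 1 if score >= 0 else -1   # a zero score is mapped to +1 (assumption)


# Hypothetical model with two support vectors in a 2-D feature space.
support_vectors = [(1.0, 1.0), (-1.0, -1.0)]
labels = [1, -1]        # +1 = bleeding, -1 = non-bleeding (assumed coding)
alphas = [0.5, 0.5]
bias = 0.0

first = svm_decide((2.0, 0.5), support_vectors, labels, alphas, bias)
second = svm_decide((-0.5, -2.0), support_vectors, labels, alphas, bias)
print(first, second)
```

Only training instances with non-zero $\alpha_i$ contribute to the sum, which is why the sketch iterates over support vectors rather than over the whole training set.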

Some two-class classification problems do not have a simple hyperplane as a useful separating criterion. So besides using linear SVM, we experimented using polynomial and Radial Basis Function (RBF) kernels. The polynomial kernel function K(x, y) can be expressed as:

\[ K(x, y) = (x \cdot y + 1)^d, \quad d > 0 \tag{12} \]

and the radial basis kernel is:

\[ K(x, y) = \exp\left(-\gamma \lVert x - y \rVert^2\right), \quad \gamma > 0 \tag{13} \]

where $\lVert \cdot \rVert$ is the Euclidean L2-norm and γ governs the spread of K.
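The two kernels of Eqs. (12) and (13) can be written down directly. The sketch below is a plain Python rendering under the assumption that feature vectors are simple sequences of floats. The function names are illustrative, and the default values d = 3 and γ = 10 are the settings the paper reports in Section 5.4.

```python
import math


def polynomial_kernel(x, y, d=3):
    """Eq. (12): K(x, y) = (x . y + 1)^d with d > 0."""
    dot = sum(xi * yi for xi, yi in zip(x, y))
    return (dot + 1.0) ** d


def rbf_kernel(x, y, gamma=10.0):
    """Eq. (13): K(x, y) = exp(-gamma * ||x - y||^2) with gamma > 0."""
    squared_distance = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two hypothetical feature vectors.
u = (1.0, 2.0)
v = (3.0, 4.0)
print(polynomial_kernel(u, v), rbf_kernel(u, u))
```

A larger γ makes the RBF kernel decay faster with distance, so each support vector influences a smaller neighbourhood.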

5. Experimental Results and Discussions

Experiments are carried out to measure the effectiveness of the proposed algorithm empirically. This section provides the details of our experiments. We have evaluated our algorithm against four published algorithms. The results, along with analyses and discussions, are presented in this section.

5.1. Experimental Data

More than 3500 WCE images were extracted from 16 bleeding and 16 non-bleeding videos taken from 32 patients. The images had already been labeled by experienced clinicians. Only identical frames were removed, to avoid undesired repetition of images. Non-informative frames contaminated by residual food, turbid fluids, bubbles, specular reflections or fecal materials were not removed, because the experimental set-up must emulate real-world settings. In real-world applications, it is highly unlikely that a clean set of bleeding and non-bleeding images will be available to the clinicians. Therefore, a successful and pragmatic algorithm must be capable of dealing with these frames. So, except for the obviously repetitive frames, all the extracted frames from the 32 videos were used to construct the data-set.

The training set consisted of 600 bleeding and 600 non-bleeding frames taken from 12 different patients (i.e., 6 bleeding and 6 non-bleeding patients). This set was used to train the SVM classifier. The test set, on the other hand, consisted of 860 bleeding and 860 non-bleeding frames taken from the rest of the patients (i.e., 10 bleeding and 10 non-bleeding patients), and was used to evaluate the SVM's classification performance. Our training and test data therefore do not contain images from the same patient. Since the size of our data-set was large, we used a publicly available SVM software package called LIBSVM [32]. To eliminate the effect of the peripheral dark regions visible in Fig. 2, the original PillCam SB2 images of 576 × 576 were resized to 426 × 426.

5.2. Evaluation Criteria

The objective measures used to evaluate the performance of the proposed method are accuracy, sensitivity and specificity. These measures are often used to determine the performance of algorithms in the literature [6] [11] [12]. They can be expressed by the following formulae:

\[ \text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \tag{14} \]

\[ \text{Sensitivity} = \frac{TP}{TP + FN} \tag{15} \]

\[ \text{Specificity} = \frac{TN}{TN + FP} \tag{16} \]

where TP is the number of bleeding frames identified correctly, TN is the number of non-bleeding frames classified correctly, FP is the number of non-bleeding images identified incorrectly as bleeding and FN is the number of bleeding frames misclassified as non-bleeding.
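The three measures of Eqs. (14)-(16) follow directly from the four confusion counts. The Python sketch below shows the arithmetic. The counts are hypothetical and were chosen only so that the class sizes match the 860 bleeding and 860 non-bleeding test frames described in Section 5.1; they are not the authors' confusion matrix.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity) as defined in Eqs. (14)-(16)."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # fraction of bleeding frames that are detected
    specificity = tn / (tn + fp)   # fraction of non-bleeding frames that are cleared
    return accuracy, sensitivity, specificity


# Hypothetical counts: 860 bleeding frames (tp + fn), 860 non-bleeding (tn + fp).
acc, sens, spec = classification_metrics(tp=855, tn=851, fp=9, fn=5)
print(f"accuracy={acc:.4f} sensitivity={sens:.4f} specificity={spec:.4f}")
```

With balanced classes, as in this test set, accuracy is the average of sensitivity and specificity.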

Higher values of sensitivity indicate that the algorithm's capability of detecting bleeding images is high. If the sensitivity is low, the algorithm is likely to miss many of the bleeding frames, the consequence of which may be severe for the patient. On the other hand, high specificity means that the algorithm is successfully detecting non-bleeding frames, reducing the number of false alarms. So for GI hemorrhage detection, sensitivity is more significant than specificity. In general, we expect a CAD algorithm to demonstrate high values of accuracy, sensitivity and specificity.

Urgent clinical cases may demand quick detection, and a computationally expensive algorithm may fail to meet that demand. Therefore, besides having high accuracy, a practically feasible algorithm must be fast. So another measure of performance of any GI bleeding detection algorithm is its time cost. Since the training is done off-line and does not require the clinician, the classifier training time can be ignored. Only the time spent by the classifier to classify a frame should be taken into consideration.

5.3. Efficacy of the Selected Features

Fig. 4 presents the scatter diagrams of the selected features, taking two at a time. The figure clearly shows that the proposed features are separable. So there is a prospect of good classification performance with a linear classifier, which, as we shall see later, turns out to be true. Besides, Fig. 4 agrees with the p-values presented in Table 1 and gives an experimental validation of our feature selection stage.

Figure 4: Scatter Diagrams of Con vs SE (R-channel), Con vs DV (G-channel), SV vs DV (R-channel), SE vs SV (B-channel), En vs SE (G-channel) and Con vs DA (B-channel) show significant variability of the feature values between the two classes.

We conducted more experiments to find out the effectiveness of the selected six features in terms of the standard measures: accuracy, sensitivity and specificity. Table 2 further shows that the proposed set of features is discriminatory, with high values of accuracy, sensitivity and specificity even for features taken one at a time. The high values of the three measures indicate that our feature set captures the textural information we set out to exploit. The results of Table 2 also reflect the findings of the scatter plots in Fig. 4 and speak for the discriminatory capability of the six selected features.

Table 2: Performance of Individual Features

Channel   Measure           SE      SV      DV      DA      En      Con
R         Accuracy (%)      99.01   98.90   97.44   98.26   97.91   98.84
R         Sensitivity (%)   99.19   98.95   97.75   98.37   98.02   98.95
R         Specificity (%)   98.84   98.84   97.13   98.14   97.79   98.72
G         Accuracy (%)      99.07   98.78   98.25   97.91   97.03   97.73
G         Sensitivity (%)   99.42   98.95   98.49   98.13   97.09   97.79
G         Specificity (%)   98.72   98.6    98.02   97.67   96.97   97.67
B         Accuracy (%)      98.49   97.09   98.08   97.33   97.03   97.85
B         Sensitivity (%)   98.72   97.21   98.14   97.56   97.44   98.14
B         Specificity (%)   98.26   96.98   98.02   97.09   96.63   97.56

Table 3: Performance Evaluation for Various Kernel Functions

                   Linear SVM   Polynomial Kernel   RBF Kernel
Accuracy (%)       99.19        98.34               96.05
Sensitivity (%)    99.41        98.49               96.22
Specificity (%)    98.95        98.2                95.87

5.4. Choice of Kernel Function and the Position Operator P

Table 3 presents the performance of the proposed method for different kernel functions. In our experiments, the order d of the polynomial kernel has been set to 3 and γ of the RBF kernel has been set to 10. Linear SVM exhibits 99.19% accuracy, the highest of the three. This result corroborates our earlier assumption that a linear classifier works well for this particular choice of feature descriptors. For further simulations, we use the linear SVM classifier.

As stated earlier in Section 2, the choice of P affects the overall performance of the algorithm. What, then, is the best choice of P to achieve the highest possible accuracy? Further experiments were conducted to answer this question. Table 4 indicates how performance varies with P. Here, 0° denotes one pixel to the right, 90° denotes one pixel above and 45° denotes the first pixel from the center pixel in the direction of the line bisecting 0° and 90°. 135° refers to the pixel that lies at an angle of 45° from the 90° direction, on the side opposite to the 45° pixel. In Table 4, P = 0° gives the highest accuracy, sensitivity and specificity. So a pixel and the pixel immediately to its right is found to be the best choice of P for the proposed GI hemorrhage detection algorithm.

Table 4: Effect of P on Classification Performance

Angle (Degrees)   Accuracy (%)   Sensitivity (%)   Specificity (%)
0                 99.19          99.41             98.95
45                95             94.71             95.29
90                96.86          97.03             96.63
135               95.32          96.4              94.24
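The four choices of P compared in Table 4 correspond to four pixel offsets. The following Python sketch builds a gray-level co-occurrence matrix for a chosen angle and normalizes it into an NGLCM. It is a minimal illustration under several assumptions: the input is already quantized to a small number of integer gray levels, the matrix is not symmetrized, 135° is taken as the upper-left neighbour (the usual GLCM convention), and all function names are hypothetical. In the paper the matrix is computed in the frequency domain rather than on raw pixels.

```python
# (row offset, column offset) for each angle; row indices grow downward.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def cooccurrence_matrix(image, levels, angle=0):
    """Count pairs (value at (r, c), value at its neighbour under P)."""
    d_row, d_col = OFFSETS[angle]
    rows, cols = len(image), len(image[0])
    glcm = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            nr, nc = r + d_row, c + d_col
            if 0 <= nr < rows and 0 <= nc < cols:   # skip pairs leaving the image
                glcm[image[r][c]][image[nr][nc]] += 1
    return glcm


def normalize(glcm):
    """Divide by the total pair count so that the entries sum to 1 (NGLCM)."""
    total = sum(sum(row) for row in glcm)
    return [[count / total for count in row] for row in glcm]


# Tiny 3x3 array with 3 gray levels (illustrative only).
image = [[0, 0, 1],
         [1, 2, 2],
         [0, 1, 2]]
glcm_0 = cooccurrence_matrix(image, levels=3, angle=0)
nglcm_0 = normalize(glcm_0)
print(glcm_0)
```

Changing `angle` changes which neighbour is paired with each pixel, which is the only difference between the four rows of Table 4.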

5.5. Performance Comparison

We denote the superpixel based method [7] by SP, the Chrominance Moment and ULBP based method [9] by CMLBP, the Raw, Histogram and Ratio feature based method [6] by RHR and the Probabilistic Neural Network based method [5] by PNN. The method proposed in this article is denoted by DFTNGLCM. Fig. 5 shows the accuracy, sensitivity and specificity of the proposed technique against SP, CMLBP, PNN and RHR. Here, DFTNGLCM-R means that only the R-channel was used, DFTNGLCM-G implies that only the G-channel was used and DFTNGLCM-B means that only the B-channel was used to perform classification.

While implementing the methods for the performance comparison, the parameters were chosen such that each implementation produces its best accuracy. For CMLBP the simulations were done using 5, 10, 15, 20, 25 and 30 neurons. We utilized a publicly available LBP package [33] to simulate the ULBP part of this work. The results that demonstrated the highest accuracy were picked. RHR was implemented by downsampling the images by a factor of k (k = 3, 9, 17, 21, 25 and 29), as was done in the original paper. Here the best accuracy, sensitivity and specificity of all the simulations are reported. All the algorithms were implemented using MATLAB 2013a on a computer with an Intel(R) Core(TM) i5-3470 3.2 GHz CPU and 4 GB of RAM. The aforementioned data-set was used for all the experiments for meaningful comparison. Despite its simplicity, the DFT based texture descriptor based classification algorithm emerged as the most successful one in terms of all three standard measures, as we can see in Fig. 5.

Figure 5: Accuracy, Sensitivity and Specificity Comparison of the Proposed Method and Various State-of-the-art Methods.

Table 5 presents the comparison of speed of the proposed technique against SP, CMLBP, PNN and RHR. Although RHR is slightly faster than the proposed method, as we can see in Table 5, the former has only 73.72% accuracy, 75.35% sensitivity and 72.09% specificity. Because it does not combine high accuracy with low time cost, RHR is unsuitable for practical application. In addition, the low computational cost of the proposed scheme is promising for real-time hemorrhage detection from WCE videos.

Table 5: Computational Cost Comparison (Seconds/frame)

                      SP     CMLBP   PNN    RHR    Proposed
Classification time   .66    1.61    19.6   .38    .5

5.6. Discussion

In this section we provide an analysis of the results of our experiments. Although it was hinted at in the 'Our Method' subsection of Section 1, here we attempt to add a few more details in answer to the following question: what previously unresolved issues was the proposed algorithm able to overcome? A few factors contributed to the high values of accuracy, sensitivity and specificity and to the lower execution time, and made this work an important step toward fully automated GI hemorrhage detection.

Firstly, for small bleeding regions the proposed method works better than whole image or patch based methods. Fig. 6 (a)-(c) show three frames containing small bleeding regions. These tiny bleeding regions span a very small number of selected patches. Therefore, they do not contribute much to the feature values, which increases the risk of misclassification in a patch based approach.

Figure 6: (a)-(c) Endoscopic Video Frames Containing Tiny Bleeding Regions and (d)-(f) The Corresponding Magnitude Spectrums. Bleeding Regions are Indicated With Blue Contours in (a)-(c). Whole Image and Patch Based Methods Often Fail to Classify These Frames.

Again, whole image based methods rely on various statistical features of the entire image. A few bleeding pixels hardly alter the values of whole image based statistical features, resulting in missed detections and hence low sensitivity. In the frequency domain, as seen from Fig. 6 (d)-(f), these bleeding frames are marked by the appearance of non-horizontal and non-vertical lines, which are readily captured by the textural features. As a result, our method was able to solve the small bleeding region problem by carrying out the feature extraction process entirely in the frequency domain and considering features from the whole magnitude spectrum. This also accounts for the high accuracy of the proposed scheme.

Secondly, the proposed method was also able to overcome the limitation of [7] in handling non-bleeding frames that contain fluid, intestinal villi and cavity, which make the hue of the image dark red. This is illustrated in Fig. 7. The non-bleeding WCE frame in Fig. 7(a) often gets misclassified as bleeding due to its dark red region created by intestinal villi. However, as the intensity varies slowly from the dark red to the bright region, diagonal lines are not generated in the ±45° directions of the magnitude spectrum, unlike in Fig. 6, and the image is not misclassified as bleeding.

Figure 7: (a) Non-bleeding Image Containing Villi and Cavity and (b) Its Magnitude Spectrum. Note the absence of diagonal lines in the magnitude spectrum that facilitates proper classification.

Thirdly, the efficacy of our method lies partially in the choice of a discriminatory set of features, as we can see from the p-values in Table 1. The scatter diagrams of Fig. 4 and the results of Table 2 also reveal that the individual features perform quite well. In fact, this raises a question: if the performance of the features considered individually is good, why are we using all six features in conjunction instead of only one? The reason is that using only one feature would make the algorithm less robust. In other words, there would be greater variation of algorithmic performance among various data-sets, rendering the detection scheme less reliable. Furthermore, an algorithm reliant on a single feature is likely to fail to capture the differences between bleeding and non-bleeding magnitude spectrums. Since we have employed six features to capture these differences, it is very likely that one or more features will be able to capture the difference irrespective of the data-set. Besides, a detection algorithm based on only one feature is rather ambitious, and there is no assurance that it will work in real-world settings.

We now explain the benefit and justification of including our proposed feature, namely the difference average (DA). The DA is proposed with a view to capturing textural information from the NGLCM. It signifies the mean value of the intensity differences of the NGLCM. It is also related to Difference Variance (DV): DV measures the dispersion of the elements of the NGLCM with respect to DA. If the values of the elements of the NGLCM exhibit greater variability with respect to one another, the value of DA will be higher. The opposite scenario will cause DA to become lower. Fig. 8 illustrates the box-whisker plot of DA on the test dataset. The non-overlapping notches (marked in red) of the two box-whisker plots manifest the difference of DA values between the two classes. We have also calculated the mean and SD values of DA for the test dataset, as presented in Table 6. It is seen that the mean and SD values of DA on the test dataset for the two classes are significantly different. Therefore, DA is an efficacious feature that extracts important classification related information from the NGLCM. DA can also be used in other image analysis and classification problems where the NGLCM is employed.

Figure 8: Box-whisker plot of DA (R-channel) suggests good variation of the feature in the two classes.

Table 6: Mean and Standard Deviation of DA on the test dataset

Channel   Bleeding Mean   Bleeding SD   Non-Bleeding Mean   Non-Bleeding SD
R         .0013           .00036526     .00074454           0.0014
G         .0013           .00036531     .000748             .0014
B         .0013           .00036556     .0007518            0.0014

Figure 9: Two Misclassified Frames- (a) False Detection (b) Missing Detection.

We now discuss some misclassification cases of the proposed algorithm.

Fig. 9 shows two cases of false and missing detections of our algorithm. It is found that images containing bubbles and specular reflection exhibit patterns similar to bleeding in the frequency domain. As a result, these frames can also get classified as bleeding. Again, the proposed scheme fails to detect bleeding frames if the bleeding region is completely engulfed by intestinal villi or cavity. However, compared to the thousands of images that have to be screened in one examination, these frames are too few in number to drastically affect classification accuracy.
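The relation between the difference average (DA) and the difference variance (DV) discussed in this section can be sketched in Python as follows. This is one common Haralick-style formulation, stated here as an assumption: the gray-level difference distribution is accumulated from the NGLCM, DA is its mean and DV is its variance around DA. The paper's own definition is given in an earlier section and may differ in detail. The function names and the toy matrix are hypothetical.

```python
def difference_histogram(nglcm):
    """p_diff[k] = total probability of pairs whose gray levels differ by k."""
    levels = len(nglcm)
    p_diff = [0.0] * levels
    for i in range(levels):
        for j in range(levels):
            p_diff[abs(i - j)] += nglcm[i][j]
    return p_diff


def difference_average(nglcm):
    """Mean of the gray-level difference distribution (DA in this sketch)."""
    return sum(k * p for k, p in enumerate(difference_histogram(nglcm)))


def difference_variance(nglcm):
    """Spread of the gray-level differences around DA (DV in this sketch)."""
    da = difference_average(nglcm)
    return sum((k - da) ** 2 * p
               for k, p in enumerate(difference_histogram(nglcm)))


# Hypothetical 3-level NGLCM whose entries sum to 1.
nglcm = [[0.25, 0.25, 0.00],
         [0.00, 0.00, 0.25],
         [0.00, 0.00, 0.25]]
da = difference_average(nglcm)
dv = difference_variance(nglcm)
print(da, dv)
```

Under this formulation, an NGLCM concentrated on its main diagonal gives a DA close to zero, while mass far from the diagonal raises it.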

6. Future Directions

In this section, we identify some directions of future research and possible areas to which this work can be extended. Firstly, the algorithm's computational cost can be reduced further by implementing it on a CUDA-based Graphics Processing Unit (GPU). Secondly, the proposed approach can be applied to other image classification problems such as medical image classification, image retrieval and remote sensing. Thirdly, the proposed method can be extended to other CAD problems of WCE videos, such as Crohn's disease detection, ulcer detection and tumor detection. Fourthly, future studies can explore combining multiple classifiers to enhance the algorithm's performance; classifier boosting can also be adopted for better classification performance. Fifthly, how the algorithm behaves for other choices of the position operator P may also be an interesting topic of further research.

7. Conclusion

In this work, the problem of computer-aided GI bleeding detection was addressed by selecting feature descriptors from the NGLCM of the images in the frequency domain. The accuracy of the scheme is promising. The proposed algorithm was evaluated against previously published works, and the results of the performance comparison were also significant. The superiority of the algorithm was also confirmed by statistical hypothesis testing and graphical analysis. We can expect that the proposed method will be well suited to practical implementation, since it does not require any human intervention such as selecting informative patches, thereby making WCE technology less problematic and more convenient for both patients and clinicians. We thus conclude, as the experimental results suggest, that the devised scheme is simple yet effective and efficient.

Acknowledgment

The authors would like to thank Given Imaging Ltd. for generously providing the WCE data (www.capsuleendoscopy.org).

References

[1] G. Iddan, G. Meron, A. Glukhovsky, P. Swain, Wireless capsule endoscopy, Nature 405 (2000) 417–417.

[2] http://www.fda.gov/cdrh/mda/docs/k010312.html.

[3] http://www.givenimaging.com.

[4] S. Liangpunsakul, L. Mays, D. K. Rex, Performance of given suspected blood indicator, Am J Gastroenterol 98 (12) (2003) 2676–2678.

[5] G. Pan, G. Yan, X. Qiu, J. Cui, Bleeding detection in wireless capsule endoscopy based on probabilistic neural network, Journal of Medical Systems 35 (6) (2011) 1477–1484.

[6] J. Liu, X. Yuan, Obscure bleeding detection in endoscopy images using support vector machines, Optimization and Engineering 10 (2) (2009) 289–299.

[7] Y. Fu, W. Zhang, M. Mandal, M.-H. Meng, Computer-aided bleeding detection in WCE video, Biomedical and Health Informatics, IEEE Journal of 18 (2) (2014) 636–642.

[8] B. Penna, T. Tillo, M. Grangetto, E. Magli, G. Olmo, A technique for blood detection in wireless capsule endoscopy images, in: Proc of the 17th European Signal Processing Conference (EUSIPCO09). Glasgow, Scotland, 2009, pp. 1864–1868.

[9] B. Li, M.-H. Meng, Computer-aided detection of bleeding regions for capsule endoscopy images, Biomedical Engineering, IEEE Transactions on 56 (4) (2009) 1032–1039.

[10] M. Coimbra, J. Cunha, MPEG-7 visual descriptor's contributions for automated feature extraction in capsule endoscopy, Circuits and Systems for Video Technology, IEEE Transactions on 16 (5) (2006) 628–637.

[11] R. Kumar, Q. Zhao, S. Seshamani, G. Mullin, G. Hager, T. Dassopoulos, Assessment of Crohn's disease lesions in wireless capsule endoscopy images, Biomedical Engineering, IEEE Transactions on 59 (2) (2012) 355–362.

[12] B. Li, M.-H. Meng, Tumor recognition in wireless capsule endoscopy images using textural features and SVM-based feature selection, Information Technology in Biomedicine, IEEE Transactions on 16 (3) (2012) 323–329.

[13] B. Giritharan, X. Yuan, J. Liu, B. Buckles, J. Oh, S. J. Tang, Bleeding detection from capsule endoscopy videos, in: Engineering in Medicine and Biology Society, 2008. EMBS 2008. 30th Annual International Conference of the IEEE, 2008, pp. 4780–4783.

[14] B. Li, M. Q.-H. Meng, J. Y. Lau, Computer-aided small bowel tumor detection for capsule endoscopy, Artificial intelligence in medicine 52 (1) (2011) 11–16.

[15] M. Boulougoura, E. Wadge, V. Kodogiannis, H. S. Chowdrey, Intelligent systems for computer-assisted clinical endoscopic image analysis, Acta Press.

[16] S. Hwang, J. Oh, J. Cox, S. J. Tang, H. F. Tibbals, Blood detection in wireless capsule endoscopy using expectation maximization clustering, in: Proc. SPIE, Vol. 6144, 2006, pp. 1–11.

[17] P. Y. Lau, P. Correia, Detection of bleeding patterns in WCE video using multiple features, in: Engineering in Medicine and Biology Society, 2007. EMBS 2007. 29th Annual International Conference of the IEEE, 2007, pp. 5601–5604.

[18] Scale invariant texture descriptors for classifying celiac disease, Medical Image Analysis 17 (4) (2013) 458–474.

[19] J. Cunha, M. Coimbra, P. Campos, J. M. Soares, Automated topographic segmentation and transit time estimation in endoscopic capsule exams, Medical Imaging, IEEE Transactions on 27 (1) (2008) 19–27.

[20] M. Liedlgruber, A. Uhl, Computer-aided decision support systems for endoscopy in the gastrointestinal tract: A review, Biomedical Engineering, IEEE Reviews in 4 (2011) 73–88.

[21] C.-M. Pun, M.-C. Lee, Log-polar wavelet energy signatures for rotation and scale invariant texture classification, Pattern Analysis and Machine Intelligence, IEEE Transactions on 25 (5) (2003) 590–603.

[22] G. Liu, Z. Lin, Y. Yu, Radon representation-based feature descriptor for texture classification, Image Processing, IEEE Transactions on 18 (5) (2009) 921–928.

[23] X. Zhang, N. Younan, C. O'Hara, Wavelet domain statistical hyperspectral soil texture classification, Geoscience and Remote Sensing, IEEE Transactions on 43 (3) (2005) 615–618.

[24] S. Karkanis, D. Iakovidis, D. Maroulis, D. Karras, M. Tzivras, Computer-aided tumor detection in endoscopic video using color wavelet features, Information Technology in Biomedicine, IEEE Transactions on 7 (3) (2003) 141–152.

[25] R. C. Gonzalez, R. E. Woods, Digital Image Processing, 3rd Edition, Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 2006.

[26] T. Pappas, D. Neuhoff, H. de Ridder, J. Zujovic, Image analysis: Focus on texture similarity, Proceedings of the IEEE 101 (9) (2013) 2044–2057.

[27] J. Portilla, E. P. Simoncelli, A parametric texture model based on joint statistics of complex wavelet coefficients, International Journal of Computer Vision 40 (1) (2000) 49–70.

[28] P. Chiranjeevi, S. Sengupta, Moving object detection in the presence of dynamic backgrounds using intensity and textural features, Journal of Electronic Imaging 20 (4) (2011) 043009–043009–11.

[29] R. Haralick, K. Shanmugam, I. Dinstein, Textural features for image classification, Systems, Man and Cybernetics, IEEE Transactions on SMC-3 (6) (1973) 610–621.

[30] C.-L. Liu, K. Nakashima, H. Sako, H. Fujisawa, Handwritten digit recognition: benchmarking of state-of-the-art techniques, Pattern Recognition 36 (10) (2003) 2271–2285.

[31] L. Wang, Support Vector Machines: Theory and Applications. New York, New York: Springer-Verlag, 2005.

[32] C.-C. Chang, C.-J. Lin, LIBSVM: A library for support vector machines, ACM Transactions on Intelligent Systems and Technology 2 (2011) 27:1–27:27.

[33] http://www.cse.oulu.fi/cmv/downloads/lbpmatlab.


Conflicts of Interest

We have no conflict of interest with anybody.