Accepted Manuscript

Title: Computer-Aided Gastrointestinal Hemorrhage Detection in Wireless Capsule Endoscopy Videos
Authors: Ahnaf Rashik Hassan, Mohammad Ariful Haque
PII: S0169-2607(15)00232-1
DOI: http://dx.doi.org/10.1016/j.cmpb.2015.09.005
Reference: COMM 3974
To appear in: Computer Methods and Programs in Biomedicine
Received date: 5-5-2015
Revised date: 27-8-2015
Accepted date: 1-9-2015

Please cite this article as: Ahnaf Rashik Hassan, Mohammad Ariful Haque, Computer-Aided Gastrointestinal Hemorrhage Detection in Wireless Capsule Endoscopy Videos, Computer Methods and Programs in Biomedicine (2015), http://dx.doi.org/10.1016/j.cmpb.2015.09.005

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Computer-Aided Gastrointestinal Hemorrhage Detection in Wireless Capsule Endoscopy Videos

Ahnaf Rashik Hassan, Mohammad Ariful Haque∗

Department of Electrical and Electronic Engineering, Bangladesh University of Engineering and Technology, Dhaka-1205, Bangladesh.

∗Corresponding Author: Mohammad Ariful Haque, Phone: +8801864977369, Email: [email protected]

Preprint submitted to Computer Methods and Programs in Biomedicine, July 22, 2015
Abstract
Background and objective: Wireless Capsule Endoscopy (WCE) can image the portions of the human gastrointestinal tract that were previously unreachable for conventional endoscopy examinations. A major drawback of this technology is that a large volume of data must be analyzed in order to detect a disease, which can be time-consuming and burdensome for clinicians. Consequently, there is a dire need for computer-aided disease detection schemes to assist the clinicians. In this paper, we propose a real-time, computationally efficient and effective computerized bleeding detection technique applicable to WCE technology.

Methods: The development of our proposed technique is based on the observation that characteristic patterns appear in the frequency spectrum of WCE frames due to the presence of a bleeding region. Exploiting these discriminating patterns, we develop a texture-feature-descriptor-based algorithm that operates on the Normalized Gray Level Co-occurrence Matrix (NGLCM) of the magnitude spectrum of the images. A new local texture descriptor called difference average, which operates on the NGLCM, is also proposed. We also perform statistical validation of the proposed scheme.

Results: The proposed algorithm was evaluated using a publicly available WCE database. The training set consisted of 600 bleeding and 600 non-bleeding frames. This set was used to train the SVM classifier. On the other hand, 860 bleeding and 860 non-bleeding images were selected from the rest of the extracted images to form the test set. The accuracy, sensitivity and specificity obtained from our method are 99.19%, 99.41% and 98.95%, respectively, which are significantly higher than those of state-of-the-art methods. In addition, the low computational cost of our method makes it suitable for real-time implementation.

Conclusion: This work proposes a bleeding detection algorithm that employs textural features from the magnitude spectrum of WCE images. Experimental outcomes backed by statistical validation show that the proposed algorithm is superior to the existing ones in terms of accuracy, sensitivity, specificity and computational cost.

Keywords: Wireless Capsule Endoscopy (WCE), Bleeding Detection, Support Vector Machine, Normalized Gray Level Co-occurrence Matrix.
1. Introduction
The classical endoscopic procedure has enabled clinicians to investigate the human gastrointestinal (GI) tract. Despite being efficacious for the upper (duodenum, stomach and food pipe) and lower (colon and terminal ileum) parts of the GI tract, traditional endoscopy fails to examine the small intestine. The human small intestine is about 8 meters long, and conventional endoscopy such as colonoscopy or esophagogastroduodenoscopy cannot image it satisfactorily. To overcome the limitations of traditional endoscopy, G. Iddan et al. [1] pioneered the invention of wireless capsule endoscopy (WCE). The WCE system consists of a pill-shaped capsule with a built-in video camera, a video signal transmitter, light-emitting diodes and a battery. The capsule is swallowed by the patient and is propelled forward by peristalsis of the GI tract. It records images as it moves along the GI tract and simultaneously transmits them over radio frequency. It transmits over the course of about 8 hours until its battery runs out. Owing to its promising performance for visualization of the human GI tract, the U.S. Food and Drug Administration (FDA) approved it in 2001 [2].

1.1. Problem Description
Manual classification of bleeding and non-bleeding endoscopic video frames has a number of limitations. The limited power supply of the capsule results in low-resolution (576 × 576) endoscopic video frames. The video frame rate is also low (2 frames/second). Besides, about 60,000 images have to be inspected per examination. This takes an experienced clinician about 2 hours, which may not be pragmatic in most clinical scenarios. Since the evaluation process is time-consuming and a large volume of images has to be inspected, bleeding detection becomes subject to human error. Hence, computer-aided detection (CAD) of bleeding frames can make this monumental task easier for clinicians.

Given Imaging Ltd. [3] designed a software tool called Suspected Blood Indicator (SBI) for automatic detection of bleeding frames. However, SBI demonstrates poor sensitivity and specificity and often fails to detect any kind of bleeding other than that of the small intestine [4]. The software designed by Given Imaging Ltd. allows the physician to view two consecutive frames at the same time, but due to the low frame rate, two consecutive frames may not contain the area of interest. Consequently, the clinician has to toggle between images, making the evaluation process even more onerous and time-consuming. All the aforementioned problems of manual screening can be eliminated by the use of CAD.
1.2. Related Work
The previous works on GI hemorrhage detection can roughly be classified as color-based, texture-based, and combined color- and texture-based methods. Color-based methods [5] [6] [7] [8] basically exploit the ratios of the intensity values of the images in the RGB or HSI domain. Texture-based approaches attempt to utilize the textural content of bleeding and non-bleeding images to perform classification [9] [10] [11]. It has been reported that the combination of color and texture descriptors exhibits good performance in terms of accuracy [12]. Again, depending on the region of operation, the CAD bleeding and tumor detection literature can be categorized into three groups: whole-image-based [12] [13] [14] [15], pixel-based [5] [6] [16] and patch-based methods [9] [17], as in [7]. Whole-image-based methods are fast but often fail to detect small bleeding regions. Pixel-based methods have to operate on each pixel of the image to generate the feature vectors; as a result, they are computationally very expensive. It can be expected that patch-based methods will achieve good accuracy while keeping the computational cost low. However, patch-based methods show high sensitivity but low specificity and accuracy. Besides, informative patches need to be identified manually by a clinician, which hinders the idea of making the whole process automatic.

B. Li and Q. Meng [9] put forward a chrominance moment and Uniform Local Binary Pattern (ULBP) based solution to bleeding detection. Yanan Fu et al. [7] came up with a super-pixel and red-ratio based solution that was promising in terms of accuracy. However, it was reported that this method has a high computational cost and fails to detect images with poor illumination and minor angiodysplasia regions whose hue is similar to normal tissue. Hwang et al. [16] utilized the Expectation Maximization clustering algorithm for CAD of bleeding frames. Some prior works [10] [11] employed MPEG-7 based visual descriptors to identify medical events such as blood, ulcers and Crohn's disease lesions. Pan et al. [5] formed a 6-D feature vector using R, G, B, H, S, I values and used a probabilistic neural network (PNN) as the classifier. Liu et al. [6] proposed Raw, Ratio and Histogram feature vectors, which are basically the intensity values of the image pixels, and used a support vector machine (SVM) to detect GI bleeding. Hegenbart et al. [18] utilized scale-invariant wavelet-based texture features to detect celiac disease in endoscopic videos. Using MPEG-7 based visual descriptors, Bayesian classifiers and SVM, Cunha et al. [19] segmented the GI tract into four major topographic areas and performed image classification. For a more comprehensive review of computer-aided decision support systems for WCE videos, [20] can be consulted.

1.3. Our Method
In this work, we aim to draw inferences (bleeding or non-bleeding) about an image in the spatial domain by extracting features in the frequency domain. Fig. 1 depicts the steps of the proposed scheme. At first, we compute the Discrete Fourier Transform (DFT) of the endoscopic video frames. Afterwards, we take the log transform of the magnitude spectrum of the frames. The Normalized Gray Level Co-occurrence Matrix (NGLCM) is then constructed to extract features from each log-transformed magnitude spectrum. The selected features are computed from the NGLCM and then fed into a support vector machine classifier to perform classification of the frames.

Figure 1: Block diagram of the proposed framework.

There are significant distinctions between the proposed approach and previous studies on bleeding detection in the literature, and it has some advantages. Both are described as follows.

• To the best of the authors' knowledge, none of the existing works in the literature on GI bleeding detection attempts to solve the problem in the frequency domain. However, spectral texture descriptors have been used for other image classification problems such as texture classification [21] [22], remote sensing image classification [23] and tumor recognition in colonoscopy images [24].

• Most of the state-of-the-art works use either all [5] [7] [8] or any two [6] of the R, G and B channels. An advantage of this method lies in the fact that it uses only one channel; that is, any one of the three channels can be used.

• In this work we propose 'difference average', a new feature that can be computed on the NGLCM. The experimental results for this feature are promising.

• Our algorithm shows promising performance in correctly classifying tiny bleeding regions.

• The proposed scheme requires fewer features than many of the existing methods such as [15] [17].

• Lastly, we conduct our experiments on a large data-set. This ensures the reliability and effectiveness of our algorithm in practical scenarios, because an algorithm that only works well on a small data-set cannot be guaranteed to work in real-world implementations.
One major concern about this approach may be the computational cost associated with computing the DFT of the images. However, with the development and implementation of fast Fourier transform (FFT) algorithms, one can expect that this approach will not be computationally costly. Besides, the method is non-iterative. These assumptions are later supported by experimental results showing that the proposed method is indeed computationally inexpensive. The proposed algorithm outperforms the state-of-the-art methods implemented on the same data-set in accuracy, sensitivity and specificity.

Computer-aided GI bleeding detection is a machine learning problem and has three basic steps: feature generation or extraction, feature selection and classification. Hence, the rest of the article is organized as follows: Section 2 expounds the feature generation part of our algorithm, intuitively describes the reason behind the choice of the proposed feature extraction scheme, provides the mathematical formulation of the selected features and introduces the new feature we propose in this work. In Section 3, we statistically validate that the differences of the chosen features between the two classes are significant. Section 4 describes the classifier chosen in the proposed method. We provide the details of the experiments conducted to demonstrate the efficacy and superiority of our algorithm, present the experimental results and explicate their significance in Section 5. Finally, Section 6 presents how this work can be further extended and Section 7 concludes the article.
2. Feature Extraction in the Fourier Domain

We initiate this section with a brief review of spectral estimation using the DFT. The reason for choosing the DFT-based texture descriptor is then expounded. We then discuss the construction of the NGLCM and provide the mathematical expressions of the features of our algorithm.
2.1. Spectral Estimation using the DFT

At first we compute the Fourier spectrum of each of the endoscopic video frames. The 2-D Discrete Fourier Transform of a WCE image $f(x, y)$ of size $M \times N$ can be expressed as:

$$F(p, q) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y)\, e^{-j 2\pi \left( \frac{px}{M} + \frac{qy}{N} \right)} \qquad (1)$$

where $p = 0, 1, 2, \ldots, M-1$ and $q = 0, 1, 2, \ldots, N-1$. The frequency spectrum of an image can be obtained from the absolute value of its DFT.

The frequency spectrum of an image is a measure of its frequency distribution, which can generate patterns depending on the content of the image in the spatial domain. In general, high-frequency components indicate sharp transitions of intensities, and low-frequency components indicate intensities with a slow rate of change. In bleeding images, there are sharp transitions of intensities from bleeding regions to their neighboring non-bleeding regions. These transitions are absent in non-bleeding frames. This is precisely why the magnitude spectrum can be useful for this particular application. Furthermore, a log transformation is applied with a view to reducing the computational cost. An important trait of the log transformation is that it compresses the dynamic range of images with large variations in pixel intensities [25]. The Fourier spectrum of an image contains a gargantuan dynamic range of values: magnitude spectrum values may range from 0 to as large as $10^7$ or even higher. The log transformation ensures that the maximum intensity values are not too high, so that we can have a GLCM of manageable size. For instance, without applying the log transformation, if the highest value in the Fourier spectrum were $10^7$, then the size of each GLCM would be $10^7 \times 10^7$, making the algorithm computationally very expensive.
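As a sketch, the DFT and log-magnitude steps might be implemented as follows (a minimal NumPy illustration; centering with fftshift and using log(1 + |F|) are our assumptions, since the text does not fix these details):

```python
import numpy as np

def log_magnitude_spectrum(channel):
    """Log-transformed magnitude spectrum of a single image channel."""
    F = np.fft.fft2(channel)        # 2-D DFT, as in Eq. (1)
    F = np.fft.fftshift(F)          # move the zero-frequency term to the center
    magnitude = np.abs(F)           # magnitude (frequency) spectrum
    return np.log1p(magnitude)      # log(1 + |F|) compresses the dynamic range

# Example on a synthetic frame at the WCE resolution mentioned above:
frame = np.random.rand(576, 576)
spectrum = log_magnitude_spectrum(frame)
```

Only one color channel needs to be processed this way, in line with the single-channel design choice discussed in Section 1.3.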
Figure 2: Left two columns: typical non-bleeding WCE frames (top) and the corresponding magnitude spectra (bottom). Right two columns: typical bleeding WCE frames (top) and the corresponding magnitude spectra (bottom). Note the non-horizontal and non-vertical lines appearing in the bleeding spectra. The proposed method aims to quantify these characteristics to generate discriminating features for the supervised learning algorithm.
2.2. Why a DFT-Based Texture Descriptor

We know that in signal classification, if the transform is suitably chosen, transform-domain features may exhibit more useful classification-related information than the original signal. Keeping that in mind, this work explores forging a connection between the spatial and the frequency domain of an image to generate meaningful feature descriptors and classify bleeding and non-bleeding frames. It is observed from numerous visual inspections that the frequency spectra of bleeding images tend to show straight lines near or along the diagonal directions. However, these lines are absent in the non-bleeding frequency spectra. This observation is illustrated in Fig. 2. The right two columns of Fig. 2 show typical bleeding WCE frames, where spectral lines are generated in the ±45° directions of the magnitude spectrum. No such patterns appear in non-bleeding frames, as we can see from the left two columns of Fig. 2. This phenomenon can be exploited to generate features for the machine learning algorithm.

To capture the above-mentioned visual observations mathematically, the most appropriate strategy is to use texture feature descriptors. Texture is essential for both human visual perception and image analysis [26]. However, texture does not have any widely agreed definition. Portilla and Simoncelli [27] provide a general definition with which many researchers agree:

"Loosely speaking, texture images are spatially homogeneous and consist of repeated elements, often subject to some randomization in their location, size, color, orientation etc."

Since texture features are widely used to detect various texture patterns in images, they are the most appropriate descriptors in the proposed frequency-domain-based framework. Texture measures are used to quantify the observed visual differences of bleeding and non-bleeding spectra, and these measures are utilized to train the classifier.
2.3. Local vs. Global Texture Descriptors

After computing the magnitude spectra of the endoscopic frames, we extract textural features from them. In our work, we employ both global and local feature descriptors. Global texture features are simple and widely used in the image classification literature to measure the overall textural content of an image. Global features are computed considering the image as a whole, whereas local feature descriptors operate on a small region or a few pixels. Since in the GI hemorrhage detection problem the bleeding regions may be small and localized, local feature descriptors are very promising for achieving classification results with high accuracy. Global features are measures of the distribution of intensities, but they carry no information regarding the relative position of pixels with respect to each other [25]. Considering these pros and cons, both global and local feature descriptors are utilized to devise the algorithm. The entropy of the magnitude spectrum is used as the global feature descriptor in this study. Contrast, Sum Entropy, Sum Variance, Difference Variance and Difference Average, which operate on the NGLCM, are used to capture local textural information. In other words, our feature set consists of one global feature and five local features.
2.4. NGLCM and Haralick Features

In order to capture local textural information from the spectra of the images, we employ features from the NGLCM of the magnitude spectrum of the WCE frames. The Gray Level Co-occurrence Matrix (GLCM) of an input image is an $L \times L$ matrix, where $L$ is the number of gray levels of the image. Fig. 3 illustrates the construction of the GLCM from an image. If two consecutive pixels of the input image have pixel values $i$ and $j$, then the $(i, j)$th element of the GLCM is incremented. This operation is done for every pair of pixels in the image; in this way, the GLCM is formed. The position operator $P$ governs how these pixels are related to each other. The effect of $P$ on the detection performance will be discussed later in this paper.

Figure 3: Construction of the Gray Level Co-occurrence Matrix. (a) Different choices of the position operator P. (b) An image. (c) GLCM of the image. For 0°, the values of the blue-marked pixel and its immediate right neighbor are noted, and the first-row, third-column element of the GLCM is incremented. In other words, the value 3 in the first row and third column of the GLCM indicates that pixel values 1 and 3 occur consecutively three times in the whole image. Image courtesy of [28].

In this work, the performance of the Normalized Gray Level Co-occurrence Matrix on the frequency spectrum of WCE images is inspected. The NGLCM can be constructed from the GLCM using the following relation:

$$N(i, j) = \frac{G(i, j)}{R} \qquad (2)$$

where $R$ is the total number of pixel pairs in the GLCM. In essence, the NGLCM maps the image to a matrix that indicates the probability of occurrence of two consecutive pixel values. This implies that the NGLCM carries local textural information of the image, to be extracted by the features. Besides, NGLCM-based texture features have been widely used in various applications for texture analysis and image classification. These two factors motivated the use of the NGLCM in the proposed method.

Various statistical measures such as the mean, moments and entropy are used as global texture descriptors [25] to measure the overall textural content of an image. These features operate on the entire image. After conducting repeated experiments, it has been found that the entropy of the frequency spectrum demonstrates good performance as a global texture descriptor. The entropy of the frequency spectrum ($En$) is defined as:

$$En = -\sum_{i=0}^{L-1} H(z_i) \log_2[H(z_i)] \qquad (3)$$

where $H(z_i)$ is the normalized histogram and $L$ is the number of gray levels of the frequency spectrum. It measures the randomness of the pixel values of the magnitude spectrum.
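The NGLCM construction of Eq. (2) and the entropy of Eq. (3) can be sketched as follows, assuming a horizontal (0°) position operator and a spectrum already quantized to L integer gray levels (the quantization scheme itself is not specified in the text):

```python
import numpy as np

def nglcm(img, L):
    """NGLCM of a 2-D array of integer gray levels in [0, L), 0-degree offset."""
    G = np.zeros((L, L))
    left, right = img[:, :-1], img[:, 1:]      # horizontally consecutive pixel pairs
    np.add.at(G, (left, right), 1)             # G(i, j): co-occurrence counts
    return G / G.sum()                         # Eq. (2): divide by total pixel pairs R

def entropy(img, L):
    """Global entropy of the gray levels, Eq. (3)."""
    hist = np.bincount(img.ravel(), minlength=L) / img.size
    nz = hist[hist > 0]                        # skip empty bins (0 * log 0 -> 0)
    return -np.sum(nz * np.log2(nz))

quantized = np.random.randint(0, 8, size=(64, 64))   # toy quantized spectrum, L = 8
N = nglcm(quantized, 8)
En = entropy(quantized, 8)
```

By construction the entries of `N` sum to one, so they can be read directly as co-occurrence probabilities.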
Haralick et al. [29] first proposed 14 features to perform GLCM-based texture analysis. Although not all of these features are widely used, they were examined, and it has been found that four of them yield good algorithmic performance. These features are:

1. Contrast (Con):

$$Con = \sum_{i} \sum_{j} |i - j|^2 \, N(i, j) \qquad (4)$$

Con measures the contrast in gray levels among neighboring pixels. Its value ranges from zero (for a constant GLCM) to $(L-1)^2$.

2. Sum Entropy (SE):

$$SE = -\sum_{i=2}^{2L} P_{x+y}(i) \log[P_{x+y}(i)] \qquad (5)$$
3. Sum Variance (SV):

$$SV = \sum_{i=2}^{2L} (i - SE)^2 \, P_{x+y}(i) \qquad (6)$$

SV is a measure of the variability of the elements of the NGLCM with respect to SE.

4. Difference Variance (DV):

$$DV = \mathrm{Variance}[P_{x-y}] \qquad (7)$$

where $P_{x+y}$ and $P_{x-y}$ are defined as follows:

$$P_{x+y}(k) = \sum_{i=1}^{L} \sum_{j=1}^{L} N(i, j), \quad i + j = k, \quad k = 2, 3, 4, \ldots, 2L \qquad (8)$$

$$P_{x-y}(k) = \sum_{i=1}^{L} \sum_{j=1}^{L} N(i, j), \quad |i - j| = k, \quad k = 0, 1, 2, \ldots, L-1 \qquad (9)$$
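The four selected Haralick features of Eqs. (4)–(9) can be sketched as follows on an NGLCM, assuming 0-based gray-level indexing (the equations use 1-based indices, so the sum marginal here runs over k = 0 … 2L−2):

```python
import numpy as np

def haralick_features(N):
    """Contrast, Sum Entropy, Sum Variance and Difference Variance of an NGLCM."""
    L = N.shape[0]
    i, j = np.indices((L, L))
    con = np.sum((i - j) ** 2 * N)                          # Eq. (4)

    # Marginal distributions over i + j and |i - j|, Eqs. (8)-(9)
    p_sum = np.bincount((i + j).ravel(), weights=N.ravel(), minlength=2 * L - 1)
    p_diff = np.bincount(np.abs(i - j).ravel(), weights=N.ravel(), minlength=L)

    nz = p_sum[p_sum > 0]
    se = -np.sum(nz * np.log(nz))                           # Eq. (5)
    k = np.arange(2 * L - 1)
    sv = np.sum((k - se) ** 2 * p_sum)                      # Eq. (6)
    d = np.arange(L)
    dv = np.sum((d - np.sum(d * p_diff)) ** 2 * p_diff)     # Eq. (7)
    return con, se, sv, dv

# Toy NGLCM: uniform probabilities over an 8x8 matrix
N = np.full((8, 8), 1 / 64)
con, se, sv, dv = haralick_features(N)
```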
ed
2.5. Proposed New Feature Descriptor Here, a new local textural feature called Difference Average (DA) which
pt
operates on the NGLCM is proposed. DA is expressed as
Ac ce
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65
DA =
L−1 ∑
iPx−y (i).
(10)
i=0
It is evident that DA is the mean of Px−y (k). It expresses the mean value of the pixel differences throughout the entire NGLCM considering the pixel difference a random variable. It was envisioned that this feature would give an idea about the expected pixel difference value of the NGLCM. This information can be valuable in the context of texture classification and its applications such as computer-aided diagnosis. Experimental results show that 15
Page 16 of 70
the proposed feature exhibits discriminating values from bleeding frames to non-bleeding frames which perspicuously evinces that DA can be used in texture classification and other similar applications where the rest of the GLCM
ip t
based texture features are used.
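The proposed descriptor is the first moment of the |i − j| marginal of the NGLCM (Eq. (10)); a small sketch (the helper and variable names are our own):

```python
import numpy as np

def difference_average(N):
    """Difference Average (DA), Eq. (10): mean of the |i - j| marginal of an NGLCM."""
    L = N.shape[0]
    i, j = np.indices((L, L))
    p_diff = np.bincount(np.abs(i - j).ravel(), weights=N.ravel(), minlength=L)
    return np.sum(np.arange(L) * p_diff)

# For a diagonal NGLCM every co-occurring pair is identical, so DA is 0:
N_diag = np.eye(4) / 4
da = difference_average(N_diag)   # 0.0
```

Intuitively, the more mass the NGLCM puts far from its diagonal (large gray-level jumps between consecutive pixels), the larger DA becomes.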
3. Feature Selection

Here we explicate how we choose effective features to perform the classification in the proposed framework.

Statistical Validation on the Training Dataset

At this point we are faced with two questions. Firstly, how do we choose, from the fourteen Haralick features, a set of features that has the best discriminatory capability? Secondly, how do we make sure that the discriminatory capability of the selected set of features is statistically significant? Statistical hypothesis testing is the solution to both of the above problems.

Although most of the previous papers demonstrate good levels of accuracy, they omit the feature selection stage. In other words, they do not provide any statistical background for their feature generation stage. A method that involves a set of selected features without testing for statistical significance can have serious repercussions. Firstly, it does not say whether the method is actually robust and invariant to the data-set. Secondly, it remains unknown whether the discriminatory capability of the features is statistically significant. Therefore, statistical hypothesis testing in the context of any signal classification problem is of paramount importance to find out whether the extracted features are informative enough.
Table 1: Mean, Standard Deviation and p-values of the Extracted Features (α = .05)

Channel | Feature | Bleeding Mean | Bleeding SD | Non-Bleeding Mean | Non-Bleeding SD | p-value
--------|---------|---------------|-------------|-------------------|-----------------|--------
R       | En      | .000236       | .000299     | .000335           | .000412         | .0452
R       | Con     | 10825.3       | 1476.04     | 11326.12          | 1534.78         | .008
R       | SE      | .000086       | .000004     | .000058           | .000005         | .008
R       | SV      | 92.01         | 2.692       | 92.54             | 2.23            | .039
R       | DV      | .28           | .17         | .25               | .11             | .045
R       | DA      | .0012         | .00034      | .00115            | .00024          | .037
G       | En      | .000264       | .000344     | .000345           | .000417         | .012
G       | Con     | 11511.34      | 1657.94     | 12070.42          | 1556.02         | .004
G       | SE      | .000088       | .000007     | .000059           | .000005         | .007
G       | SV      | 92.003        | 2.78        | 92.61             | 2.47            | .039
G       | DV      | .3083         | .193        | .264              | .163            | .041
G       | DA      | .001241       | .000363     | .0011             | .000301         | .021
B       | En      | .000241       | .000324     | .000369           | .000429         | .008
B       | Con     | 12962.26      | 2035.25     | 13748.1           | 1780.12         | .009
B       | SE      | .000088       | .000008     | .000062           | .000009         | .008
B       | SV      | 92.38         | 3.41        | 93.05             | 3.14            | .019
B       | DV      | .32677        | .21775      | .27886            | .2028           | .0424
B       | DA      | .00127        | .00041      | .00116            | .000379         | .0384
To assess whether the values of the features in the two classes differ significantly, we perform a one-way analysis of variance (ANOVA). The test is carried out in MATLAB's Statistics Toolbox at the 95% confidence level. Hence, a difference is statistically significant if p < α (= 0.05). Any feature having a p-value greater than α was discarded. In this way, four out of the fourteen Haralick features were chosen. Table 1 gives the mean, standard deviation (SD) and p-values of the features for bleeding and non-bleeding frames in the R, G and B channels. The global feature En, the four Haralick features and our proposed feature all pass the test, as we can see from Table 1. The mean and standard deviation give a rough idea about the feature values of the population, but they do not tell us much about the separability of the descriptors between the two classes. The experimental validation of the efficacy of the selected features will be provided in the experimental results section.
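The screening step can be sketched as follows (SciPy's f_oneway performs a one-way ANOVA, equivalent to MATLAB's anova1 for two groups; the feature values below are synthetic stand-ins, not the paper's data):

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)
# Hypothetical per-frame feature values for the two classes (stand-ins for, e.g., DA).
bleeding = rng.normal(0.0012, 0.00034, 600)
non_bleeding = rng.normal(0.0011, 0.00024, 600)

_, p = f_oneway(bleeding, non_bleeding)
keep_feature = p < 0.05   # retain the feature only if the class difference is significant
```

For two groups this test is equivalent to an unpaired t-test; features failing the p < 0.05 criterion would be discarded, mirroring the selection rule described above.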
4. Support Vector Machine Classifier

The Support Vector Machine (SVM) has gained popularity in the last decade due to its widespread application in areas such as handwritten digit recognition [30]. The SVM [31] is a supervised machine learning algorithm that maps the data into a higher-dimensional feature space and finds a separating hyperplane with a maximal margin. The reason for finding a maximum-margin hyperplane is that, for a binary classification problem, the data in both classes then have more room on each side of the hyperplane; thus the chance of misclassification is minimized. For a set of $N$ labeled training instances $Tr = \{(x_i, y_i) \,|\, i = 1, 2, \ldots, N\}$, where $x_i \in \mathbb{R}^n$ and $y_i \in \{-1, 1\}$, an unknown test instance is classified by:

$$f(x) = \mathrm{sign}\left( \sum_{i=1}^{N} \alpha_i y_i K(x_i, x) + w \right) \qquad (11)$$

where $K(x_i, x)$ is the kernel function, $w$ is the bias and the $\alpha_i$ are the Lagrange multipliers.
Some two-class classification problems do not have a simple hyperplane as a useful separating criterion. So, besides the linear SVM, we experimented with polynomial and Radial Basis Function (RBF) kernels. The polynomial kernel function K(x, y) can be expressed as:

K(x, y) = (x · y + 1)^d,  d > 0    (12)

and the radial basis kernel is:

K(x, y) = exp(−γ ∥x − y∥²),  γ > 0    (13)

where ∥·∥ is the Euclidean L2-norm and γ governs the spread of K.
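As an illustrative sketch of the decision rule in Eq. (11) with the RBF kernel of Eq. (13), not the LIBSVM implementation used in the experiments, the classifier can be written in a few lines of NumPy; the support vectors, multipliers and bias below are toy values chosen for demonstration:

```python
import numpy as np

def rbf_kernel(a, b, gamma=10.0):
    # K(x, y) = exp(-gamma * ||x - y||^2), Eq. (13)
    return np.exp(-gamma * np.sum((a - b) ** 2))

def svm_decision(x, sv, alpha, y, w, kernel=rbf_kernel):
    # f(x) = sign(sum_i alpha_i * y_i * K(x_i, x) + w), Eq. (11)
    s = sum(a * yi * kernel(xi, x) for a, yi, xi in zip(alpha, y, sv))
    return 1 if s + w >= 0 else -1

# Toy problem: one support vector per class
sv = np.array([[0.0, 0.0], [1.0, 1.0]])
alpha = np.array([1.0, 1.0])   # Lagrange multipliers (toy values)
y = np.array([-1, 1])          # class labels of the support vectors
w = 0.0                        # bias (toy value)

label = svm_decision(np.array([0.9, 0.9]), sv, alpha, y, w)
```

With these toy parameters, a test point near the positive support vector receives label +1 and a point near the negative one receives −1.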
5. Experimental Results and Discussions

Experiments were carried out to measure the effectiveness of the proposed algorithm empirically. This section provides the details of our experiments. We have evaluated our algorithm against four published algorithms; the results, along with rigorous analyses and discussions, are presented below.
5.1. Experimental Data

More than 3500 WCE images were extracted from 16 bleeding and 16 non-bleeding videos taken from 32 patients. The images had already been labeled by experienced clinicians. Only identical frames were removed, to avoid undesired repetition of images. Non-informative frames contaminated by residual food, turbid fluids, bubbles, specular reflections or fecal materials were not removed, because the experimental set-up must emulate real-world settings: in real-world applications, it is highly unlikely that a clean set of bleeding and non-bleeding images will be available to the clinicians, so a successful and pragmatic algorithm must be capable of dealing with these frames. Except for the obviously repetitive frames, therefore, all the frames extracted from the 32 videos were used to construct the data-set. The training set consisted of 600 bleeding and 600 non-bleeding frames taken from 12 different patients (i.e., 6 bleeding and 6 non-bleeding patients); this set was used to train the SVM classifier. The test set consisted of 860 bleeding and 860 non-bleeding frames taken from the remaining patients (i.e., 10 bleeding and 10 non-bleeding patients); this set was used to evaluate the SVM's classification performance. Therefore, our training and test data do not contain images from the same patient. Since our data-set was large, we used a publicly available SVM software package called LIBSVM [32]. To eliminate the effect of the peripheral dark regions visible in Fig. 2, the original 576 × 576 PillCam SB2 images were resized to 426 × 426.

5.2. Evaluation Criteria
The objective measures used to evaluate the performance of the proposed method are accuracy, sensitivity and specificity. These measures are often used to determine the performance of algorithms in the literature [6][11][12]. They can be expressed by the following formulae:

Accuracy = (TP + TN) / (TP + FP + TN + FN)    (14)

Sensitivity = TP / (TP + FN)    (15)

Specificity = TN / (TN + FP)    (16)

where TP is the number of bleeding frames identified correctly, TN is the number of non-bleeding frames classified correctly, FP is the number of non-bleeding images identified incorrectly as bleeding and FN is the number of bleeding frames misclassified as non-bleeding.
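These formulae translate directly into code; a minimal sketch with invented counts (not the paper's actual confusion matrix):

```python
def evaluate(tp, tn, fp, fn):
    # Eqs. (14)-(16): accuracy, sensitivity and specificity from confusion counts
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # fraction of bleeding frames caught
    specificity = tn / (tn + fp)   # fraction of non-bleeding frames cleared
    return accuracy, sensitivity, specificity

# Illustrative counts for a balanced 860/860 test set
acc, sens, spec = evaluate(tp=855, tn=851, fp=9, fn=5)
```

Note that for a balanced test set, accuracy is exactly the mean of sensitivity and specificity.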
Higher values of sensitivity indicate that the algorithm's capability of detecting bleeding images is high. If the sensitivity is low, the algorithm is likely to miss many of the bleeding frames, the consequences of which may be grave for the patient. On the other hand, high specificity means the algorithm successfully detects non-bleeding frames, reducing the number of false alarms. So, for GI hemorrhage detection, sensitivity is more significant than specificity. In general, we expect a CAD algorithm to demonstrate high values of accuracy, sensitivity and specificity.

Urgent clinical cases may demand quick detection, and a computationally expensive algorithm may fail to meet the demands of the situation. Therefore, besides having high accuracy, a practically feasible algorithm must be fast. So another measure of the performance of any GI bleeding detection algorithm is its time cost. Since training is done off-line and does not require the clinician, the classifier training time can be ignored; only the time the classifier spends on classification should be taken into consideration.
5.3. Efficacy of the Selected Features

Fig. 4 presents the scatter diagrams of the selected features taken two at a time. The figure clearly shows that the proposed features are separable, so there is a prospect of good classification performance from a linear classifier which, as we shall see later, turns out to be true. Besides, Fig. 4 correlates with the p-values presented in Table 1 and gives an experimental validation
Figure 4: Scatter Diagrams of Con vs SE (R-channel), Con vs DV (G-channel), SV vs DV (R-channel), SE vs SV (B-channel), En vs SE (G-channel) and Con vs DA (B-channel) show significant variability of the feature values between the two classes.
of our feature selection stage. We conducted more experiments to find out the effectiveness of the six selected features in terms of the standard measures: accuracy, sensitivity
Table 2: Performance of Individual Features

Channel                   Con     SE      SV      DV      DA      En
R    Accuracy (%)         99.01   98.90   97.44   98.26   97.91   98.84
     Sensitivity (%)      99.19   98.95   97.75   98.37   98.02   98.95
     Specificity (%)      98.84   98.84   97.13   98.14   97.79   98.72
G    Accuracy (%)         99.07   98.78   98.25   97.91   97.03   97.73
     Sensitivity (%)      99.42   98.95   98.49   98.13   97.09   97.79
     Specificity (%)      98.72   98.6    98.02   97.67   96.97   97.67
B    Accuracy (%)         98.49   97.09   98.08   97.33   97.03   97.85
     Sensitivity (%)      98.72   97.21   98.14   97.56   97.44   98.14
     Specificity (%)      98.26   96.98   98.02   97.09   96.63   97.56
and specificity. Table 2 further elucidates that the proposed set of features is a discriminatory one, showing high values of accuracy, sensitivity and specificity even for features taken one at a time. The high values of the three measures confirm that our feature set efficaciously captures the textural information we set out to exploit. The results of Table 2 also reflect the findings of the scatter plots in Fig. 4 and speak for the discriminatory capability of the six selected features.
Table 3: Performance Evaluation for Various Kernel Functions

                   Linear SVM   Polynomial Kernel   RBF Kernel
Accuracy (%)       99.19        98.34               96.05
Sensitivity (%)    99.41        98.49               96.22
Specificity (%)    98.95        98.2                95.87
5.4. Choice of Kernel Function and the Position Operator P

Table 3 presents the performance of the proposed method for different kernel functions. In our experiments, the order d of the polynomial kernel was set to 3 and the γ of the RBF kernel was set to 10. The linear SVM exhibits 99.19% accuracy. This result corroborates our earlier assumption that a linear classifier would work better for this particular choice of feature descriptors. For further simulations, we use the linear SVM classifier.

As stated earlier in Section 2, the choice of P affects the overall performance of the algorithm. What, then, is the best choice of P to achieve the highest possible accuracy? Further experiments were conducted to answer this question. Table 4 indicates how performance varies with P. Here, 0° denotes one pixel to the right, 90° denotes one pixel above, 45° denotes the first pixel from the center pixel in the direction of the line bisecting 0° and 90°, and 135° refers to the pixel that lies at an angle of 45° to the 90° direction. So a pixel paired with the pixel immediately to its right is found to be the best choice of P for the proposed GI hemorrhage detection algorithm.
Table 4: Effect of P on Classification Performance

Angle (Degrees)   Accuracy (%)   Sensitivity (%)   Specificity (%)
0                 99.19          99.41             98.95
45                95             94.71             95.29
90                96.86          97.03             96.63
135               95.32          96.4              94.24
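The effect of the position operator can be made concrete with a small sketch that tallies a normalized co-occurrence matrix for a given offset; the toy image and gray-level count below are purely illustrative, and the actual method builds the NGLCM from the magnitude spectrum rather than from a raw image:

```python
import numpy as np

def nglcm(img, drow, dcol, levels):
    # Normalized co-occurrence matrix for position operator P = (drow, dcol);
    # (0, 1) is one pixel to the right (0 degrees), (-1, 0) one pixel above (90 degrees).
    m = np.zeros((levels, levels))
    rows, cols = img.shape
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + drow, c + dcol
            if 0 <= r2 < rows and 0 <= c2 < cols:
                m[img[r, c], img[r2, c2]] += 1
    return m / m.sum()  # normalize so the entries sum to 1

img = np.array([[0, 0, 1],
                [1, 2, 2],
                [2, 2, 3]])
p0 = nglcm(img, 0, 1, levels=4)    # 0-degree operator
p90 = nglcm(img, -1, 0, levels=4)  # 90-degree operator
```

Changing the offset changes which pixel pairs are counted, which is why the four angles in Table 4 yield different classification performance.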
5.5. Performance Comparison

We denote the superpixel based method [7] by SP, the Chrominance Moment and ULBP based method [9] by CMLBP, the Raw, Histogram and Ratio feature based method [6] by RHR and the Probabilistic Neural Network based method [5] by PNN. The method proposed in this article is denoted by DFTNGLCM. Fig. 5 shows the accuracy, sensitivity and specificity of the proposed technique against SP, CMLBP, PNN and RHR. Here, DFTNGLCM-R means that only the R-channel was used to perform classification, DFTNGLCM-G that only the G-channel was used, and DFTNGLCM-B that only the B-channel was used. While implementing the methods for performance comparison, the parameters were chosen such that each particular implementation produced its best accuracy. For CMLBP, the simulations were done using 5, 10, 15, 20, 25 and 30 neurons. We utilized a publicly available LBP package [33] to simulate the ULBP part of this work. The result that demonstrated the highest accuracy was picked. RHR was implemented by downsampling the images by a factor of k (k = 3, 9, 17, 21, 25 and 29), as was done in the original paper. Here the best accuracy, sensitivity
Figure 5: Accuracy, Sensitivity and Specificity Comparison of the Proposed Method and Various State-of-the-art Methods.
Table 5: Computational Cost Comparison (Seconds/frame)

                      SP     CMLBP   PNN    RHR    Proposed
Classification time   .66    1.61    19.6   .38    .5
and specificity of all the simulations are reported. All the algorithms were implemented in MATLAB 2013a on a computer with an Intel(R) Core(TM) i5-3470 3.2 GHz CPU and 4 GB of RAM. The aforementioned data-set was used for all the experiments for meaningful comparison. Fig. 5 presents the comparison results. Despite its simplicity, the DFT-based texture descriptor classification algorithm emerged as the most successful one in terms of all three standard measures, as we can see in Fig. 5.

Table 5 compares the speed of the proposed technique against SP, CMLBP, PNN and RHR. Although RHR is slightly faster than the proposed method, as Table 5 shows, it achieves only 73.72% accuracy, 75.35% sensitivity and 72.09% specificity. Lacking high accuracy, RHR is unsuitable for practical application despite its low time cost. In addition, the low computational cost of the proposed scheme is promising for real-time hemorrhage detection from WCE videos.
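Per-frame classification time of the kind reported in Table 5 can be measured with a simple wall-clock loop; a sketch with a stand-in linear decision rule (illustrative only, not the paper's trained SVM):

```python
import time
import numpy as np

def classify(features, weights, bias):
    # Stand-in linear decision rule: sign(w . x + b)
    return 1 if features @ weights + bias >= 0 else -1

rng = np.random.default_rng(0)
frames = rng.normal(size=(1000, 6))        # 1000 frames x 6 features (synthetic)
weights, bias = rng.normal(size=6), 0.0    # stand-in model parameters

start = time.perf_counter()
labels = [classify(f, weights, bias) for f in frames]
seconds_per_frame = (time.perf_counter() - start) / len(frames)
```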
5.6. Discussion

In this section we provide an analysis of the results of our experiments. Although it was hinted at in the 'Our Method' subsection of Section 1, here we attempt to add a few more details on the following question: what previously unresolved issues was the proposed algorithm able to overcome? A few factors contributed to the high values of accuracy, sensitivity and specificity and the low execution time, and made this work an important
Figure 6: (a)-(c) Endoscopic Video Frames Containing Tiny Bleeding Regions and (d)-(f) The Corresponding Magnitude Spectrums. Bleeding Regions are Indicated With Blue Contours in (a)-(c). Whole Image and Patch Based Methods Often Fail to Classify These Frames.
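Magnitude spectra of the kind shown in Fig. 6 (d)-(f) are produced by a centered 2-D DFT; as a sketch (a toy array stands in for a WCE frame channel, and the log scaling is an assumption made here for display purposes):

```python
import numpy as np

def magnitude_spectrum(channel):
    # Centered 2-D DFT magnitude; log1p compresses the dynamic range for display
    f = np.fft.fftshift(np.fft.fft2(channel))
    return np.log1p(np.abs(f))

frame = np.eye(8)                 # toy "image": a diagonal edge
spec = magnitude_spectrum(frame)  # diagonal structure concentrates energy off-axis
```

For this toy diagonal pattern, the DFT energy lies along a diagonal of the spectrum, mirroring the non-horizontal, non-vertical lines the method exploits.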
step toward fully automated GI hemorrhage detection. Firstly, for small bleeding regions the proposed method works better than whole-image or patch-based methods. Fig. 6 (a)-(c) show three frames containing small bleeding regions. These tiny bleeding regions span a very small number of selected patches; therefore, they do not contribute much to the feature values, risking misclassification in a patch-based approach. Again, whole-image based methods rely on various statistical features of the entire image, and a few bleeding pixels hardly alter the values of such features, resulting in missed detections and hence low sensitivity. In the frequency domain, as seen in Fig. 6 (d)-(f), these bleeding frames are marked by the appearance of non-horizontal and non-vertical lines, making them easier to capture with textural features. As a result, our method was able to solve the small-bleeding-region problem by carrying out the feature extraction process entirely in the frequency domain and considering features from the whole magnitude spectrum. This also accounts for the high accuracy of the proposed scheme.

Figure 7: (a) Non-bleeding Image Containing Villi and Cavity and (b) Its Magnitude Spectrum. Note the absence of diagonal lines in the magnitude spectrum that facilitates proper classification.

Secondly, the proposed method was also able to overcome the limitation of [7] in detecting non-bleeding frames that contain fluid, intestinal villi and cavity, which make the hue of the image dark red. This is illustrated in Fig. 7. The non-bleeding WCE frame in Fig. 7(a) often gets misclassified as bleeding due to its dark red region created by intestinal villi. However, as the intensity varies slowly from dark red to the bright region, diagonal lines are not generated in the ±45° directions of the magnitude spectrum, unlike Fig. 6, and the image is not misclassified as bleeding. Thirdly, the efficacy of our method lies partially
in the choice of a discriminatory set of features, as we can see from the p-values in Table 1. The scatter diagrams of Fig. 4 and the results of Table 2 also reveal that the individual features perform quite well. In fact, this gives rise to the question: if the performance of the features considered individually is good, why are we using all six features in conjunction instead of only one? The reason is that using only one feature would make the algorithm less robust. In other words, there would be greater variation of algorithmic performance among various data-sets, rendering the detection scheme less reliable. Furthermore, an algorithm reliant on a single feature is likely to fail to capture the differences between bleeding and non-bleeding magnitude spectrums. Since we have employed six features to capture these differences, it is very likely that one or more features will always be able to capture the difference mathematically, irrespective of the data-set. Besides, a detection algorithm based on only one feature is rather ambitious and cannot ensure that it will work in real-world settings.

We now explain the benefit and justification of including our proposed feature, namely the difference average (DA). The DA is proposed with a view to capturing textural information from the NGLCM. It signifies the mean value of the intensity differences of the NGLCM. It is also related to Difference Variance (DV). DV measures the dispersion of the elements of the NGLCM with respect to DA. If the elements of the NGLCM exhibit greater variability with respect to one another, the value of DA will be higher; the opposite scenario will cause DA to become lower. Fig. 8 illustrates the box-whisker plot of DA on the test dataset.

Figure 8: Box-whisker plot of DA suggests good variation of the feature in the two classes.

The non-overlapping notches (marked in red) of the two box-whisker plots manifest the difference of DA values between the two classes. We have also calculated the mean and SD values of DA for the test dataset, as presented in Table 6. It is seen that the mean and SD values of DA for the two classes are significantly different. Therefore, DA is an efficacious feature that extracts important classification-related information from the NGLCM. DA can also be implemented for other
Table 6: Mean and Standard Deviation of DA on the test dataset

              Bleeding                Non-Bleeding
Channel       Mean      SD            Mean        SD
R             .0013     .00036526     .00074454   .0014
G             .0013     .00036531     .000748     .0014
B             .0013     .00036556     .0007518    .0014
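As a sketch of the proposed feature, DA (and the related DV) can be computed from a normalized co-occurrence matrix using the standard difference-statistics definitions, which are consistent with the description above; the matrix here is a toy example, not taken from the paper's data:

```python
import numpy as np

def difference_average_and_variance(p):
    # p: normalized co-occurrence matrix (entries sum to 1), shape (L, L)
    L = p.shape[0]
    # p_diff[d] = sum of p[i, j] over all pairs with |i - j| == d
    k = np.abs(np.subtract.outer(np.arange(L), np.arange(L)))
    p_diff = np.array([p[k == d].sum() for d in range(L)])
    da = np.sum(np.arange(L) * p_diff)              # difference average (DA)
    dv = np.sum((np.arange(L) - da) ** 2 * p_diff)  # difference variance (DV)
    return da, dv

p = np.array([[0.2, 0.1, 0.0],
              [0.1, 0.3, 0.1],
              [0.0, 0.1, 0.1]])
da, dv = difference_average_and_variance(p)
```

A matrix whose mass sits on the diagonal (similar neighboring values) yields a low DA, while off-diagonal mass raises it, matching the behavior described above.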
Figure 9: Two Misclassified Frames: (a) False Detection (b) Missing Detection.
image analysis and classification problems where the NGLCM is employed. We now discuss some misclassification cases of the proposed algorithm. Fig. 9 shows two cases of false and missing detections. It is found that images containing bubbles and specular reflections exhibit patterns similar to bleeding in the frequency domain; as a result, these frames can also get classified as bleeding. Again, the proposed scheme fails to detect bleeding frames if the bleeding region is completely engulfed by intestinal villi or cavity. However, compared to the thousands of images that have to be screened in one examination, these frames are too few in number to drastically affect classification accuracy.
6. Future Directions

In this section, we identify some directions for future research and possible areas to which this work can be extended. Firstly, the algorithm's computational cost can be further reduced by implementing it on a CUDA-based Graphics Processing Unit (GPU). Secondly, the proposed approach can be applied to other image classification problems such as medical image classification, image retrieval and remote sensing. Thirdly, the proposed method can be extended to other CAD problems of WCE videos such as Crohn's disease detection, ulcer detection and tumor detection. Fourthly, future studies can explore combining multiple classifiers to enhance the algorithm's performance; classifier boosting can be adopted for better classification performance as well. Fifthly, how the algorithm behaves for different choices of the position operator P may also be an interesting topic of further research.
7. Conclusion

In this work, the problem of computer-aided GI bleeding detection was addressed by selecting feature descriptors from the NGLCM of the images in the frequency domain. The accuracy of the scheme is promising. The proposed algorithm was evaluated against previously published works, and the results of the performance comparison were also significant. The superiority of the algorithm was further confirmed by statistical hypothesis testing and graphical analysis. We can expect the proposed method to be well suited to practical implementation, since it does not require any human intervention such as selecting informative patches, thereby making WCE technology less problematic and more convenient for both patients and clinicians. We thus conclude, as the experimental results suggest, that the devised scheme is simple, yet effective and efficient.
Acknowledgment

The authors would like to thank Given Imaging Ltd. for generously providing the WCE data (www.capsuleendoscopy.org).
References

[1] G. Iddan, G. Meron, A. Glukhovsky, P. Swain, Wireless capsule endoscopy, Nature 405 (2000) 417.

[2] http://www.fda.gov/cdrh/mda/docs/k010312.html.

[3] http://www.givenimaging.com.

[4] S. Liangpunsakul, L. Mays, D. K. Rex, Performance of Given suspected blood indicator, Am J Gastroenterol 98 (12) (2003) 2676–2678.

[5] G. Pan, G. Yan, X. Qiu, J. Cui, Bleeding detection in wireless capsule endoscopy based on probabilistic neural network, Journal of Medical Systems 35 (6) (2011) 1477–1484.

[6] J. Liu, X. Yuan, Obscure bleeding detection in endoscopy images using support vector machines, Optimization and Engineering 10 (2) (2009) 289–299.

[7] Y. Fu, W. Zhang, M. Mandal, M.-H. Meng, Computer-aided bleeding detection in WCE video, IEEE Journal of Biomedical and Health Informatics 18 (2) (2014) 636–642.

[8] B. Penna, T. Tillo, M. Grangetto, E. Magli, G. Olmo, A technique for blood detection in wireless capsule endoscopy images, in: Proc. of the 17th European Signal Processing Conference (EUSIPCO09), Glasgow, Scotland, 2009, pp. 1864–1868.

[9] B. Li, M.-H. Meng, Computer-aided detection of bleeding regions for capsule endoscopy images, IEEE Transactions on Biomedical Engineering 56 (4) (2009) 1032–1039.

[10] M. Coimbra, J. Cunha, MPEG-7 visual descriptors' contributions for automated feature extraction in capsule endoscopy, IEEE Transactions on Circuits and Systems for Video Technology 16 (5) (2006) 628–637.

[11] R. Kumar, Q. Zhao, S. Seshamani, G. Mullin, G. Hager, T. Dassopoulos, Assessment of Crohn's disease lesions in wireless capsule endoscopy images, IEEE Transactions on Biomedical Engineering 59 (2) (2012) 355–362.

[12] B. Li, M.-H. Meng, Tumor recognition in wireless capsule endoscopy images using textural features and SVM-based feature selection, IEEE Transactions on Information Technology in Biomedicine 16 (3) (2012) 323–329.

[13] B. Giritharan, X. Yuan, J. Liu, B. Buckles, J. Oh, S. J. Tang, Bleeding detection from capsule endoscopy videos, in: 30th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS 2008), 2008, pp. 4780–4783.

[14] B. Li, M. Q.-H. Meng, J. Y. Lau, Computer-aided small bowel tumor detection for capsule endoscopy, Artificial Intelligence in Medicine 52 (1) (2011) 11–16.

[15] M. Boulougoura, E. Wadge, V. Kodogiannis, H. S. Chowdrey, Intelligent systems for computer-assisted clinical endoscopic image analysis, Acta Press.

[16] S. Hwang, J. Oh, J. Cox, S. J. Tang, H. F. Tibbals, Blood detection in wireless capsule endoscopy using expectation maximization clustering, in: Proc. SPIE, Vol. 6144, 2006, pp. 1–11.

[17] P. Y. Lau, P. Correia, Detection of bleeding patterns in WCE video using multiple features, in: 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS 2007), 2007, pp. 5601–5604.

[18] Scale invariant texture descriptors for classifying celiac disease, Medical Image Analysis 17 (4) (2013) 458–474.

[19] J. Cunha, M. Coimbra, P. Campos, J. M. Soares, Automated topographic segmentation and transit time estimation in endoscopic capsule exams, IEEE Transactions on Medical Imaging 27 (1) (2008) 19–27.

[20] M. Liedlgruber, A. Uhl, Computer-aided decision support systems for endoscopy in the gastrointestinal tract: A review, IEEE Reviews in Biomedical Engineering 4 (2011) 73–88.

[21] C.-M. Pun, M.-C. Lee, Log-polar wavelet energy signatures for rotation and scale invariant texture classification, IEEE Transactions on Pattern Analysis and Machine Intelligence 25 (5) (2003) 590–603.

[22] G. Liu, Z. Lin, Y. Yu, Radon representation-based feature descriptor for texture classification, IEEE Transactions on Image Processing 18 (5) (2009) 921–928.

[23] X. Zhang, N. Younan, C. O'Hara, Wavelet domain statistical hyperspectral soil texture classification, IEEE Transactions on Geoscience and Remote Sensing 43 (3) (2005) 615–618.

[24] S. Karkanis, D. Iakovidis, D. Maroulis, D. Karras, M. Tzivras, Computer-aided tumor detection in endoscopic video using color wavelet features, IEEE Transactions on Information Technology in Biomedicine 7 (3) (2003) 141–152.

[25] R. C. Gonzalez, R. E. Woods, Digital Image Processing, 3rd Edition, Prentice-Hall, Upper Saddle River, NJ, USA, 2006.

[26] T. Pappas, D. Neuhoff, H. de Ridder, J. Zujovic, Image analysis: Focus on texture similarity, Proceedings of the IEEE 101 (9) (2013) 2044–2057.

[27] J. Portilla, E. P. Simoncelli, A parametric texture model based on joint statistics of complex wavelet coefficients, International Journal of Computer Vision 40 (1) (2000) 49–70.

[28] P. Chiranjeevi, S. Sengupta, Moving object detection in the presence of dynamic backgrounds using intensity and textural features, Journal of Electronic Imaging 20 (4) (2011) 043009.

[29] R. Haralick, K. Shanmugam, I. Dinstein, Textural features for image classification, IEEE Transactions on Systems, Man and Cybernetics SMC-3 (6) (1973) 610–621.

[30] C.-L. Liu, K. Nakashima, H. Sako, H. Fujisawa, Handwritten digit recognition: benchmarking of state-of-the-art techniques, Pattern Recognition 36 (10) (2003) 2271–2285.

[31] L. Wang, Support Vector Machines: Theory and Applications, Springer-Verlag, New York, 2005.

[32] C.-C. Chang, C.-J. Lin, LIBSVM: A library for support vector machines, ACM Transactions on Intelligent Systems and Technology 2 (2011) 27:1–27:27.

[33] http://www.cse.oulu.fi/cmv/downloads/lbpmatlab.
Conflicts of Interest

The authors declare that they have no conflict of interest.