Accepted Manuscript
Title: A new thresholding technique based on fuzzy set as an application to leukocyte nucleus segmentation
Author: V.P. Ananthi, P. Balasubramaniam
PII: S0169-2607(16)30171-7
DOI: http://dx.doi.org/doi: 10.1016/j.cmpb.2016.07.002
Reference: COMM 4188
To appear in: Computer Methods and Programs in Biomedicine
Received date: 25-2-2016
Revised date: 19-5-2016
Accepted date: 1-7-2016
Please cite this article as: V.P. Ananthi, P. Balasubramaniam, A new thresholding technique based on fuzzy set as an application to leukocyte nucleus segmentation, Computer Methods and Programs in Biomedicine (2016), http://dx.doi.org/doi: 10.1016/j.cmpb.2016.07.002.
A new thresholding technique based on fuzzy set as an application to leukocyte nucleus segmentation V. P. Ananthi†, P. Balasubramaniam †,* †Department of Mathematics, Gandhigram Rural Institute - Deemed University, Gandhigram 624 302, Tamilnadu, India.
Highlights
• A new fuzzy method is introduced to segment leukocytes in blood smear images.
• An interval-valued intuitionistic fuzzy set is generated by maximizing ultrafuzziness.
• The similarity between the ideally thresholded image and the segmented image is computed.
• The best threshold is obtained by maximizing similarity.
• Experiments show that the proposed method performs better than other fuzzy methods.
Abstract
Background and Objectives: The main aim of this paper is to segment leukocytes in blood smear images using interval-valued intuitionistic fuzzy sets (IVIFSs). Uncertainty in images generally appears as vagueness in the brightness levels. Such uncertain images can be processed efficiently using fuzzy sets, particularly IVIFSs.
Methods: A logarithmic membership function is used to compute the membership values corresponding to the pixel intensities. The non-membership function of the IVIFS is constructed using the Yager generating function. By varying a parameter, 256 IVIFSs are generated, and for each candidate threshold the IVIFS with maximum ultrafuzziness is selected. The threshold is then determined by finding the IVIFS with maximum similarity between the ideally segmented image and the segmented result obtained from the proposed method.
*Corresponding author. Tel.:+91-451-2452323 Email addresses:
[email protected] (V. P. Ananthi†),
[email protected] (P. Balasubramaniam†) This work was supported by UGC-BSR-Research fellowship in Mathematical Sciences – 2013-2014.
Results: The segmented images are evaluated quantitatively using precision-recall and receiver operating characteristic curves, the Jaccard coefficient and the structural similarity index measure, along with the time taken to segment the nucleus, and the results are compared with those of existing methods. The performance measures indicate that the proposed method segments leukocytes better than other comparable methods.
Conclusions: Segmentation of leukocytes using the proposed method helps the analyst to differentiate the various types of leukocytes and to determine the leukocyte count, which is essential for detecting diseases related to a deficit or surplus of these cells.
Keywords: Hesitation degree, Similarity measure, Threshold, Ultrafuzziness, Fuzzy set, Leukocytes
2000 MSC: 94A08, 03F55.
1.
Introduction
Segmentation is crucial in the field of computer vision. Various algorithms are available in the literature for the segmentation of microarray images [1], breast images [2, 3], skin images [4] and so on. The immune system protects the body from harmful bacteria, viruses and other pathogens by finding and eliminating them. Leukocytes are the primary constituent of the immune system and play a vital role in identifying and destroying pathogens that intrude into the body. These cells originate from a multipotent cell in the bone marrow. Leukocytes are mainly divided into two types, namely granulocytes (neutrophils, eosinophils and basophils) and agranulocytes (lymphocytes and monocytes), as shown in Figure 1. Normally, each class of leukocytes occupies a certain range in the blood stream: lymphocytes, monocytes, eosinophils, basophils and neutrophils represent, respectively, less than 20-35%, 3-9%, 5%, 1% and 50-70% of all leukocytes. Each class of leukocytes must lie within its normal range in a differential leukocyte report. If the amount of leukocytes rises above or falls below this range, health issues may occur. Hence the leukocyte count, which varies with the age of the person, is used as an indicator for detecting disease. Counting leukocytes is a tedious and time-consuming job for a pathologist. Moreover, the analysis of leukocytes by a pathologist relies purely on knowledge, eyesight and stamina. Hence it is essential to have an automatic system to detect and classify the types of leukocytes [5].
Currently available automated cell counters are based on laser light scattering and flow-cytochemical principles, yet 21% of all processed blood samples still require microscopic review by experts [6]. Hence, numerous efforts [6, 7, 8, 9, 10] have been made towards automatic cell analysis using image processing. Blood cell images contain both white and red blood cells scattered across the entire image; however, it is the white blood cells (WBCs) that provide the important information for patient diagnosis, for example of leukemia or cancer [7]. In WBC segmentation, an important task is to extract the WBCs from a complicated background and then to segment them into nuclei and cytoplasm.
Because leukocytes are pale and translucent, they cannot be seen clearly through a microscope until they are stained. For instance, a mixture of methylene blue and eosin based stains is used for staining blood smears. The effect of staining depends on the staining time, the temperature and the concentration of the solution. So, one cannot be sure that blood smear images produced in a single laboratory and examined by the same pathologist are exactly alike. To overcome this problem, images can be analyzed using rigorous algorithms for accurate and effective identification and classification of normal and abnormal cells. Computer-aided techniques have been widely employed in medical diagnosis over the last decades. In recent years, many researchers [5, 11, 12, 13, 14, 15, 16, 17] have concentrated on formulating an automated system for identifying and classifying leukocytes. Accurate segmentation of leukocytes is still an unsolved problem, and several challenges remain.
Several segmentation techniques have been introduced by various authors for different kinds of images, and a few of them address pathological images [18]. Huang et al. [5] introduced a computerized recognition method for classifying five types of leukocytes. The leukocyte count in blood is essential for detecting diseases such as leukemia, parasitic fever and so on. Such diseases have been detected by various methods, some of which are based on morphological analysis, color analysis [14], clustering [15] and multi-spectral techniques [19]. Yang et al. [16] segmented leukocytes based on the components of color spaces. A texture based recognition approach was introduced in 2004 to recognize the types of leukocytes [20]. Neural networks and a fuzzy logic approach have been implemented to segment leukocytes in [21]. Simulated visual attention has been used in [22] to identify leukocytes. Segmentation and counting of the lobes of the leukocyte nucleus were introduced in 2010 by Chan et al. [23], who found that the number of lobes increases in cases of vitamin B12 deficiency and leukemia.
In this paper a new automatic segmentation method is introduced based on an interval-valued intuitionistic fuzzy similarity measure. Initially, the blood smear image in the RGB color model is converted into the HSI color model, since the S-channel in HSI space shows leukocytes in blood smear images efficiently [5]. Then the contrast-enhanced S-channel is thresholded with $T \in [0,255]$. For each T, the image is divided into three regions according to the means of the background and the object at the threshold T. The upper and lower membership functions of the IVIFS are defined with one free variable $\sigma \in [0,255]$, and for each σ the ultrafuzziness of the IVIFS is determined. An appropriate σ for an image is found by maximizing the ultrafuzziness. The upper and lower membership degrees of the IVIFS are calculated with this σ, and the corresponding non-membership values are determined using the Yager generating function. Then the similarity between the ideally thresholded image and the image segmented with each threshold T is calculated. Finally, the optimal threshold is identified as the one with maximum similarity. Half of the optimal threshold value segments the image into meaningful regions.
The organization of the paper is as follows. Section 2 briefly discusses some works related to the present study and the need for the proposed method. Basic ideas about fuzzy sets and their extensions are provided in Section 3. Section 4 gives a detailed description of the segmentation methodology based on IVIFSs. Section 5 presents the quantitative metrics used to evaluate the effectiveness of the proposed method against existing methods. Section 6 reports the experimental results and their performance according to the evaluation metrics. Conclusions are drawn, along with future directions, in Section 7.
2.
Related works and the present study
In this section, some works related to the segmentation of white blood cells are reviewed, grouped by the category of segmentation.
2.1.
Color/gray level based works
A color gradient vector flow (GVF) active contour model was proposed in 2005 for WBC segmentation, in which a color gradient and L2E robust estimation are incorporated into the traditional GVF snake in Luv color space to segment both nuclei and cytoplasm [24]. Its segmentation performance was shown to be better than that of a mean-shift approach and the traditional color GVF snake. However, the GVF model fails to distinguish weak textures during segmentation. In [25], automatic thresholding using the green color component was introduced to provide a high contrast between the nucleus and the cytoplasm. Basic morphological processing was used to remove the remaining noise in the image, and the nuclei and cytoplasm were segmented using an adaptive contour. However, some WBCs (neutrophils and eosinophils) can have more than one nucleus in a cell, making it difficult to discriminate accurately between the cytoplasm and the nuclei. A watershed transform based on connectivity was proposed for the segmentation of nuclei in [26], where an image forest transform was used. The cytoplasm was separated from the background and the RBCs by finding the size distribution, for which a series of morphological openings was performed using structuring elements of increasing size. Problems were encountered when the shape of the cytoplasm is not round and when the size changes according to the type of WBC. In [8], a nuclei and cytoplasm segmentation method was introduced for gray images using a GVF snake and the Zack thresholding algorithm [27]. First, sub-images were manually selected and a Canny edge detector was applied to identify the nuclei using a GVF snake in a gray level image. After subtracting the nuclei, the cytoplasm was segmented from the rest of the sub-image using the Zack thresholding algorithm. However, manual selection is still required for the sub-images, and distinguishing the cytoplasm from the nucleus is difficult when a cell includes multiple nuclei. Furthermore, since RBCs often have an intensity similar to that of the cytoplasm, a gray-level histogram is not efficient for segmenting the cytoplasm from RBCs. This study also did not include a performance comparison with other methods and was evaluated on only 20 test images.
Leukocyte recognition using fuzzy divergence and modified thresholding techniques has been presented in [11], where the nuclei were segmented by applying Gamma, Gaussian and Cauchy-type fuzzy membership functions to the image pixels.
2.2.
Clustering based works
In [28], two feature space clustering techniques, scale-space filtering and watershed clustering, were used for WBC segmentation. In this scheme, the nuclei are extracted from a sub-image using scale-space filtering, while watershed clustering in a 3-D HSV histogram is used to extract the cytoplasm. Finally, morphological operations are performed to obtain all the connected WBC regions. While this method is effective for WBCs with simple cytoplasm, such as lymphocytes and monocytes, the watershed clustering is limited for complex WBC categories, such as basophils, eosinophils and neutrophils, which have complex textures, thick granules and a color similar to that of RBCs. A clustering of WBC components using a modified fuzzy c-means (FCM) clustering technique was proposed in [29]. The modification to the traditional FCM is based on an iterative replacement of falsely scattered colors using neighboring color data. However, manual cropping was required for the test images, and no performance comparison with other methods was carried out.
2.3.
Neural Network based works
In [30], a WBC image segmentation scheme was implemented with an on-line trained neural network. In this scheme, mean-shift and uniform sampling are used as an initialization tool to greatly reduce the training set. In addition, particle swarm optimization is adopted to train the neural network for fast convergence. While the WBC segmentation accuracy and running time improved compared with the traditional training method, the scheme fails to distinguish the nucleus from the cytoplasm. Threshold segmentation combining mathematical morphology and fuzzy cellular neural networks was demonstrated for WBC detection in [31]. However, despite its fast running speed and good detection results, it is unable to distinguish the nucleus from the cytoplasm.
2.4.
Necessity of the proposed method
Threshold segmentation using a divergence measure based on exponential intuitionistic fuzzy entropy was introduced in [17] to segment the leukocyte nucleus. The JATI method first extracts regions of interest (ROIs) using a bounding box. An exponential membership function is used to reduce noise in the image during segmentation. The method iteratively increases the threshold value up to the maximum gray level and finds the thresholds for two regions with minimum divergence measure. However, the value of σ is fixed, which may not be appropriate if the method is applied to a raw RGB image. That is, it may work efficiently on ROIs but not on raw RGB images. Also, the membership function maps each intensity of the ROI to a single constant value, which leaves hesitation as to whether the mapped intensity is appropriate or not. Therefore, the main contribution of the present study is to improve the segmentation accuracy of the method of Jati et al. [17] on raw RGB images by using an interval of membership values instead of a constant value. Also, in [13], WBCs have been segmented based on a gradient vector flow snake model. There, the authors segmented WBCs by using a probability map to crop the raw WBC image into sub-images, each containing a single WBC. If the method in [13] is applied directly to a raw WBC image, it shows some over-segmented parts. Also, the selection of the initial mask is tedious for multiple ROIs in a single image. In contrast, the proposed method based on IVIFSs can be applied directly to the raw image without any ROI extraction.
3.
Preliminary concepts on fuzzy set theory
Generally, fuzzy sets (FSs), or Type I FSs, are applied to deal with imprecise and uncertain data. Fuzzy sets contain membership and non-membership degrees, where the non-membership value is the complement of the membership value. The membership function is user defined and may be based on triangular, Gaussian, Gamma, exponential, logarithmic or restricted equivalence functions. Thus, one cannot be sure which membership function portrays an image appropriately. To address this hesitation in the selection of the membership function, a hesitation degree has been introduced in intuitionistic fuzzy sets (IFSs) (see [32]). Segmentation of leukocytes in blood smear images has been studied in [33] using an intuitionistic fuzzy divergence measure. Jati et al. [17] used IFSs to segregate the nucleus of leukocytes in blood cells and showed that their method produced better results than Otsu's and other fuzzy based methods. Moreover, if the user is not sure that the membership values generated by the chosen membership function perfectly describe the greyness of the image, then uncertainty arises in the membership of the FS. This can be addressed by using Type II FSs (T2FSs) [34]. Tizhoosh [35] segmented images by T2FS based thresholding. Chaira [33] used T2FSs to segment leukocytes in blood cells by employing a divergence measure. After the introduction of interval-valued intuitionistic fuzzy sets (IVIFSs) by Atanassov [36], Bustince and Burillo [37] showed that the uncertainty in the membership of FSs can be eliminated by using IVIFSs, in which the membership is an interval rather than an exact number.
Before describing the proposed method, let us briefly discuss the representation of digital images in the various fuzzy domains. Fuzzy set theory was initially introduced to deal with vagueness. Let $Z = \{z_1, z_2, \ldots, z_n\}$ be a finite set. A fuzzy set $F^{I}$ of Z is defined as
$$F^{I} = \{(z, \mu_{F^{I}}(z)) \mid z \in Z\},$$
where $\mu_{F^{I}}(z) : Z \to [0,1]$ denotes the degree of membership of an element z in Z. The non-membership of z is the complement of $\mu_{F^{I}}$, that is, $1 - \mu_{F^{I}}$.
A fuzzy singleton is a fuzzy set whose support is a single point. Based on FSs, an image of size M × N is considered as an M × N array of fuzzy singletons. FSs have the capability to absorb uncertainty. Uncertainty or vagueness exists in images when defining image properties such as brightness, edges and so on. Such vagueness arises during the acquisition of blood smear images by a digital microscope due to poor illumination. Hence, the brightness value of the image pixels is uncertain. It should be noted that the membership function is chosen intuitively by the user from among the available membership functions. The primary problem faced with FSs is the assignment of an appropriate value to an uncertain pixel using the selected membership function. To obtain a more robust set that removes the uncertainty left by Type I FSs, a T2FS may be used [35]. A T2FS $F^{II}$ of Z is defined as
$$F^{II} = \{((z, x), \mu_{F^{II}}(z, x)) \mid z \in Z,\ x \in [0,1]\},$$
where $x = \mu_{F^{I}}(z) \in [0,1]$ is the primary membership function of z and $\mu_{F^{II}}(z, x) : [0,1] \to [0,1]$ is the secondary membership function for the particular pair (z, x). However, the user remains hesitant when selecting, from the available choices, the membership function that defines the image appropriately without uncertainty. This type of hesitation is handled by the introduction of the intuitionistic fuzzy index in IFSs [32]. An IFS F of Z is defined as
$$F = \{(z, \mu_{F}(z), \nu_{F}(z), \pi_{F}(z)) \mid z \in Z\},$$
where $\mu_{F}(z), \nu_{F}(z) : Z \to [0,1]$ denote the degree of membership and non-membership of an element z in Z, respectively, with the essential condition that $\mu_{F}(z) + \nu_{F}(z) + \pi_{F}(z) = 1$. Bustince and Burillo [37] have pointed out that using an interval of membership values instead of an exact membership value is more realistic. This can be achieved by utilizing IVIFSs. An IVIFS $\tilde{F}$ in Z is expressed as
$$\tilde{F} = \{(z, M_{\tilde{F}}(z), N_{\tilde{F}}(z)) \mid z \in Z\},$$
where $M_{\tilde{F}}, N_{\tilde{F}} : Z \to [0,1]$ are the membership and non-membership intervals satisfying the condition that $0 \le MU_{\tilde{F}} + NU_{\tilde{F}} \le 1$. Here $MU_{\tilde{F}}$ and $NU_{\tilde{F}}$ denote the upper limits of the membership and non-membership functions, respectively.
4.
Segmentation of leukocyte nucleus based on IVIFSs
The procedure for finding a threshold for the segmentation of the leukocyte nucleus using IVIFSs, along with some preprocessing tasks, is described in the following subsections.
4.1.
Blood samples preparation
Because leukocytes are pale, it is necessary to stain them. Usually they are stained using methylene blue and eosin. After blood is taken from a person, it should be stained within an hour; otherwise the blood starts coagulating. If a thin film cannot be prepared within an hour, ethylenediamine tetra-acetate (EDTA) dipotassium salt solution is added as an anticoagulant.
4.2.
Acquisition of microscopic image
An Olympus CH20i microscope is used to capture blood smear images under a 100× oil objective (Abbe condenser N.A. 1.25 (oil immersion), with aperture iris diaphragm). A few original blood smear images are shown in Figure 2.
4.3.
Noise reduction
Noise may enter digital images during acquisition or during transmission through electronic channels. Segmenting noisy biomedical images produces inaccurate results, which can lead to severe issues. This paper uses an adaptive median filter to remove noise from the blood smear images, as in [38].
4.4.
Color space conversion
The main purpose of a color space is to facilitate the specification of colors [39]. The RGB model is the most commonly used color model in electronic devices. In the RGB model, the green channel shows the nuclei of leukocytes well, but this does not hold for all blood smear images [5]. So, this paper opts for the HSI color model, which mimics the human visual system. In HSI color space, H, S and I denote hue, saturation and intensity. The S-channel of the HSI image shows the leukocyte nuclei more clearly than the other two channels. The S-channels in HSI space of the original blood smear images (shown in Figure 2) are shown in Figure 3.
4.5.
Enhancement of S-channel
Biomedical images are usually poorly illuminated due to various experimental conditions [40]. One of the significant issues in leukocyte segmentation is poor contrast. Hence, further processing is carried out after enhancing the S-channel of the HSI image. This paper uses gamma equalization to enhance the contrast of the S-channel. The enhanced image R is obtained using the following equation
$$R_{ij} = S_{\max}\left(\frac{S_{ij} - S_{\min}}{S_{\max} - S_{\min}}\right)^{\gamma}, \tag{1}$$
where γ ∈ (0, 1), and $S_{\max}$ and $S_{\min}$ denote the maximum and minimum saturation values of the image S, respectively. Figure 4 shows the enhanced S-channels of the blood smear images given in Figure 3.
4.6.
Proposed methodology to find an appropriate threshold
A blood smear image usually comprises background, leukocytes, erythrocytes and platelets. A stained blood smear image can be divided into three regions according to intensity: the background and the leukocytes constitute two separate regions, and the erythrocytes and platelets together constitute a single region. Dividing the image into three regions requires two thresholds [17]. Selecting two thresholds for an image from the intensity range is more tedious and time consuming than fixing a single threshold. Here, the three regions are instead obtained from a single threshold, using the means of the background and the foreground of the image at that threshold as the region boundaries. The basic idea behind the proposed method is to find the threshold T for which the interval-valued intuitionistic fuzzy similarity between the ideally segmented image and the image segmented with that threshold is maximum. The proposed algorithm for determining the nucleus of leukocytes is briefly described as follows. A flowchart of the proposed method is given in Figure 5.
Step 1: To find the optimal threshold T, a single outer loop is required, with T varying from 0 to 255.
Step 2: For the specific value of T, the enhanced S-channel R of the blood smear image, of size M × N, is divided into three regions r1, r2, r3 as
$$R = \begin{cases} r_1, & \text{if } 0 \le R_{ij} < m_b(T); \\ r_2, & \text{if } m_b(T) \le R_{ij} < m_o(T); \\ r_3, & \text{if } m_o(T) \le R_{ij} \le 255, \end{cases} \tag{2}$$
where $R_{ij}$ denotes the saturation value of the $(i, j)$th pixel of the image R, and
$$m_o(T) = \frac{\sum_{R_{ij}=T+1}^{255} R_{ij}\, h(R_{ij})}{\sum_{R_{ij}=T+1}^{255} h(R_{ij})} \quad \text{and} \quad m_b(T) = \frac{\sum_{R_{ij}=0}^{T} R_{ij}\, h(R_{ij})}{\sum_{R_{ij}=0}^{T} h(R_{ij})}$$
are the means of the object and the background at the threshold T. Here $h(R_{ij})$ counts the total number of pixels in the image R with saturation value $R_{ij}$.
Step 3: To compute the membership function with the variable parameter $\sigma \in [0,255]$, a single inner loop is required, with σ varying from 0 to 255.
Step 4: The mean $m_k$ of the saturation values of each of the three regions $r_k$, k = 1, 2, 3, is calculated by using the equation
$$m_k = \frac{\sum_{R_{ij}=\min_{r_k}(R_{ij})}^{\max_{r_k}(R_{ij})} R_{ij}\, h(R_{ij})}{\sum_{R_{ij}=\min_{r_k}(R_{ij})}^{\max_{r_k}(R_{ij})} h(R_{ij})}, \tag{3}$$
where $\min_{r_k}(R_{ij})$ and $\max_{r_k}(R_{ij})$ denote the minimum and maximum saturation values of $R_{ij}$ in the region $r_k$ for each k = 1, 2, 3.
Step 5: The membership value $\mu(R_{ij}, \sigma)$ of the IFS for each pixel at the location (i, j), with $\sigma \in [0,255]$, is computed using the logarithmic membership function given by
$$\mu(R_{ij}, \sigma) = \exp\left(-\log(2)\left|\frac{R_{ij} - m_k}{\sigma}\right|^{2}\right), \tag{4}$$
where $m_k$ denotes the mean of the region $r_k$, k = 1, 2, 3, obtained from Step 4.
Step 6: Compute the lower and upper membership functions ML and MU of the IVIFS $\tilde{F}$ for each σ by
$$ML(R_{ij}, \sigma) = \mu(R_{ij}, \sigma)^{1/\alpha}, \tag{5}$$
$$MU(R_{ij}, \sigma) = \mu(R_{ij}, \sigma)^{\alpha}, \tag{6}$$
where $R_{ij}$ is the $(i, j)$th saturation value of the image pixel and α ∈ (0, 1).
Step 7: Compute the ultrafuzziness of the IVIFS $\tilde{F}$ for each σ based on the following equation
$$\Gamma(\tilde{F}) = \frac{1}{MN}\sum_{R_{ij}} h(R_{ij})\,\bigl(MU(R_{ij}, \sigma) - ML(R_{ij}, \sigma)\bigr). \tag{7}$$
Step 8: Find the IVIFS $\tilde{F}$ with maximum ultrafuzziness. Choose the σ that corresponds to the set with maximum ultrafuzziness and fix it for that image.
Step 9: Compute MU and ML of the IVIFS $\tilde{F}$ with that fixed σ.
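Steps 2 to 9 can be illustrated with a minimal Python sketch that works on a 256-bin histogram of the enhanced S-channel. It is only a sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, α = 0.5 is an arbitrary value from (0, 1), σ is searched from 1 to avoid division by zero, and the membership function is assumed to be exp(-log(2) (|g - m_k| / σ)^2), which equals 0.5 when a gray level lies σ away from its region mean.

```python
import math

def class_means(hist, T):
    """Background mean m_b(T) over levels 0..T and object mean m_o(T) over T+1..255."""
    def mean(levels):
        n = sum(hist[g] for g in levels)
        return sum(g * hist[g] for g in levels) / n if n else 0.0
    return mean(range(0, T + 1)), mean(range(T + 1, 256))

def region_means(hist, m_b, m_o):
    """Mean saturation of r1 = [0, m_b), r2 = [m_b, m_o), r3 = [m_o, 255]; None if empty."""
    means = []
    for lo, hi in ((0, m_b), (m_b, m_o), (m_o, 256)):
        levels = [g for g in range(256) if lo <= g < hi]
        n = sum(hist[g] for g in levels)
        means.append(sum(g * hist[g] for g in levels) / n if n else None)
    return means

def region_index(g, m_b, m_o):
    """Index (0, 1 or 2) of the region that gray level g falls into."""
    return 0 if g < m_b else (1 if g < m_o else 2)

def membership(g, m_k, sigma):
    """ASSUMED membership form: 1 at the region mean, 0.5 at distance sigma from it."""
    return math.exp(-math.log(2) * (abs(g - m_k) / sigma) ** 2)

def ultrafuzziness(hist, T, sigma, alpha=0.5):
    """Histogram-weighted average of MU - ML, with ML = mu**(1/alpha) and MU = mu**alpha."""
    m_b, m_o = class_means(hist, T)
    means = region_means(hist, m_b, m_o)
    acc = 0.0
    for g in range(256):
        if hist[g] == 0:
            continue  # empty bins contribute nothing
        mu = membership(g, means[region_index(g, m_b, m_o)], sigma)
        acc += hist[g] * (mu ** alpha - mu ** (1.0 / alpha))
    return acc / sum(hist)

def best_sigma(hist, T, alpha=0.5):
    """Sigma in 1..255 that maximizes the ultrafuzziness for threshold T."""
    return max(range(1, 256), key=lambda s: ultrafuzziness(hist, T, s, alpha))
```

For a histogram `hist` and a candidate threshold `T`, `best_sigma(hist, T)` returns the σ that would be fixed in Step 8, after which the bounds in Step 9 follow from `membership` with that σ.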
Step 10: Estimate the lower and upper non-membership degrees NL and NU of the IVIFS using the Yager generating function [41] as given below
$$NL(R_{ij}) = \bigl(1 - MU(R_{ij})^{\beta}\bigr)^{1/\beta}, \tag{8}$$
$$NU(R_{ij}) = \bigl(1 - ML(R_{ij})^{\beta}\bigr)^{1/\beta}, \tag{9}$$
where β ∈ (0, 1). As σ is fixed, it is not included in the membership degrees MU and ML.
Step 11: Estimate the similarity S between the image R thresholded with T and the ideally thresholded image I using the measure provided below
$$S(I, R) = \frac{1}{MN}\sum_{i=1}^{MN} \frac{2 - \min\{|ML_I - ML_R|, |NL_I - NL_R|\} - \min\{|MU_I - MU_R|, |NU_I - NU_R|\}}{2 + \max\{|ML_I - ML_R|, |NL_I - NL_R|\} + \max\{|MU_I - MU_R|, |NU_I - NU_R|\}}. \tag{10}$$
Since the ideally segmented image I has membership value 1 and non-membership value 0, the above measure can be rewritten as
$$S(I, R) = \frac{1}{MN}\sum_{i=1}^{MN} \frac{2 - \min\{|1 - ML_R|, NL_R\} - \min\{|1 - MU_R|, NU_R\}}{2 + \max\{|1 - ML_R|, NL_R\} + \max\{|1 - MU_R|, NU_R\}}. \tag{11}$$
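The Yager non-membership bounds of Step 10 and the similarity to the ideal image in Eq. (11) can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, α = β = 0.5 are arbitrary values from (0, 1), the input is assumed to be a flat list of per-pixel membership values μ already computed with the fixed σ, and the signs in the similarity measure follow the reading 2 minus the minima over 2 plus the maxima.

```python
def yager_complement(m, beta=0.5):
    """Yager generating function: non-membership (1 - m**beta)**(1/beta)."""
    return (1.0 - m ** beta) ** (1.0 / beta)

def ivifs_bounds(mu, alpha=0.5, beta=0.5):
    """Return (ML, MU, NL, NU) for one membership value mu in [0, 1]."""
    ml = mu ** (1.0 / alpha)           # lower membership
    mu_up = mu ** alpha                # upper membership
    nl = yager_complement(mu_up, beta)  # lower non-membership comes from MU
    nu = yager_complement(ml, beta)     # upper non-membership comes from ML
    return ml, mu_up, nl, nu

def similarity_to_ideal(memberships, alpha=0.5, beta=0.5):
    """Average similarity to the ideal image (membership 1, non-membership 0)."""
    total = 0.0
    for mu in memberships:
        ml, mu_up, nl, nu = ivifs_bounds(mu, alpha, beta)
        lower = (abs(1.0 - ml), nl)
        upper = (abs(1.0 - mu_up), nu)
        total += (2.0 - min(lower) - min(upper)) / (2.0 + max(lower) + max(upper))
    return total / len(memberships)
```

In the outer loop over T, the threshold whose membership values give the largest `similarity_to_ideal` would be the one retained.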
Step 12: Check the stopping criterion: if $T < 255$, then set T = T + 1 and proceed from Step 2; otherwise end the process.
Step 13: Find the best threshold value T for which the similarity is maximum.
Step 14: Finally, segment the image R with half of the threshold T to extract the nucleus of leukocytes.
5.
Quantitative measures
Evaluation metrics are used to assess the performance of a segmentation algorithm [42], and one of the most commonly used metrics is accuracy. Various other performance metrics are also available in the literature.
5.1.
Accuracy
Accuracy is a metric normally used to report the overall classification quality and is computed as
$$\text{Accuracy} = \frac{tp + tn}{tp + tn + fp + fn},$$
where tp (true positive) is the number of positive samples correctly classified as positive, tn (true negative) is the number of negative samples correctly classified as negative, fp (false positive) is the number of negative samples incorrectly classified as positive and fn (false negative) is the number of positive samples incorrectly classified as negative.
5.2.
Precision
It computes the percentage of positive predictions made by the classifier that are correct:
$$\text{Precision} = \frac{tp}{tp + fp}.$$
5.3.
Recall
It computes the percentage of positive patterns that are correctly detected by the classifier and is mathematically defined as
$$\text{Recall} = \frac{tp}{tp + fn}.$$
5.4.
Precision-Recall and ROC curves
Precision-Recall curves show the relationship between precision and recall as the segmentation threshold (cut-off limit) varies. The receiver operating characteristic (ROC) curve is a graph that shows the relationship between the true positive rate (TPR) and the false positive rate (FPR) as the threshold (cut-off limit) varies. For a classifier, it shows that TPR cannot increase without an increase in FPR, where
$$TPR = \text{Recall}, \qquad FPR = \frac{fp}{\text{Total negative}}.$$
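The pixel-wise measures of Sections 5.1 to 5.4 can be illustrated with a short Python sketch. The function names and the flat-list representation of the masks are assumptions made for illustration; `truth` and `pred` are sequences of 0/1 (or boolean) labels per pixel, and a ratio with a zero denominator is reported as 0.0 by convention here.

```python
def confusion_counts(truth, pred):
    """Count tp, tn, fp, fn for two equally long sequences of binary labels."""
    tp = tn = fp = fn = 0
    for t, p in zip(truth, pred):
        if t and p:
            tp += 1
        elif not t and not p:
            tn += 1
        elif p:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn

def safe_div(num, den):
    """Ratio that falls back to 0.0 when the denominator is zero (a convention)."""
    return num / den if den else 0.0

def metrics(truth, pred):
    """Accuracy, precision, recall (TPR) and false positive rate (FPR)."""
    tp, tn, fp, fn = confusion_counts(truth, pred)
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "precision": safe_div(tp, tp + fp),
        "recall": safe_div(tp, tp + fn),
        "fpr": safe_div(fp, fp + tn),  # total negatives = fp + tn
    }

def roc_points(truth, scores, cutoffs):
    """One (FPR, TPR) point per cut-off; a pixel is positive when score >= cut-off."""
    points = []
    for c in cutoffs:
        m = metrics(truth, [s >= c for s in scores])
        points.append((m["fpr"], m["recall"]))
    return points
```

Sweeping `cutoffs` over the score range traces an ROC curve; replacing the returned pair with (recall, precision) would give the corresponding precision-recall curve.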
5.5.
Dice coefficient
The Dice similarity coefficient is used to show the level of similarity between the extracted leukocyte nucleus region and the manually segmented leukocyte nucleus region [43]. It is mathematically formulated as
$$\text{Dice}(A, B) = \frac{2\,|A \cap B|}{|A| + |B|},$$
where A is the manually segmented leukocyte nucleus region and B is the leukocyte nucleus region extracted using the proposed method. A Dice coefficient of 1 indicates perfect overlap between A and B, whereas a value of 0 indicates no overlap between A and B.
5.6.
Jaccard coefficient
The Jaccard coefficient [44] between the ground truth leukocyte nucleus region A and the leukocyte nucleus region B segmented by the proposed method is given by
$$\text{Jaccard}(A, B) = \frac{|A \cap B|}{|A \cup B|}.$$
5.7.
Measure for structural similarity
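Another way of evaluating the quality of the image is to use the structural similarity index measure (SSIM). SSIM is built from structure, luminance and contrast, and is defined as
$$SSIM = \frac{(2\mu_A\mu_B + K_1)(2\sigma_{AB} + K_2)}{(\mu_A^2 + \mu_B^2 + K_1)(\sigma_A^2 + \sigma_B^2 + K_2)},$$
where
$$\mu_A = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N} A_{ij}, \qquad \mu_B = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N} B_{ij},$$
$$\sigma_A^2 = \frac{1}{MN - 1}\sum_{i=1}^{M}\sum_{j=1}^{N} (A_{ij} - \mu_A)^2, \qquad \sigma_B^2 = \frac{1}{MN - 1}\sum_{i=1}^{M}\sum_{j=1}^{N} (B_{ij} - \mu_B)^2,$$
$$\sigma_{AB} = \frac{1}{MN - 1}\sum_{i=1}^{M}\sum_{j=1}^{N} (A_{ij} - \mu_A)(B_{ij} - \mu_B),$$
and $K_1$ and $K_2$ are constants added to maintain stability whenever $\mu_A^2 + \mu_B^2$ and $\sigma_A^2 + \sigma_B^2$ approach zero.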
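The overlap measures of Sections 5.5 and 5.6 and the SSIM defined above can be sketched together in plain Python. This is a minimal illustration, not the evaluation code used in the paper: the function names are hypothetical, regions are assumed to be given as collections of pixel coordinates, images as lists of rows, the SSIM is computed once over the whole image rather than in local windows, and the default values of K1 and K2 are arbitrary small constants.

```python
def dice(region_a, region_b):
    """Dice coefficient of two pixel sets; two empty sets count as a perfect match."""
    a, b = set(region_a), set(region_b)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 1.0

def jaccard(region_a, region_b):
    """Jaccard coefficient of two pixel sets; two empty sets count as a perfect match."""
    a, b = set(region_a), set(region_b)
    union = len(a | b)
    return len(a & b) / union if union else 1.0

def ssim_global(img_a, img_b, k1=1e-4, k2=9e-4):
    """Single-window SSIM of two equally sized images given as lists of rows."""
    xs = [v for row in img_a for v in row]
    ys = [v for row in img_b for v in row]
    n = len(xs)
    mu_a, mu_b = sum(xs) / n, sum(ys) / n
    # Variances and covariance use the MN - 1 normalization.
    var_a = sum((x - mu_a) ** 2 for x in xs) / (n - 1)
    var_b = sum((y - mu_b) ** 2 for y in ys) / (n - 1)
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(xs, ys)) / (n - 1)
    num = (2 * mu_a * mu_b + k1) * (2 * cov + k2)
    den = (mu_a ** 2 + mu_b ** 2 + k1) * (var_a + var_b + k2)
    return num / den
```

With A as the manually segmented nucleus mask and B as the extracted one, all three functions return 1 for identical inputs, and `ssim_global` becomes negative when the two masks are anti-correlated.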
6.
Experimental results and discussion
Experiments are carried out on a database consisting of 370 color peripheral blood images: 100 images with one WBC collected from the Cellavision reference library (http://www.cellavision.com), and 270 images, of which 100 images contain one WBC and the remaining 170 images contain more than two WBCs, collected from Sathiya Sai Histopath & Diagnostic centre, Tamil Nadu. The WBCs from Cellavision were already classified by experts in the field of hematology, and the data set consisted of a hundred 640 × 480 images covering each type of stained WBC: neutrophils, eosinophils, basophils, monocytes and lymphocytes. The 270 images from Sathiya Sai Histopath & Diagnostic centre also include all five types of stained WBCs, are 320 × 240 in size, and were taken from peripheral blood smears using a microscope, a charge-coupled camera and a 24-bit digitizer. The Cellavision data contain one WBC per image, and the Sathiya Sai Histopath & Diagnostic centre data contain from one to four WBCs per image. Therefore, the total number of WBCs in the performance test is at least 540. The nuclei were cropped manually from each WBC image, using a graphic tool, by trained and experienced pathologists. Let D1 and D2 denote the datasets from Cellavision and Sathiya Sai Histopath & Diagnostic centre, respectively. Figure 2 shows the images chosen from the two sources, Cellavision and Sathiya Sai Histopath & Diagnostic centre, for validating the performance of the proposed method.
To evaluate the efficiency of the proposed algorithm, quantitative measures such as accuracy, SSIM, the Dice and Jaccard coefficients, and Precision-Recall and ROC curves are computed, along with the computation time in seconds. The segmentation results obtained by the proposed method (IVIFS) are compared with existing IFS and T2FS based methods [33], a fuzzy method (JATI) [17], K-Means (KM) [15] and GVF [13] (applied directly to the raw image, without creating sub-images using a probability map). Ghosh et al. [11] proposed a method and showed that its results are better than those of Otsu's method [45] and the normal fuzzy method given by Chaira [46]. So, in this paper, the results of the Otsu and Chaira methods are not considered; the comparison is made with the Ghosh method only. Similarly, Jati et al. [17] have already shown that their method is better than segmentation based on simulated visual attention [22], the multi-spectral imaging technique [19] and Gram-Schmidt orthogonalization based leukocyte segmentation [12]. Hence, it is sufficient to compare against the JATI method to show the improvement of the proposed method, rather than comparing against all other methods. Images of the segmented leukocyte nucleus obtained from the various methods, along with the proposed method, are shown in Figure 6. The first column of Figure 6 shows the names of the blood smear images belonging to the datasets D1 and D2. The second to seventh columns of Figure 6 show the segmentation results obtained from IFS, T2FS, JATI, KM, GVF and the proposed method (IVIFS), respectively. Qualitatively, it can be seen from Figure 6 that the proposed method segments the leukocyte nucleus better than the other comparable methods.
Accuracy of the segmentation results is calculated for all the images in the two datasets separately, and the average values are given in Table 1. The T2FS technique is more accurate than the IFS based technique on both datasets. The JATI method is less accurate than all the other methods on the dataset D1; on the dataset D2 it is more accurate than the IFS and KM methods and nearly equal to T2FS (0.8259 against 0.8266). The KM algorithm is more accurate than the IFS, T2FS and JATI methods on D1 but is the least accurate on D2. The GVF method comes close to the IVIFS method on D1, but its accuracy drops on D2, which suggests that the GVF method works well on images with one WBC and less well on images with more than one WBC. IVIFS based thresholding gives the highest accuracy on both datasets, which shows that the proposed method performs better than the other existing methods.

Precision-Recall and ROC curves for the segmented blood images of the two datasets D1 and D2 are depicted in Figures 7 and 8, respectively. Both sets of curves show that the proposed method is more effective than the other existing methods. Table 2 gives the average Dice coefficient of the segmentation outputs obtained by the proposed algorithm and the other existing algorithms for the datasets D1 and D2. From the table, it is clear that the proposed method has the highest Dice coefficient on both datasets. The average Jaccard coefficients of the segmented outputs are shown in Figure 9 for the two datasets D1 and D2; there too, the proposed method performs better than the IFS, T2FS, JATI, KM and GVF methods.

Similarly, SSIM values are evaluated for the segmented images of both datasets obtained by the proposed method and the other existing methods, and their average values are given in Table 3. The SSIM values of the IFS, T2FS, JATI, KM and GVF methods are lower than those of the proposed method because these methods absorb less uncertainty than the IVIFS method. The computation time for segmenting the nuclei of leukocytes in both datasets by the IVIFS method and the other comparable methods is estimated and plotted in Figure 10. From this figure it is clear that the proposed method requires less time than the IFS, T2FS, JATI, KM and GVF methods.
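The text does not state how the Precision-Recall and ROC curves were traced. One common construction, sketched below purely as an assumption, sweeps a cut-off over per-pixel scores and records the true positive and false positive rates at each cut-off. The scores, labels and thresholds are made-up illustrative values, and `roc_points` and `area_under_curve` are hypothetical helper names.

```python
def roc_points(scores, labels, thresholds):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    A pixel is predicted as nucleus when its score is >= the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points."""
    pts = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical per-pixel scores (e.g. membership values) and ground-truth labels.
pixel_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
pixel_labels = [1, 1, 0, 1, 0, 0]
thresholds = [1.1, 0.85, 0.75, 0.5, 0.35, 0.2, 0.0]

points = roc_points(pixel_scores, pixel_labels, thresholds)
auc = area_under_curve(points)
print(points)
print(auc)
```

A curve that hugs the top-left corner (area close to 1) indicates that nucleus and background pixels are well separated. A precision-recall curve can be traced with the same sweep by recording tp / (tp + fp) against tp / positives at each cut-off.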
Therefore, these measures quantitatively show that the proposed method segments the nuclei of leukocytes better than other existing techniques.

7. Conclusion

In this paper, a new threshold based segmentation technique based on IVIFS is introduced to deal with the problem of choosing the values of the membership function to represent imprecise data. From 256 IVIFSs, the one that maximizes ultrafuzziness is selected as the threshold varies. A threshold is then chosen by finding the fuzzy set with maximum similarity, and the image is segmented with that threshold. Experimental results demonstrate that the proposed method is more effective than other methods existing in the literature. The basic reason for the more appropriate results is that IVIFS absorbs uncertainties that are left by the FS, IFS, T2FS, KM and GVF methods. Recently, the GVF snake has been utilized to segment the nuclei of leukocytes automatically by using stepwise merging rules. In that method, however, it is tedious to determine the initial contour for all the images (images with multiple WBCs) without separating ROIs. So, our future work is to implement fuzzy thresholding to find the initial contour for the gradient vector flow snake method, without creating sub-images from raw blood cell images, for segmenting the nuclei of leukocytes.

Acknowledgements

This work was supported by UGC-BSR-Research fellowship in Mathematical Sciences – 2013-2014. The authors would like to acknowledge Dr. D. Sathiya Bama, Consultant Pathologist, Sathiya Sai Histopath & Diagnostic Centre, Erode, Tamil Nadu, India for providing the peripheral blood smear images and for giving the qualitative validation. The authors wish to thank all the referees for their constructive comments and suggestions, which resulted in presenting this paper in a clear and precise manner.
References

[1] N. Giannakeas, P. S. Karvelis, T. P. Exarchos, F. G. Kalatzis, D. I. Fotiadis, Segmentation of microarray images using pixel classification comparison with clustering-based methods, Computers in Biology and Medicine 43 (6) (2013) 705–716.
[2] W. He, P. Hogg, A. Juette, E. R. Denton, R. Zwiggelaar, Breast image pre-processing for mammographic tissue segmentation, Computer Methods and Programs in Biomedicine 67 (2015) 61–73.
[3] C.-M. Lo, Y.-C. Lai, Y.-H. Chou, R.-F. Chang, Quantitative breast lesion classification based on multichannel distributions in shear-wave imaging, Computer Methods and Programs in Biomedicine 122 (3) (2015) 354–361.
[4] V. K. Shrivastava, N. D. Londhe, R. S. Sonawane, J. S. Suri, Computer-aided diagnosis of psoriasis skin images with HOS, texture and color features: A first comparative study of its kind, Computer Methods and Programs in Biomedicine.
[5] D.-C. Huang, K.-D. Hung, Y.-K. Chan, A computer assisted method for leukocyte nucleus segmentation and recognition in blood smear images, Journal of Systems and Software 85 (9) (2012) 2104–2118.
[6] H. Ceelie, R. Dinkelaar, W. van Gelder, Examination of peripheral blood films using automated microscopy; evaluation of DiffMaster Octavia and CellaVision DM96, Journal of Clinical Pathology 60 (1) (2007) 72–79.
[7] N. Theera-Umpon, S. Dhompongsa, Morphological granulometric features of nucleus in automatic bone marrow white blood cell classification, IEEE Transactions on Information Technology in Biomedicine 11 (3) (2007) 353–359.
[8] F. Sadeghian, Z. Seman, A. R. Ramli, B. H. A. Kahar, M.-I. Saripan, A framework for white blood cell segmentation in microscopic blood images using digital image processing, Biological Procedures Online 11 (1) (2009) 196.
[9] B. Swolin, P. Simonsson, S. Backman, I. Löfqvist, I. Bredin, M. Johnsson, Differential counting of blood leukocytes using automated microscopy and a decision support system based on artificial neural networks–evaluation of DiffMaster Octavia, Clinical & Laboratory Haematology 25 (3) (2003) 139–147.
[10] A. Kratz, H.-I. Bengtsson, J. E. Casey, J. M. Keefe, G. H. Beatrice, D. Y. Grzybek, K. B. Lewandrowski, E. M. Van Cott, Performance evaluation of the CellaVision DM96 system, American Journal of Clinical Pathology 124 (5) (2005) 770–781.
[11] M. Ghosh, D. Das, C. Chakraborty, A. K. Ray, Automated leukocyte recognition using fuzzy divergence, Micron 41 (7) (2010) 840–846.
[12] S. H. Rezatofighi, H. Soltanian-Zadeh, Automatic recognition of five types of white blood cells in peripheral blood, Computerized Medical Imaging and Graphics 35 (4) (2011) 333–343.
[13] B. C. Ko, J.-W. Gim, J.-Y. Nam, Automatic white blood cell segmentation using stepwise merging rules and gradient vector flow snake, Micron 42 (7) (2011) 695–705.
[14] S. N. Fathima, Classification of blood types by microscope color images, International Journal of Machine Learning and Computing 3 (4) (2013) 376.
[15] C. Zhang, X. Xiao, X. Li, Y.-J. Chen, W. Zhen, J. Chang, C. Zheng, Z. Liu, White blood cell segmentation by color-space-based k-means clustering, Sensors 14 (9) (2014) 16128–16147.
[16] Y. Yang, Y. Cao, W. Shi, A method of leukocyte segmentation based on S component and B component images, Journal of Innovative Optical Health Sciences 7 (01) (2014) 1450007.
[17] A. Jati, G. Singh, R. Mukherjee, M. Ghosh, A. Konar, C. Chakraborty, A. K. Nagar, Automatic leukocyte nucleus segmentation by intuitionistic fuzzy divergence based thresholding, Micron 58 (2014) 55–65.
[18] A. Singh, M. K. Dutta, M. ParthaSarathi, V. Uher, R. Burget, Image processing based automatic diagnosis of glaucoma using wavelet features of segmented optic disc from fundus image, Computer Methods and Programs in Biomedicine.
[19] N. Guo, L. Zeng, Q. Wu, A method based on multispectral imaging technique for white blood cell segmentation, Computers in Biology and Medicine 37 (1) (2007) 70–76.
[20] D. M. U. Sabino, L. da Fontoura Costa, E. G. Rizzatti, M. A. Zago, A texture approach to leukocyte recognition, Real-Time Imaging 10 (4) (2004) 205–216.
[21] W. Shitong, K. F. Chung, F. Duan, Applying the improved fuzzy cellular neural network IFCNN to white blood cell detection, Neurocomputing 70 (7) (2007) 1348–1359.
[22] C. Pan, D. S. Park, S. Yoon, J. C. Yang, Leukocyte image segmentation using simulated visual attention, Expert Systems with Applications 39 (8) (2012) 7479–7494.
[23] Y.-K. Chan, M.-H. Tsai, D.-C. Huang, Z.-H. Zheng, K.-D. Hung, Leukocyte nucleus segmentation and nucleus lobe counting, BMC Bioinformatics 11 (1) (2010) 558.
[24] L. Yang, P. Meer, D. J. Foran, Unsupervised segmentation based on robust estimation and color active contour models, IEEE Transactions on Information Technology in Biomedicine 9 (3) (2005) 475–486.
[25] P. Yampri, C. Pintavirooj, S. Daochai, S. Teartulakarn, White blood cell classification based on the combination of eigen cell and parametric feature detection, in: 1st IEEE Conference on Industrial Electronics and Applications, 2006, pp. 1–4.
[26] L. B. Dorini, R. Minetto, N. J. Leite, White blood cell segmentation using morphological operators and scale-space analysis, in: XX IEEE Brazilian Symposium on Computer Graphics and Image Processing, SIBGRAPI 2007, pp. 294–304.
[27] G. Zack, W. Rogers, S. Latt, Automatic measurement of sister chromatid exchange frequency, Journal of Histochemistry & Cytochemistry 25 (7) (1977) 741–753.
[28] K. Jiang, Q.-M. Liao, Y. Xiong, A novel white blood cell segmentation scheme based on feature space clustering, Soft Computing 10 (1) (2006) 12–19.
[29] S. Chinwaraphat, A. Sanpanich, C. Pintavirooj, M. Sangworasil, P. Tosranon, A modified fuzzy clustering for white blood cell segmentation, in: 3rd International Symposium on Biomedical Engineering, 2008, pp. 356–359.
[30] F. Yi, Z. Chongxun, P. Chen, L. Li, White blood cell image segmentation using on-line trained neural network, in: 27th Annual International Conference of the Engineering in Medicine and Biology Society, IEEE-EMBS 2005, pp. 6476–6479.
[31] W. Shitong, W. Min, A new detection algorithm (NDA) based on fuzzy cellular neural networks for white blood cell detection, IEEE Transactions on Information Technology in Biomedicine 10 (1) (2006) 5–10.
[32] K. T. Atanassov, Intuitionistic fuzzy sets, Fuzzy Sets and Systems 20 (1) (1986) 87–96.
[33] T. Chaira, Accurate segmentation of leukocyte in blood cell images using Atanassov’s intuitionistic fuzzy and interval type II fuzzy set theory, Micron 61 (2014) 1–8.
[34] J. M. Mendel, R. I. B. John, Type-2 fuzzy sets made simple, IEEE Transactions on Fuzzy Systems 10 (2) (2002) 117–127.
[35] H. R. Tizhoosh, Image thresholding using type II fuzzy sets, Pattern Recognition 38 (12) (2005) 2363–2372.
[36] K. Atanassov, G. Gargov, Interval valued intuitionistic fuzzy sets, Fuzzy Sets and Systems 31 (3) (1989) 343–349.
[37] H. Bustince, P. Burillo, A theorem for constructing interval valued intuitionistic fuzzy sets from intuitionistic fuzzy sets, Notes on Intuitionistic Fuzzy Sets 1 (1) (1995) 5–16.
[38] T. Chen, H. R. Wu, Adaptive impulse detection using center-weighted median filters, IEEE Signal Processing Letters 8 (1) (2001) 1–3.
[39] R. C. Gonzalez, Digital image processing, Pearson Education India, 2009.
[40] J. Somasekar, B. E. Reddy, Segmentation of erythrocytes infected with malaria parasites for the diagnosis using microscopy imaging, Computers & Electrical Engineering.
[41] R. R. Yager, On a general class of fuzzy connectives, Fuzzy Sets and Systems 4 (3) (1980) 235–242.
[42] M. Sokolova, G. Lapalme, A systematic analysis of performance measures for classification tasks, Information Processing & Management 45 (4) (2009) 427–437.
[43] L. R. Dice, Measures of the amount of ecologic association between species, Ecology 26 (3) (1945) 297–302.
[44] P. Jaccard, The distribution of the flora in the alpine zone, New Phytologist 11 (2) (1912) 37–50.
[45] N. Otsu, A threshold selection method from gray-level histograms, Automatica 11 (285-296) (1975) 23–27.
[46] T. Chaira, A. K. Ray, Segmentation using fuzzy divergence, Pattern Recognition Letters 24 (12) (2003) 1837–1844.
Figure 1: Types of leukocytes.
Figure 2: Original blood smear images from two datasets D1 and D2.
Figure 3: S-channel of the original blood smear images in HSI space.
Figure 4: Enhanced S-channel of blood smear images in HSI space.
Figure 5: Flowchart of the proposed method.
Figure 6: Segmentation results obtained by various methods.
Figure 7: Precision-recall and ROC curves of segmented images obtained by different segmentation methods for the dataset D1.
Figure 8: Precision-recall and ROC curves of segmented images obtained by various segmentation methods for the dataset D2.
Figure 9: Jaccard coefficient.
Figure 10: Time taken for segmentation by various methods for the two datasets.
Table 1: Accuracy rate of the segmentation results of the two datasets.

Dataset   IFS      T2FS     JATI     KM       GVF      IVIFS
D1        0.9451   0.9641   0.8319   0.9672   0.9727   0.9872
D2        0.7918   0.8266   0.8259   0.7408   0.9274   0.9661
Table 2: Dice coefficient values of the segmentation results of the two datasets.

Dataset   IFS      T2FS     JATI     KM       GVF      IVIFS
D1        0.9214   0.9383   0.7537   0.9395   0.9499   0.9518
D2        0.7591   0.8008   0.7477   0.7131   0.9046   0.9307
Table 3: SSIM of the segmentation results of the two datasets.

Dataset   IFS      T2FS     JATI     KM       GVF      IVIFS
D1        0.7395   0.8895   0.3217   0.8833   0.9555   0.9717
D2        0.4113   0.5144   0.5043   0.7297   0.9073   0.9451