ARTICLE IN PRESS
Neurocomputing 70 (2007) 704–717 www.elsevier.com/locate/neucom
A neuro-fuzzy-based system for detecting abnormal patterns in wireless-capsule endoscopic images V.S. Kodogiannisa,, M. Boulougourab, J.N. Lygourasc, I. Petrouniasd a
Centre for Systems Analysis, School of Computer Science, University of Westminster, London HA1 3TP, UK Development, Innovations and Projects SWC Communication Systems, SIEMENS S.A., Athens GR-14564, Greece c Department of Electrical and Computer Engineering, Democritus University of Thrace, Xanthi GR-67100, Greece d School of Informatics, University of Manchester, Manchester M60 1QD, UK
b
Available online 18 October 2006
Abstract Wireless capsule endoscopy (WCE) constitutes a recent technology in which a capsule with micro-camera attached to it, is swallowed by the patient. This paper presents an integrated methodology for detecting abnormal patterns in WCE images. Two issues are being addressed, including the extraction of texture features from the texture spectra in the chromatic and achromatic domains from each colour component histogram of WCE images and the concept of a fusion of multiple classifiers. The implementation of an advanced neuro-fuzzy learning scheme has been also adopted in this paper. The high detection accuracy of the proposed system provides thus an indication that such intelligent schemes could be used as a supplementary diagnostic tool in WCE. r 2006 Elsevier B.V. All rights reserved. Keywords: Medical imaging; Computer-aided diagnosis; Endoscopy; Neuro-fuzzy networks; Fuzzy integral
1. Introduction Conventional medical diagnosis in clinical examinations highly relies upon physicians’ experience. Physicians intuitively exercise knowledge obtained from symptoms of previous patients. In everyday practice, the amount of medical knowledge grows steadily in such a way that it may become difficult for physicians to keep up with all the essential information gained. For physicians to quickly and accurately diagnose a patient there is a critical need to employ computerised technologies for assisting in medical diagnosis and accessing the related information. Computer-assisted technology is certainly helpful for inexperienced physicians in making medical diagnoses as well as for experienced physicians in supporting complex diagnoses. Such technology has become an essential tool to help physicians in retrieving the medical information and making decisions in medical diagnoses [5]. In general, a computer-based medical diagnostic system consists of several units: (1) a sensory unit/data-receiving Corresponding author. Tel.: +44 7775708976; fax: +44 20 79115906.
E-mail address:
[email protected] (V.S. Kodogiannis). 0925-2312/$ - see front matter r 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.neucom.2006.10.024
unit; (2) a data processing/feature extraction unit; (3) a decision support system or a classification unit; and (4) a user interaction unit. The sensory unit or data-receiving unit is used to acquire data from the patient. The data can come from a variety of sources: from electroencephalography (EEG), electrocardiogram (ECG), magnetic resonance imaging (MRI), endoscopy, computerised axial tomography (CAT), positron emission tomography (PET) or from the patient’s interview. The data processing or feature extraction unit is used to prepare data in a form that is easy for a decision support system or a classification unit to use. Compared to the input, the output data from the feature extraction unit is usually of a much lower dimension as well as in a much easier form to classify. The decision support system makes decisions using data from the feature extraction unit. This system provides a list of cause-and-effect reasoning from the symptoms and/or their possible corresponding treatment. Finally, the physician uses the output of the decision support system to assist him/her in the diagnosis. With direct access to a computerbased medical diagnostic system, the physicians can treat the patient promptly and can be more accurate and consistent.
ARTICLE IN PRESS V.S. Kodogiannis et al. / Neurocomputing 70 (2007) 704–717
In the years prior to the introduction of computational intelligence (CI) in medicine, there were different approaches for creating computerised medical decisionmaking applications [36], such as:
database analysis; mathematical modeling; clinical algorithms; and clinical decision analysis.
Medical and biomedical applications seem to be particularly suitable for successfully applying CI tools and techniques, due to the nature of the domain [39]. CI tools and techniques have gone further than just being considered a new, innovative and promising approach for future applications. They have become the method of choice for problems and domains having specific characteristics, such as, high degree of complexity, linguistic representation of concepts or decision variables, high degree of uncertainty, lack of precise or complete data sets, etc. In a sense, CI could be considered a complementary toolbox to standard operational research (OR) tools. A number of medical diagnostic decision support systems (MDSS) based on CI methods have been developed to assist physicians and medical professionals. Some medical diagnostic systems based on CI methods use expert systems (ESs) [21], fuzzy expert systems (FESs) [42], neural networks (NNs) [34] and genetic algorithms (GAs) [35]. However, the development of intelligent systems for medical decision-making problems is not a trivial task. Difficulties include the acquisition, collection and organisation of the data that will be used for training the system. This becomes a major problem especially when the system requires large data sets over long periods of time, which in most cases is not available due to the lack of an efficient recording system. In medical imaging tasks, for example, there is still no scheduled procedure for digitising scans, such as scans printed on film (X-rays) or specimens for microscope screening. Another difficulty arises when trying to automate some processes as not all of them can be automated due to ethical and safety issues. Deciding what could and needs to be automated directly influences the design and implementation of the learning system. Endoscopy differs from traditional medical imaging modalities in several aspects. Firstly, endoscopy is not based on the biophysical response of organs to X-ray or ultrasounds, but allows a direct observation of the human internal cavities via an optical device and a light source. Secondly, the endoscopy investigation imposes a physical contact between the patient and the physician, and the endoscopist can assess the patient’s complaints before the endoscopic procedure. Finally, the patient’s discomfort during the investigation prohibits repeated examinations and, with regard to the usual absence of storage system, no decision element remains available at the end of the examination. This requires that all information is gathered during a limited time period. The objective of this scheme is
705
to increase the expert’s ability in identifying malignant regions and decrease the need for intervention while maintaining the ability for accurate diagnosis. In the endoscopic procedure, doctors can control the motion of the tip of the endoscope by manipulating wires in the shaft of the endoscope from outside the human body. Since the endoscope is introduced into the body passively by pushing it from outside, endurance by patients is required. Even though the gastrointestinal (GI) endoscopic procedure has been widely used, doctors must be skilful and experienced to reach deep sites such as the duodenum and small intestine. Flexible video-endoscopes are being used for applications in gastroenterology for more than 10 years. A miniaturised CCD-imager is integrated on the distal side of such endoscopes to acquire intra-corporeal images in video quality (‘‘chip on a stick’’). This electronic imager is substituting the fibre-optic bundle of conventional large-diameter flexible endoscopes. Video-endoscopic arrangements are also in use for minimal invasive surgery, which becomes increasingly important in many fields of medical applications [7]. The cleaning and sterilisation of these devices is still a problem leading to the desire for disposable instruments. In GI tract, great skill and concentration are required for navigating the endoscope because of its flexible structure. Discomfort to the patient and the time required for diagnosis heavily depend on the technical skill of the physician and there is always a possibility of the tip of the endoscope injuring the walls. Conventional diagnosis of endoscopic images employs visual interpretation of an expert physician. Since the beginning of computer technology, it has become necessary for visual systems to ‘‘understand a scene’’. Computerassisted image analysis can extract the representative features of the images together with quantitative measurements and thus, can ease the task of objective interpretations by a physician expert in endoscopy. A system capable to classify image regions to normal or abnormal will act as a second—more detailed—‘‘eye’’ by processing the endoscopic video. Its exceptional value and contribution in supporting the medical diagnosis procedure is high. Endoscopic images possess rich information [32], which facilitates the abnormality detection by multiple techniques. However, from the literature survey, it has been found that only a few techniques for endoscopic image analysis have been reported and they are still undergoing testing. In addition, most of the techniques were developed on the basis of features in a single domain: chromatic domain or spatial domain. Applying these techniques individually for detecting the disease patterns based on possible incomplete and partial information may lead to inaccurate diagnosis. For example, regions affected with bleeding and inflammation may have different colour and texture characteristics. Parameters in the spatial domain related with lumen can be used to suggest the cues for abnormality. For instance, small area of lumen implies the narrowing of the lumen, which is often one of the
ARTICLE IN PRESS 706
V.S. Kodogiannis et al. / Neurocomputing 70 (2007) 704–717
symptoms for lump formation and not the presence of possible bleeding. Therefore, maximising the use of all available image analysis techniques for diagnosing from multiple feature domains is particularly important to improve the tasks of classification of endoscopic images. Krishnan et al. [26] have been using endoscopic images to define features of the normal and the abnormal colon. New approaches for the characterisation of colon based on a set of quantitative parameters, extracted by the fuzzy processing of colon images, have been used for assisting the colonoscopist in the assessment of the status of patients and were used as inputs to a rule-based decision strategy to find out whether the colon’s lumen belongs to either an abnormal or normal category. The quantitative characteristics of the colon are: hue component, mean and standard deviation of RGB, perimeter, enclosed boundary area, form factor, and centre of mass [22]. The analysis of the extracted quantitative parameters was performed using three different NNs selected for classification of the colon. The three networks include a two-layer perceptron trained with the delta rule, a multilayer perceptron with backpropagation (BP) learning and a self-organizing network. A comparative study of the three methods was also performed and it was observed that the self-organizing network is more appropriate for the classification of colon status [23]. A method of detecting the possible presence of abnormalities during the endoscopy of the lower gastrointestinal system using curvature measures has been developed [24]. In this method, image contours corresponding to haustra creases in the colon are extracted and the curvature of each contour is computed after nonparametric smoothing. Zero-crossings of the curvature along the contour are then detected. The presence of abnormalities is identified when there is a contour segment between two zero-crossings having the opposite curvature signs to those of the two neighbouring contour segments. The proposed method can detect the possible presence of abnormalities such as polyps and tumours. Fuzzy rule-based approaches to the labelling of colonoscopic images to render assistance to the clinician have been proposed. The colour images are segmented using a scale-space filter. Several features are selected and fuzzified. The knowledge-based fuzzy rulebased system labels the segmented regions as background, lumen, and abnormalities (polyps, bleeding lesions) [25]. Endoscopic images contain rich information of texture. Hence, the additional texture information can provide better results for the image analysis than approaches using merely intensity information. Such information has been used in colorectal lesions detector (CoLD), an innovative detection system to support colorectal cancer diagnosis and detection of pre-cancerous polyps, by processing endoscopy images or video frame sequences acquired during colonoscopy [30]. Four statistical measures, derived from the co-occurrence matrix in four different angles, namely angular second moment, correlation, inverse difference moment, and entropy, have been extracted by Karkanis [14]. These second-order statistical features were then
calculated on the wavelet transformation of each image to discriminate amongst regions of normal or abnormal tissue. An NN based on the classic BP learning algorithm performed the classification of the features. CoLD integrated the feature extraction and classification algorithms under a graphical user interface, which allowed both novice and expert users to utilise effectively all system functions. The detection accuracy of the proposed system has been estimated to be more than 95%. Recently, a new scheme for on-line training NN-based colonoscopic diagnosis has been introduced [29]. A memory-based adaptation of the learning rate for the on-line back-propagation is proposed and used to seed an on-line evolution process that applies a differential evolution (DE) strategy to (re-) adapt the NN to modified environmental conditions. The specific approach looks at on-line training from the perspective of tracking the changing location of an approximate solution of a pattern-based, and thus, dynamically changing, error-function. Results in interpreting colonoscopy images and frames of video sequences were promising and suggest that networks trained with this strategy detect malignant regions of interest with accuracy. Since the treatment in medical applications, such as colonoscopy or gastroscopy, is demanding for the doctor and painful or at least unpleasant for the patient, the interest in miniaturised systems for smaller and more flexible instruments is obvious. Although the technology of endoscopy improves and is applied for various inspections of the digestive system, such as the colon and stomach, it is very hard for the conventional endoscope to reach duodenum and small intestine. This has caused also the desire for autonomous instruments without the bundles of optical fibres and tubes, which are more than the size of the instrument itself. The latter formed the main reason for patients objecting to their use. Thus, already in 1952, a swallowable ‘‘radio pill’’ for pressure measurements with a volume of 1 cm3 was produced. pH detection with pills applying wireless data transmission are used since 30 years ago. Recently, an autonomous image vision system with wireless data transmission integrated in a small pill received the approval for clinical evaluation in the US and Europe [33]. Autonomous video-probes for gastroenterology with the expected performance shall reach at least 10% of the endoscope market within the next 5 years. The advantages for users, especially for the patient should lead to a higher amount of instruments with this approach and reduce the barrier for doctor and patient to apply this technique [2]. A long-term estimation of the reachable volume of these instruments could be a few per cent (10–20%) of the number of abdominal surgeries in industrialised countries. The use of highly integrated microcircuit in bioelectric data acquisition systems promises new insights into the origin of a large variety of health problems by providing lightweight, low-power, low-cost medical measurement devices. WCE has been proposed for use in management of patients who have obscure digestive tract bleeding [28]. Patients, who have such kind of bleeding, present
ARTICLE IN PRESS V.S. Kodogiannis et al. / Neurocomputing 70 (2007) 704–717
707
diagnostic and therapeutic challenges because the source of the bleeding is not identified by conventional radiological or endoscopic evaluation. The first swallowable videocapsule (M2A) for the gastroenterological diagnosis has been presented by given imaging, a company from Israel [13], and its schematic diagram is illustrated in Fig. 1. The system is equipped with a CMOS imager, a transmitter, LEDs and a battery. The camera-capsule has a 1 cm section and a length of 3 cm so it can be swallowed with minimal effort. It is disposable, ingestible and acquires video images during natural propulsion through the digestive system. The capsule transmits the acquired images via a digital RF communication channel to the recorder unit located outside the body. The data recorder is an external receiving/recording unit that receives the data transmitted by the ingestible capsule. The portable data recorder consists of a sensor array attached to the abdomen. The capsule is made of a specially sealed biocompatible material that is resistant to the digestive fluids throughout the GI tract. In 24 h, the capsule is crossing the patient’s alimentary canal. However, the restrictions of this device are the image quality in terms of dynamic range, resolution and colour quality, the lack of control of the movements or direction, limited battery life and no direct localisation of the device. A computerised system for assisting the analysis of WCE data using the M2A capsule, and identifying sequences of frames related to small intestine motility has been implemented recently [37]. The imbalanced recognition task of intestinal contractions was addressed by employing an efficient two-level video analysis system. At the first level, each video was processed resulting in a number of possible sequences of contractions. In the second level, the recognition of contractions was carried out by means of a support vector machines classifier. To encode patterns of intestinal motility, a panel of textural and morphological features of the intestine lumen were extracted. The system exhibited an overall sensitivity of 73.53% in detecting contractions. In order to automatically cluster the different types of intestinal contractions in WCE, a computer vision system which describes video sequences in terms of classical image descriptors has been developed [40]. The authors used self-organised maps (SOM) to build a two-dimensional representation of the different types of contractions, which were clustered by the SOM in a non-supervised way. In an alternative approach, a methodology for detecting bleeding in WCE images has been investigated [4].
The presented methodology uses colour adaptation using a reference colour value of the image. The adaptation is carried out by means of the cone response transform and the reference colour is detected using NNs. This method is applied to gastroenterological images in order to detect the presence of blood in a specific frame sequence. The colour histograms are first estimated for (i) the whole image and (ii) the designated blood area and are used respectively as input and output in the training phase. The image colour is detected in the testing phase after colour adaptation and estimation of deviation from the grey value axis in the RGB cube. The M2A capsule restrictions have been addressed in a recently completed European research project the ‘‘Intracorporeal Videoprobe—IVP’’ [1]. In this project, two prototypes of products have been developed:
Fig. 1. Given imaging capsule.
Fig. 2. IVP system.
a small wired video-probe (IVP1) that includes a CMOS image sensor with an image field of 200 180 pixels and a pixel size of 4.6 mm. The overall chip size is 1.7 1.3 mm2 and the device has 4 connections. an autonomous video-capsule (IVP2) with a telemetric link for image data transfer to an external PC-based system. The video-capsule IVP2 contains a high dynamic range CMOS image sensor with a resolution of 768 496 pixels, optics and illumination, a transceiver, a system control with image data compression unit and a power supply. The sensor has a sensitivity of 120 (digits/decade). Its dynamic range is 150 dB and the minimal detectable illumination is 0.005 lux. The optical part is located on a tiltable plate, which is driven by a wobble motor. The overall chip size is 3.6 2.3 mm2. Both approaches are illustrated in Fig. 2.
For the purpose of this research work, which was supported by the ‘‘IVP’’ research project, endoscopic images have been obtained initially from the M2A microcapsule. They have spatial resolution of 171 151 pixels, a brightness resolution of 256 levels per colour plane (8 bits), and consist of three colour planes (red, green and blue) for a total of 24 bits/pixel. The proposed methodology in this paper consists of two phases. The first implements the extraction of image features while in the second phase two soft-computing methodologies are implemented/employed to perform the diagnostic task. Their results are analysed and then compared.
ARTICLE IN PRESS 708
V.S. Kodogiannis et al. / Neurocomputing 70 (2007) 704–717
Texture analysis is one of the most important features used in image processing and pattern recognition. It can give information about the arrangement and spatial properties of fundamental image elements. Many methods have been proposed to extract texture features, e.g. the cooccurrence matrix [6], and the texture spectrum in the achromatic component of the image [9]. The definition and extraction of quantitative parameters from endoscopic images based on texture information has been proposed. This information was initially represented by a set of descriptive statistical features calculated on the histogram of the original image. Recently, a methodology for extraction texture features from the texture spectra in the chromatic and achromatic domains for a selected region of interest from each colour component histogram of endoscopic images has been presented. The implementation of an intelligent diagnostic system was based on various learning schemes such as radial basis functions (RBFs) [41], extended normalised radial basis function (ENRBF) NNs [3] and fuzzy inference NNs [16]. In this research, an alternative approach of obtaining those quantitative parameters from the texture spectra is proposed both in the chromatic and achromatic domains of the image. The definition of texture spectrum employs the determination of the texture unit (TU) and TU number (NTU) values. TUs characterise the local texture information for a given pixel and its neighbourhood, and the statistics of the entire TU over the whole image reveal the global texture aspects. For the diagnostic part, the concept of multiple-classifier scheme has been adopted, where the fusion of the individual outputs was realised using fuzzy integral (FI). An intelligent classifier has been implemented in this study based on an adaptive neuro-fuzzy logic methodology. The proposed scheme utilises an alternative to the centroid defuzzification method, namely area of balance (AOB). 2. Image features extraction A major component in analysing images involves data reduction, which is accomplished by intelligently modifying the image from the lowest level of pixel data into higher level representations. Texture is broadly defined as the rate
and direction of change of the chromatic properties of the image, and could be subjectively described as fine, coarse, smooth, random, rippled, and irregular, etc. For this reason, we focused our attention on nine statistical measures (standard deviation, variance, skew, kurtosis, entropy, energy, inverse difference moment, contrast, and covariance) [3]. All texture descriptors were estimated for all planes in both RGB {R (red), G (green), B (blue)} and HSV {H (hue), S (saturation), V (intensity)} spaces, creating a feature vector for each descriptor Di ¼ (Ri, Gi, Bi, Hi, Si, Vi). Thus, a total of 54 features (9 statistical measures 6 image planes) were then estimated. For our experiments, we have used 70 endoscopic images related to abnormal cases and 70 images related to normal ones. Fig. 3 shows samples of selected images acquired using the M2A capsule of normal and abnormal cases. Generally, the statistical measures are estimated on histograms of the original image (1st-order statistics) [8]. However, the histogram of the original image carries no information regarding relative position of the pixels in the texture. Obviously, this can fail to distinguish between textures with similar distributions of grey levels. We therefore have to implement methods which recognise characteristic relative positions of pixels of given intensity levels. An additional scheme is proposed in this study to extract new texture features from the texture spectra in the chromatic and achromatic domains, for a selected region of interest from each colour component histogram of the endoscopic images. 2.1. NTU transformation The definition of texture spectrum employs the determination of the TU and TU number (NTU) values. TU may be considered as the smallest complete unit which best characterises the local texture aspect of a given pixel and its neighbourhood in all eight directions of a square raster. In a square raster digital image each pixel is surrounded by eight neighbouring pixels. The local texture information for a pixel can be extracted from a neighbourhood of 3 3 pixels, which represents the smallest complete unit (in the sense of having eight directions surrounding the pixel).
Fig. 3. Sample of WCE images.
ARTICLE IN PRESS V.S. Kodogiannis et al. / Neurocomputing 70 (2007) 704–717
TUs thus characterise the local texture information for a given pixel and its neighbourhood, and the statistics of all the TUs over the whole image reveal the global texture aspects [18]. Given a neighbourhood of d d pixels, which are denoted by a set containing d d elements P ¼ fP0 ; P1 ; . . . ; PðddÞ1 g, where P0 represents the chromatic or achromatic (i.e. intensity) value of the central pixel and Pi fi ¼ 1; 2; . . . ; ðd dÞ 1g is the chromatic or achromatic value of the neighbouring pixel i, the TU ¼ fE 0 ; E 1 ; . . . ; E ðddÞ1 g, where E i fi ¼ 1; 2; . . . ; ðd dÞ 1g is determined as follows: 8 > < 0 if Pi oP0 ; (1) E i ¼ 1 if Pi ¼ P0 ; > : 2 if P 4P : i 0 The element Ei occupies the same position as the ith pixel. Each element of the TU has one of three possible values; therefore the combination of all the eight elements results in 6561 possible TUs in total. The TU number (NTU) is the label of the TU and is defined using the following equation: N TU ¼
ðddÞ1 X
E i di1 ,
(2)
i¼1
where in our case d ¼ 3. In addition, the eight elements may be ordered differently. If the eight elements are ordered clockwise as shown in Fig. 4, the first element may take eight possible positions from the top left (a) to the middle left (h), and then the 6561 TUs can be labelled by the Eq. (1), under eight different ordering ways (from a to h). Fig. 5 provides an example of transforming a neighbourhood to a TU with the TU number under the ordering way a. The previously defined set of 6561 TUs describes the local-texture aspect of a given pixel; that is, the relative grey-level relationships between the central pixel and its neighbour. Thus the statistics of the frequency of occurrence of all the TUs over a large region of an image should reveal texture information. The texture spectrum histogram (Hist(i))is obtained as the frequency distribution
a
b
d
h
g
c
f
e
Fig. 4. Eight clockwise, successive ordering ways of the eight elements of the texture unit. The first element Ei may take eight possible positions from a to h.
neighbourhood
709
Texture Unit
63
28
45
2
88
40
35
2
67
40
21
2
V = {40,63,28,45,35,21,40,67,88}
0
TU number
2 0
1
0
TU = {2,0,2,0,0,1,2,2}
NTU = 6095
Fig. 5. Example of transforming a neighbourhood to a texture unit with the texture-unit number.
of all the TUs, with the abscissa showing the NTU and the ordinate representing its occurrence frequency. The texture spectra of various image components {V (intensity), R (red), G (green), B (blue), H (hue), S (saturation)} are obtained from their TU numbers. The statistical features are then estimated on the histograms of the NTU transformations of the chromatic and achromatic planes of the image (R, G, B, H, S, V). 3. Image features evaluation Recently, the concept of combining multiple classifiers has been actively exploited for developing highly reliable ‘‘diagnostic’’ systems [17]. One of the key issues of this approach is how to combine the results of the various systems to give the best estimate of the optimal result. In this research study, six subsystems have been developed, and each of them was associated with the six planes specified in the feature extraction process (i.e. R, G, B, H, S, and V). For each subsystem, 9 statistical features have been associated with, resulting thus a total 54 features space. Each subsystem was modelled with the proposed neurofuzzy learning scheme. This provides a degree of certainty for each classification based on the statistics for each plane. The outputs of each of these networks must then be combined to produce a total output for the system as a whole as can be seen in Fig. 6. While a usual scheme chooses one best subsystem from amongst the set of candidate subsystems based on a winner-takes-all strategy, the current proposed approach runs all multiple subsystems with an appropriate collective decision strategy. The aim in this study is to incorporate information from each plane/space so that decisions are based on the whole input space. The adopted in this paper methodology was to use the FI concept. FI is a promising method that incorporates information from each space/plane so that decisions are based on the whole input space in the case of multiple classifier schemes. FI combines evidence of a classification with the systems expectation of the importance of that evidence. By treating the classification results a series of disjointed subsets of the input space Sugeno defined the gl-fuzzy measure [27]: gðA [ BÞ ¼ gðAÞ þ gðBÞ þ lgðAÞgðBÞ;
l 2 ð1; 1Þ,
(3)
ARTICLE IN PRESS V.S. Kodogiannis et al. / Neurocomputing 70 (2007) 704–717
710
where S is the support of mB(y). One problem of the centroid defuzzifier is intensive computation. The CA defuzzifier, on the other hand, is easy to compute and use. The CA defuzzifier has the formula
R F U S I O N
G B H
PM m ¯ mBm ð¯ym Þ m¼1 y yc ¼ P , M ym Þ m¼1 mBm ð¯
S V Fig. 6. Proposed fusion scheme.
where the l measure can be given by solving the following non-linear equation: lþ1¼
K Y
(6)
where y¯ m is the centre of gravity of the fuzzy set Bm and M is the number of rules. The problem with the CA defuzzifier is that it suffers from not using the entire shape of the consequent membership function. Regardless of whether the max–min or max–product inference is used, the result of the CA defuzzifier is still the same, regardless of whether the shape is narrow or wide. 4.1. Adaptive fuzzy logic system (AFLS)
i
1 þ lg ;
l4 1.
(4)
i¼1
The gi, i 2 f1; . . . ; Kgvalues are fuzzy densities relating to the reliability of each of the K feature networks and satisfy the conditions of fuzzy sets laid out by Sugeno. The classification scheme utilised here is an adaptive fuzzy logic system (FLS) that incorporates an improved defuzzification scheme. 4. Neuro-fuzzy classifier An FLS is a system utilising fuzzy set theory and its operations to solve a given problem. The FLS can be classified into three broad types: pure FLSs, Takagi and Sugeno’s fuzzy system and FLSs with fuzzifier and defuzzifier [31]. The FLS with fuzzifier and defuzzifier is used throughout this research work. It composes of fuzzifier, fuzzy rule base, fuzzy inference engine and defuzzifier. In classification problems with multi-classes, the FLS used as the classifier will be a multi-input multioutput (MIMO) FLS. Although, the decomposed system is easier to analyse than a non-decomposed one, the overall system may be too large and not as compact as for NNs. In this study, an MIMO FLS will be used as a composed FLS. In this way, the fewer rules are required. The output of an inference engine is an area with an irregular shape depending on the membership functions used in the consequent part, the membership value of each rule and the inference type used. Defuzzification converts this area to a real value. There are several methods used in defuzzification such as centre average (CA), centroid of area, etc. The centroid calculation approach is a logical answer to defuzzification because it uses all information available to compute the output. Centroid defuzzification can be put into equation form as R ymB ðyÞ dy , y¯ ¼ RS S mB ðyÞ dy
An adaptive FLS is an FLS having adaptive rules. Its structure is the same as a normal FLS but its rules are derived and extracted from given training data. In other words, its parameters can be trained like an NN approach, but with its structure in an FLS structure. Since we have general ideas about the structure and effect of each rule, it is straightforward to effectively initialise each rule. This is a tremendous advantage of AFLS over its NN counterpart. The AFLS is one type of FLS with a singleton fuzzifier and CA defuzzifier. The centroid defuzzifier cannot be used because of its computation expense and that it prohibits using the BP learning algorithm. The proposed in this research work AFLS consists of an alternative to classical defuzzification approaches, the AOB [15]. This AFLS has the same approach as the system presented by Wang [43] and its feed-forward structure is shown in Fig. 7 with an extra ‘‘fuzzy basis’’ layer. The fuzzy basis layer consists of fuzzy basis nodes for each rule. A fuzzy basis node has the following form: m ðxÞ ¯ fm ðxÞ , ¯ ¼ PLm ¯ l¼1 ml ðxÞ
(7)
where fm ðxÞ ¯ is a fuzzy basis node for rule m and mm ðxÞ ¯ is a membership value of rule m.
Fuzzifier
Inference Engine Summation L11 Rule#1
y11
x1 x2
O1
Rule#2
LRP
y1R
OP yR P
xn Rule#R
(5)
Defuzzifier
LpR
Division
Fig. 7. AFLS concept.
ARTICLE IN PRESS V.S. Kodogiannis et al. / Neurocomputing 70 (2007) 704–717
711
Since we use a product-inference, the fuzzy basis node mm ðxÞ ¯ is in the following form: mm ðxÞ ¯ ¼
n Y
mF mi ðxi Þ,
(8)
i¼1
where mF mi ðxi Þ is a membership value of the ith input of rule m. In case of a ‘‘Gaussian-shape’’ membership function, then, mF mi ðxi Þ will be in the following form: " # 2 ðxi cm Þ i mF mii ðxi Þ ¼ exp , (9) 2 2ðbm i Þ m where cm i and bi are the centre and spread parameters, respectively, of the membership function ith input of the mth rule. The choice of nonlinear Gaussian functions against classic linear triangular-shaped functions enhances the non-linear nature of the proposed architecture as well as contributes to an improved classification performance [11]. The most popular defuzzification method is the centroid calculation that returns the centroid of the area formed by the consequent membership function, the membership value of its rules and the max-min or maxproduct inference. In the case of a discrete universe, the centroid calculation yields PQ q q q¼1 my ðyb Þyb y ¼ PQ , (10) q q¼1 my ðyb Þ
where Q is a number of quantisation levels of the output. The higher Q is, the finer y will be. The computation will increase as Q increases as a trade-off. Thus, this method is very computationally expensive, so that it is not appropriate to use this method in AFLS. In other words, it will be very computationally expensive to make this method adaptive. Since this method provides good performance, its main characteristics, centre of gravity and use of the shape of membership function, will be preserved in the design of a new defuzzification approach. The overall output of the system may be the result of fuzzy union or the addition of rule outputs as in Kosko’s method. The proposed in this paper AFLS will utilise Kosko’s method with product inference [20]. The density (D) is defined as mass (M) per unit volume (V): M . (11) V Let assume that we use the same material and all shapes have the same thickness, T, then
D¼
M ¼ ATD,
(12)
where A is an area and T is a thickness. Let us suppose that the shape of the membership function used in the consequent part is symmetric, e.g., triangular, Gaussian or bell shape. The centre of gravity will pass through the half-way point of the base of that shape. For example, if we use a triangular shape and product-inference as a t-norm, then the shape of the consequent part of rule m will be shown as in Fig. 8. The shaded area (A) is 12mm Lm .
Fig. 8. Triangular shape membership function.
massless beam O yb1
yb2 yb3
massless beam
M1g
M2g
M3g
O yb1
yb2 yb3 y
F
Fig. 9. Consequent fuzzy set placed on massless beam.
Imagine that we have the consequent part of each rule placed on the massless beam having the pivot point at origin as shown in Fig. 9. Then, F ¼ M 1 g þ M 2 g þ M 3 g ¼ ðM 1 þ M 2 þ M 3 Þg.
(13)
For balance Fy ¼ M 1 gy1b þ M 2 gy2b þ M 3 gy3b ¼ ðM 1 y1b þ M 2 y2b þ M 3 y3b Þg, ðM 1 y1b þ M 2 y2b þ M 3 y3b Þg F ðM 1 y1b þ M 2 y2b þ M 3 y3b Þg . ¼ ðM 1 þ M 2 þ M 3 Þg
ð14Þ
y¼
ð15Þ
Assume that D in Eq. (12) is the same, thus, ðA1 y1b þ A2 y2b þ A3 y3b ÞTD ðA1 þ A2 þ A3 ÞTD ðA1 y1b þ A2 y2b þ A3 y3b Þ . ¼ ðA1 þ A2 þ A3 Þ
y¼
ð16Þ
The area, A, will be depended upon the membership function used. Under the assumption of symmetric shape this method will have comparable capability with the centroid calculation method to approximate the output from the fuzzy set in the consequent part. If the triangle is used as a membership function and we use the maxproduct inference (Larsen logic), then the shaded area, A, will be derived as 1 Am ¼ mm Lm . 2
(17)
ARTICLE IN PRESS V.S. Kodogiannis et al. / Neurocomputing 70 (2007) 704–717
712
From Eq. (16), the output, y will be y¼
ðm1 L1 y1b þ m2 L2 y2b þ m3 L3 y3b Þ . ðm1 L1 þ m2 L2 þ m3 L3 Þ
(18)
In general form, the calculation of the output, y, will be PM m m m¼1 mm Lp yp yp ¼ PM , (19) m m¼1 mm Lp where yp is the pth output of the network, mm the membership value of the mth rule, Lm p the spread parameter of the membership function in the consequent part of the pth output of the mth rule, ym p the centre of the membership function in the consequent part of the pth output of the mth rule.
Jk,
(20)
where K is the number of training patterns, Jk is the sum of squared error for the kth pattern. Then, Jk is defined as (21)
m m m ym p ðn þ 1Þ ¼ yp ðnÞ þ my ½yp ðnÞ yp ðn 1Þ qJ Zy m , qyp
i
ð22Þ
n
qJ qJ qJ m qep qyp ¼ , m qyp qJ m qep qyp qym p
(23)
where
qmF mi
(24)
where is the pth output of the network corresponding to the kth pattern in the training data, d kp the pth desired output of the kth pattern and mkm the membership value of the mth rule corresponding to the kth pattern in the training data. The update equation of Lm p is in the following form: þ
qJ ZL m , qLp
Lm p ðn
1Þ ð25Þ
ðxki cm i Þ bm i
(30)
2
and n Y qmm ¼ mF j . i qmF mi j¼1
(31)
Thus, "( ) # m m k K P k m X X L ½y y qJ ðx c Þ p p p ¼ ðykp d kp Þ PM mkm i m2 i . k m qcm bi i p¼1 k¼1 m¼1 mm Lp The update equation of the spread parameter, the form
bm i ,
is in
m m m bm i ðn þ 1Þ ¼ bi ðnÞ þ mb ½bi ðnÞ bi ðn 1Þ qJ Zb m , qb
ð33Þ
n
qJ qJ qJ m qep qyp qmm qmF mi ¼ þ . qbm qJ qbm m qep qyp qmm qmF m i i i
(34)
Again for the ‘‘Gaussian-shape’’ case, qmF mi qbm i
¼ mF mi
2 ðxki cm i Þ
bm i
3
.
(35)
Therefore, "( ) # m k K P k m 2 X X Lm qJ p ½yp yp k k k ðxi ci Þ ¼ ðyp d p Þ PM mm . 3 k m qbm bm i p¼1 k¼1 m¼1 mm Lp i (36)
n
where qJ qJ qJ m qep qyp ¼ m, qLm qJ m qep qyp qLp p
(29)
where
ykp
mL ½Lm p ðnÞ
¼ mF mi
i
K X mkm Lm p ðykp d kp Þ PM , ¼ k m m k¼1 m¼1 m Lp
Lm p ðnÞ
ð28Þ
n
(32)
where
þ 1Þ ¼
m m m cm i ðn þ 1Þ ¼ ci ðnÞ þ mc ½ci ðnÞ ci ðn 1Þ qJ Zc m , qc
jai
where P is the number of outputs and dp the desired response of the pth output. yp ðx¯ k Þ is defined as in Eq. (19). The update equation of ym p is as in the form
Lm p ðn
where ym p is interpreted as a centre of the membership function of the pth output of the mth rule in the consequent part of IF-THEN rule. The update equation of the centre parameter, cm i , is in the form
qcm i
k¼1
P 1X Jk ¼ ðy ðx¯ k Þ d p ðx¯ k ÞÞ2 , 2 p¼1 p
(27)
p
In the present study, a ‘‘Gaussian-shape’’ function has been adopted, hence
Let us define the objective function as K X
m¼1 m
qJ qJ qJ m qep qyp qmm qmF mi ¼ þ . m qci qJ m qep qyp qmm qmF mi qcm i
4.2. AFLS training with AOB defuzzification
J¼
2 3 K n o k X qJ m m k 5 m 4ðyk d k Þ h ¼ p p PM k m i yp yp , qLm p m L k¼1
(26)
All equations derived are used to update all parameters during the training phase of the network. The initial centre, cm i and ypm are randomly selected from the kth training data, xki and d kp , respectively. The initial spread parameter,
ARTICLE IN PRESS V.S. Kodogiannis et al. / Neurocomputing 70 (2007) 704–717
bm i , is determined by maxðxi Þ minðxi Þ , (37) N where bi is a spread parameter of the ith input of all rules and N is the number of rules. Since the desired output in the classification problem is primarily a binary output representing each class, therefore, the initial spread parameter, Lm p , can be set to 0.75. This method can be interpreted that we have initially equal confidence in each rule. This spread parameter will be adjusted during training. bi ¼
5. Results The proposed approach was evaluated using 140 clinically obtained endoscopic M2A images. For the present analysis, two decision-classes were considered: abnormal and normal. Seventy images (35 abnormal and 35 normal) have been used for the training and the remaining ones (35 abnormal and 35 normal) for testing purposes. The extraction of quantitative parameters from these endoscopic images is based on texture information. Initially, this information is represented by a set of descriptive statistical features calculated on the histogram of the original image. In addition to the proposed AFLS scheme, an RBF network and a fuzzy inference neural network (FINN) have been implemented for comparison purposes. All of these types of networks (i.e. AFLS, FINN and RBF) are incorporated into a multiple classifier scheme, where the structure of each individual (for R, G, B, H, S, & V planes) classifier is composed of 9 input nodes (i.e. nine statistical features) and 2 output nodes. In a second stage, the nine statistical measures for each individual image component are then calculated though the related texture spectra after applying the (NTU) transformation. 5.1. Performance of histograms-based features The multi classifier consisting of AFLS networks with 9 input nodes and 2 output nodes each was trained on these six feature spaces. This specific algorithm generally is
Performance
characterised by the ‘‘quality’’ of its performance [18]. The network trained on the R feature space achieved an accuracy of 92.85% on the testing data incorrectly classifying 3 of the normal images as abnormal and 2 abnormal as normal ones. The network trained on the G feature space misclassified 2 abnormal images as normal but not the same ones as the R space. The remaining one image was misclassified as normal ones. The B feature space achieved an accuracy of 94.28% on the testing data with 4 misclassifications, i.e. 2 abnormal as normal ones and the remaining two images as abnormal ones. The network trained on the H feature space achieved 94.28% accuracy on the testing data. The network trained on the S feature space achieved an accuracy of only 91.42% on the testing data. Finally, the network for the V feature space misclassified 2 normal cases as abnormal ones, giving it an accuracy of 97.14% on the testing data. The soft combination of neural classifiers using FI fusion concept resulted in 92.85% accuracy over the testing dataset (5 mistakes out of 70 testing patterns), demonstrating in this way the efficiency of this scheme in terms of accuracy. More specifically, 2 normal cases have been considered as abnormal while 3 abnormal as normal ones. Despite the fact that training time was longer than the one required for the RBF case, it is worth mentioned that results, as shown in Fig. 10, indicate a higher confidence levels for each correct classification over the RBF scheme, such as 0.63, while Table 1 presents the performance of individual components. In a similar way, the multiple-classifier scheme using the RBF network has been trained on the six feature spaces. RBF networks are considered one of the most popular feedforward networks. A number of different learning schemes have been proposed for the training of this type of systems, including the gradient descent scheme and probabilisticbased methods [12]. In general, for large numbers of inputs units, the number of RBFs required, can become excessive. If too many centres are used, the large number of parameters available in the regression procedure will cause the network to be over sensitive to the details of the particular training set and result in poor generalisation performance, including the case of trapping in local minima, a frequent case that occurs in feed-forward networks [10].
Histogram-AFLS
1 0.8 0.6 0.4 0.2 0
1
4
713
7 10 13 16 19 22 25 28 31 34 37 40 43 46 49 52 55 58 61 64 67 70 Normal / Abnormal Cases Normal Abnormal Fig. 10. AFLS with FI fusion for histogram.
ARTICLE IN PRESS V.S. Kodogiannis et al. / Neurocomputing 70 (2007) 704–717
714
The regularised orthogonal least square (ROLS) algorithm has been adopted in this study as a learning methodology for the RBF network [17]. The proposed algorithm utilises an orthogonalisation procedure that overcomes the traditional learning problems in RBF networks. The network trained on the R feature space achieved an accuracy of 91.42% on the testing data incorrectly classifying 2 of the normal images as abnormal and 4 abnormal as normal ones. The network trained on the G feature space misclassified 3 normal images as abnormal but not the same ones as the R space. The remaining 2 images were misclassified as normal ones. The B feature space achieved an accuracy of 88.57% on the testing data with 8 misclassifications, i.e. 5 abnormal as normal ones and the remaining 3 images as abnormal ones. The network trained on the H feature space achieved 90% accuracy on the testing data. The network trained on the S feature space achieved an accuracy of only 85.71% on the testing data. Finally, the network for the V feature space misclassified 3 normal cases as abnormal and one abnormal as normal one, giving it an accuracy of 94.28% on the testing data. The FI concept has been used here to combine the results from each sub-network and the overall system misclassified 3 normal cases as abnormal and 5 abnormal as normal ones, giving the system an overall accuracy of 88.57%. Table 1 indicates the performance of individual components. The confidence levels for each correct classification in the RBF case is just above 0.50.
Table 1 Histogram-based performances Modules
RBF accuracy (70 testing patterns)
AFLS accuracy (70 testing patterns)
R G B H S V
91.42% 92.85% 88.57% 90% 85.71% 94.28%
92.85% 95.71% 94.28% 94.28% 91.42% 97.14%
Overall
88.57% (8 mistakes)
Performance
(6 mistakes) (5 mistakes) (8 mistakes) (7 mistakes) (10 mistakes) (4 mistakes)
1 0.8 0.6 0.4 0.2 0
(5 (3 (4 (4 (6 (2
mistakes) mistakes) mistakes) mistakes) mistakes) mistakes)
92.85% (5 mistakes)
5.2. Performance of Ntu-based features In the NTU-based extraction process, the texture spectrum of the six components (R, G, B, H, S, V) have been obtained from the TU numbers, and the same nine statistical measures have been used in order to extract new features from each textures spectrum. The AFLS network trained on the R feature space achieved an accuracy of 94.28% on the testing data incorrectly classifying 2 of the normal images as abnormal and 2 abnormal as normal ones. The network trained on the G feature space misclassified 3 normal images as abnormal but not the same ones as the R space. The B feature space achieved an accuracy of 97.14% on the testing data with 2 misclassifications, i.e. one abnormal as normal one and the remaining one image as abnormal one. The network trained on the H feature space achieved 97.14% accuracy on the testing data. The network trained on the S feature space achieved an accuracy of only 92.86% on the testing data. Finally, the network for the V feature space misclassified two normal cases as abnormal ones, giving it an accuracy of 97.14% on the testing data. The soft combination of AFLS classifiers using FI fusion concept resulted in 95.71% accuracy over the testing data set (3 mistakes out of 70 testing patterns), demonstrating in this way again the efficiency of this scheme in terms of accuracy. More specifically, 2 normal cases as abnormal and one abnormal as normal one provide us a good indication of a ‘‘healthy’’ diagnostic performance [19]. However, the level of confidence in this case was slight less than the previous case (i.e. the histogram), that is 0.59 as shown in Fig. 11. Table 2 shows the performance of the individual components for the various methodologies. In a similar way, a multi classifier consisting of RBF networks with 9 input nodes and 2 output nodes was trained on each of the six feature spaces. The NTU transformation of the original histogram has produced a slight but unambiguous improvement in the diagnostic performance of the multi-classifier scheme. The FI concept which has been used to combine the results from each subnetwork and the overall system misclassified 3 normal cases as abnormal and 3 abnormal as normal ones, giving the system an overall accuracy of 91.43%. The confidence level for each correct classification is above 0.55. It seems that Ntu - AFLS
1
4
7 10 13 16 19 22 25 28 31 34 37 40 43 46 49 52 55 58 61 64 67 70 Normal / Abnormal Cases Normal Abnormal Fig. 11. AFLS with FI fusion for NTU.
ARTICLE IN PRESS V.S. Kodogiannis et al. / Neurocomputing 70 (2007) 704–717
715
Table 2 NTU-based performances Modules
RBF accuracy (70 testing patterns)
AFLS accuracy (70 testing patterns)
FINN accuracy (70 testing patterns)
R G B H S V
92.85% 95.71% 91.43% 92.85% 90.00% 94.28%
94.28% 95.71% 97.14% 97.14% 92.86% 97.14%
94.28% 97.14% 92.85% 94.28% 95.71% 91.42%
Overall
91.43% (6 mistakes)
(5 (3 (6 (5 (7 (4
mistakes) mistakes) mistakes) mistakes) mistakes) mistakes)
¼P
P
mistakes) mistakes) mistakes) mistakes) mistakes) mistakes)
95.71% (3 mistakes)
the application of NTU-based extraction process has resulted in a slight better confidence level as well as an improved accuracy rate. Just for comparison purposes, an FINN scheme has been employed in the NTU case [16]. The FI concept provided an accuracy of 94.28%. More specifically, 3 normal cases as abnormal and one abnormal as normal one provide us a good indication of a ‘‘healthy’’ diagnostic performance. However, the level of confidence/ certainty was 0.52. Medical diagnostic tests are often perceived by physicians as providing absolute answers or as clarifying uncertainty to a greater degree than is warranted. When a diagnosis turns out to be at variance with the results of a diagnostic test, the clinician’s assumption may be that the test was either misinterpreted or that the test is no good. Such a binary approach to the interpretation of diagnostic testing, i.e., assuming a clearly positive or negative result (like an off–on switch), is too simplistic and may be counterproductive in the workup of a patient. Instead, the results of a diagnostic test should be viewed on a continuum from negative to positive and as giving the likelihood or probability of a certain diagnosis. The performance of an NN is usually expressed in terms of its estimation and prediction rates, that is, the number of correctly classified objects in the train and prediction sets, respectively. While these estimators may be adequate in certain instances, e.g., general classification tasks, they should not be employed when medical data is involved [38]. This is because such variables give only a measure of overall performance. In the case of human beings it is crucial to assess the capacity of a test to distinguish between people with (true positive) and without (true negative) a disease, as those individuals may or may not be subjected to further evaluations that may be stressing, costly, etc., depending upon the result of the test. In these cases, the sensitivity and the specificity are more adequate. The performance of all the classification tools evaluated in the present work was thus assessed in terms of these variables which are estimated according to [38] sensitivity
(4 (3 (2 (2 (5 (2
true positive alarms P , true positive alarms þ false negative alarms
mistakes) mistakes) mistakes) mistakes) mistakes) mistakes)
94.28% (4 mistakes)
specificity ¼P
(4 (2 (5 (4 (3 (6
P
true negative alarms . true negative alarms þ false positive alarms
However, although sensitivity and specificity partially define the efficacy of a diagnostic test, they do not answer the clinical concern of whether a patient does or does not have a disease. These questions are addressed by calculating the predictive value of the test predictability ¼P
P
true positive alarms P . true positive alarms þ false positive alarms
The above soft computing methodologies for the NTU cases were verified through the calculation of these parameters: sensitivity, 97.18%, specificity, 98.55% and predictability 98.57% for the AFLS case, while for the RBF one: sensitivity, 95.71%, specificity, 95.71% and predictability 95.71%. 6. Conclusions The main contribution of this research work was the medical application of wireless-capsule endoscopic Microsystems. The development of a pattern recognition system for texture characterisation and classification of capsuleendoscopic images has been studied. Such systems consist mainly of two parts:
a feature extraction stage, which maps an image sample onto a feature vector, so that each sample is represented by a point in an n-dimensional feature space, and finding a ‘‘decision rule’’ to determine the class of a sample given its feature vector.
In this research, schemes have been developed to extract texture features from the texture spectra in the chromatic and achromatic domains from each colour component histogram of images acquired by the M2A given imaging swallowable capsule. The implementation of advanced learning-based schemes and the concept of fusion of multiple classifiers have been also considered.
ARTICLE IN PRESS 716
V.S. Kodogiannis et al. / Neurocomputing 70 (2007) 704–717
A major component in analysing images involves data reduction which is accomplished by intelligently modifying the image from the lowest level of pixel data into higherlevel representations. In this research work we focused our attention on nine statistical measures (standard deviation, variance, skew, kurtosis, entropy, energy, inverse difference moment, contrast, and covariance). The above statistical measures were estimated on histograms of the original image (1st-order statistics). The majority of the endoscopic research has focused on methods applied to grey-level images, where only the luminance of the input signal is utilised. Endoscopic images contain rich texture and colour information. All texture descriptors were estimated for all planes in both RGB {R (red), G (green), B (blue)} and HSV {H (hue), S (saturation), V (intensity)} spaces, creating a feature vector for each descriptor Di ¼ (Ri, Gi, Bi, Hi, Si, Vi). However, the histogram of the original image carries no information regarding relative position of the pixels in the texture. An alternative scheme was proposed in this research study to extract texture features from the texture spectra in the chromatic and achromatic domains from each colour component histogram of the endoscopic images. The definition of texture spectrum employs the determination of the TU and TU number (NTU) values. The statistical features are then estimated on the histograms of the NTU transformations of the chromatic and achromatic planes of the image (R, G, B, H, S, V). An intelligent decision support system has been developed for endoscopic diagnosis based on a multiple-classifier scheme. Two intelligent classifier-schemes have been implemented in this research work. An adaptive neurofuzzy logic scheme that utilises an alternative to the centroid defuzzification method, namely AOB has been implemented while is then compared with an RBF network. This multiple-classifier approach using FI as a fusion method, provided encouraging results. Acknowledgements This research work was supported by the ‘‘IVPIntracorporeal Videoprobe’’, European Research Project, Contract no.: IST-2001-35169. References [1] A. Arena, M. Boulougoura, H.S. Chowdrey, P. Dario, C. Harendt, K.-M. Irion, V. Kodogiannis, B. Lenaerts, A. Menciassi, R. Puers, C. Scherjon, D. Turgis, Intracorporeal Videoprobe (IVP), in: L. Bos, S. Laxminarayan, A. Marsh (Eds.), Medical and Care Computing 2, vol. 114, IOS Press, 2005, pp. 167–174. [2] K. Arshak, E. Jafer, G. Lyons, D. Morris, O. Korostynska, A review of low-power wireless sensor microsystems for biomedical capsule diagnosis, Microelectron. Int. 21 (3) (2004) 8–19. [3] M. Boulougoura, E. Wadge, V.S. Kodogiannis, H.S. Chowdrey, Intelligent systems for computer-assisted clinical endoscopic image analysis, in: Proceedings of the Second IASTED International Conference on Biomedical Engineering, Innsbruck, Austria, 2004, pp. 405–408.
[4] N. Bourbakis, S. Makrogiannis, D. Kavraki, A neural network-based detection of bleeding in sequences of WCE images, in: Proceedings of the Fifth IEEE Symposium on Bioinformatics and Bioengineering (BIBE’05), 2005, pp. 324–327. [5] G. Economou, D. Lymberopoulos, E. Karvatselou, C. Chassomeris, A new concept toward computer-aided medical diagnosis—a prototype implementation addressing pulmonary diseases, IEEE Trans. Inform. Technol. Biomed. 5 (1) (2001) 55–66. [6] M. Gletsos, S. Mougiakakou, G. Matsopoulos, K. Nikita, A. Nikita, D. Kelekis, A computer-aided diagnostic system to characterize CT focal liver lesions: design and optimization of a neural network classifier, IEEE Trans. Inform. Technol. Biomed. 7 (3) (2003) 153–162. [7] Y. Haga, M. Esashi, Biomedical microsystems for minimally invasive diagnosis and treatment, Proc. IEEE 92 (2004) 98–114. [8] R. Haralick, Statistical and structural approaches to texture, IEEE Proc. 67 (1979) 786–804. [9] D.-C. He, L. Wang, Texture features based on texture spectrum, Pattern Recogn. 24 (5) (1991) 391–399. [10] D.S. Huang, The local minima free condition of feed-forward neural networks for outer-supervised learning, IEEE Trans. Syst. Man Cybern. 28B (3) (1998) 477–480. [11] D.S. Huang, S.D. Ma, Linear and nonlinear feed-forward neural network classifiers: a comprehensive understanding, J. Intell. Syst. 9 (1) (1999) 1–38. [12] D.S. Huang, Radial basis probabilistic neural networks: model and application, Int. J. Pattern Recogn. Artif. Intell. 13 (7) (1999) 1083–1101. [13] G. Idden, G. Meran, A. Glukhovsky, P. Swain, Wireless capsule endoscopy, Nature (2000) 405–417. [14] S. Karkanis, G. Magoulas, D. Iakovidis, D. Maroulis, N. Theofanous, Tumor recognition in endoscopic video images, in: Proceedings of the 26th EUROMICRO Confernce, Netherlands, 2000, pp. 423–429. [15] V.S. Kodogiannis, An efficient fuzzy based technique for signal classification, J. Intell. Fuzzy Syst. 11 (1–2) (2001) 65–84. [16] V. Kodogiannis, Computer-aided diagnosis in clinical endoscopy using neuro-fuzzy System, in: Proceedings of IEEE FUZZ, 2004, pp. 1425–1429. [17] V.S. Kodogiannis, Intelligent classification of bacterial clinical isolates in vitro, using an array of gas sensors, J. Intell. Fuzzy Syst. 16 (1) (2005) 1–14. [18] V. Kodogiannis, M. Boulougoura, E. Wadge, An intelligent system for diagnosis of capsule endoscopic images, in: Proceedings of the Second International Conference on Computational Intelligence in Medicine and Healthcare (CIMED 2005), Portugal, 2005, pp. 340–347. [19] V. Kodogiannis, M. Boulougoura, E. Wadge, Neurofuzzy-based diagnosis of capsule endoscopic images, in: Proceedings of the 2005 IEEE International Conference on Intelligent Computing (ICIC2005), China, 2005, pp. 3524–3533. [20] B. Kosko, Neural Networks and Fuzzy Systems: A Dynamical Systems Approach to Machine Intelligence, Prentice-Hall, Englewood Cliffs, NJ, 1992. [21] B. Kovalerchuk, E. Vityaev, J.F. Ruiz, Consistent knowledge discovery in medical; diagnosis, IEEE Eng. Med. Biol. Mag. 19 (4) (2000) 26–37. [22] S. Krishnan, P. Goh, Quantitative parameterization of colonoscopic images by applying fuzzy technique, in: Proceedings of the 19th International Conference on IEEE/EMBS, vol. 3, 1997, pp. 1121–1123. [23] S. Krishnan, C. Yap, K. Asari, P. Goh, Neural network based approaches for the classification of colonoscopic images, in: Proceedings of 20th International Conference on IEEE Engineering in Medicine and Biology Society, vol. 2, 1998, pp. 1678–1680. [24] S. Krishnan, X. Yang, K. Chan, S. Kumar, P. Goh, Intestinal abnormality detection from endoscopic images, in: Proceedings of the International Conference on of the IEEE on Engineering in Medicine and Biology Society, vol. 2, 1998b, pp. 895–898.
ARTICLE IN PRESS V.S. Kodogiannis et al. / Neurocomputing 70 (2007) 704–717 [25] S. Krishnan, X. Yang, K. Chan, P. Goh, Region labeling of colonoscopic images using fuzzy logic, in: Proceedings of the First Joint BMES/EMBS Conference Serving Humanity, Advancing Technology, vol. 2, 1999, p. 1149. [26] S. Krishnan, P. Wang, C. Kugean, M. Tjoa, Classification of endoscopic images based on texture and neural network, in: Proceedings of the 23rd Annual IEEE Internaional Conference in Engineering in Medicine and Biology, vol. 4, 2001, pp. 3691–3695. [27] L.I. Kuncheva, Fuzzy Classifier Design, Physica-Verlag, Wurzberg, 2000. [28] S. Liangpunsakul, V. Chadalawada, D. Rex, D. Maglinte, J. Lappas, Wireless capsule endoscopy detects small bowel ulcers in patients with normal results from state of the art enteroclysis, Am. J. Gastroenterol. 98 (6) (2003) 1295–1299. [29] G. Magoulas, V. Plagianakos, M. Vrahatis, Neural network-based colonoscopic diagnosis using on-line learning and differential evolution, Appl. Soft Comput. 4 (2004) 369–379. [30] D.E. Maroulis, D.K. Iakovidis, S.A. Karkanis, D.A. Karras, CoLD: a versatile detection system for colorectal lesions endoscopy videoframes, Comput. Methods Programs Biomed. 70 (2003) 151–166. [31] J. Mendel, Uncertain Rule-Based Fuzzy Logic Systems, Prentice Hall, Englewood Cliffs, NJ, 2001. [32] K. Nagasako, T. Fujimori, Y. Hoshihara, M. Tabuchi, Atlas of Gastroenterologic Endoscopy by High-resolution Video-Endoscopy, IGAKU-SHOIN Ltd., 1998. [33] D. Panescu, Emerging Technologies an imaging pill for gastrointestinal endoscopy, IEEE Eng. Med. Biol. Mag. 4 (2005) 12–14. [34] C. Pattichis, C. Schizas, L. Middleton, Neural network models in EMG diagnosis, IEEE Trans. Biomed. Eng. 42 (5) (1995) 486–496. [35] C. Pena-Reyes, M. Sipper, Evolutionary computation in medicine: an overview, Artif. Intell. Med. 19 (1) (2000) 1–23. [36] E. Shortliffe, B. Buchanan, E. Feigenbaum, Knowledge engineering for medical decision making: a review of computer-based clinical decision aids, Proc. IEEE 67 (1979) 1207–1224. [37] P. Spyridonos, F. Vilarin˜o, J. Vitria`, P. Radeva, Identification of intestinal motility events of capsule endoscopy video analysis, in: Proceedings of Advanced Concepts for Intelligent Vision Systems Conference, Antwerp, Belgium, 2005, pp. 575–580. [38] P. Szczepaniak, P. Lisboa, J. Kacprzyk, Fuzzy Systems in Medicine, Springer, Berlin, 2000. [39] H.-N. Teodorescu, A. Kandel, L. Jain, Fuzzy and Neuro-Fuzzy Systems in Medicine, CRC Press, Boca Raton, FL, 1999. [40] F. Vilarin˜o, P. Spyridonos, J. Vitria`, P. Radeva, Self organized maps for intestinal contractions categorization with wireless capsule video endoscopy, in: Proceedings of the Third European Medical and Biological Engineering Conference, EMBEC’05 Prague, 2005. [41] E. Wadge, M. Boulougoura, V. Kodogiannis, Computer-assisted diagnosis of wireless-capsule endoscopic images using neural network based techniques, in: Proceedings of the 2005 IEEE International Conference on Computational Intelligence for Measurement Systems and Applications, CIMSA, 2005, pp. 328–333. [42] D. Walter, C. Mohan, ClaDia: a fuzzy classifier system for disease diagnosis, in: Proceedings of the 2000 Congress on Evolutionary Computation, 2000, pp. 1429–1435.
717
[43] L.X. Wang, Adaptive Fuzzy Systems and Control, Prentice-Hall, Englewood Cliffs, NJ, 1994. Vassilis S. Kodogiannis was born in Heraklion, Greece at 1966. He received the Electrical Engineering Degree from University of Thrace in 1990, the M.Sc. in VLSI Systems Engineering from UMIST, in 1991 and the Ph.D. in Electrical Engineering from HYPERLINK "http://www.liv. ac.uk" Liverpool University in 1994. He is currently Senior Lecturer in the School of Computer Science at University of Westminster, London. His research interests are in the areas of Intelligent Systems and Image Processing. He was the principal investigator for the ‘‘Intracorporeal Video Probe’’ European Research Project. Maria G. Boulougoura was born in Athens, Greece, in 1978. She received the diploma degree in Electrical and Computer Engineering from University of Thrace in 2002. She worked as a research assistant for the ‘‘Intracorporeal Video Probe’’ European Research Project, September 2002–November 2004. She has recently finished her M.Phil. Degree at Westminster University with a Dissertation ‘‘Computer-aided Diagnosis in Wireless-Capsule Endoscopy using Intelligent Systems’’. She is currently working at Siemens Development Innovations and Projects Department in Athens, Greece. John. N. Lygouras was born in Kastoria, Greece, in 1955. He received the Electrical Engineering Degree from University of Thrace, Greece in 1981 and the Ph.D. in Electrical Engineering from the same University in 1990. He is currently Associate Professor in the Department of Electrical and Computer Engineering at University of Thrace. His research interests are in the areas of robotics, image processing and analogue/digital electronic circuits. Ilias Petrounias was born in Athens, Greece, in 1965. He received the Electrical Engineering Degree from National University of Athens, Greece in 1981, the M.Sc. and Ph.D. in Computer Science from UMIST in 1991 and 1994, respectively. He is currently Senior Lecturer in the School of Informatics at University of Manchester. His research interests are in the areas of Data Mining and uncertainty in information systems.