Evidential segmentation scheme of multi-echo MR images for the detection of brain tumors using neighborhood information

Information Fusion 5 (2004) 203–216, www.elsevier.com/locate/inffus



A.-S. Capelle a, O. Colot b,*, C. Fernandez-Maloigne a

a Laboratoire Signal Image et Communications (SIC), UMR CNRS 6615, Université de Poitiers, Bât. SP2MI, Boulevard Marie et Pierre Curie, B.P. 30179, 86962 Futuroscope-Chasseneuil Cedex, France
b Laboratoire d'Automatique I3D, FRE CNRS 2497, Bât. P2, Cité Scientifique, Université des Sciences et Technologies de Lille, 59655 Villeneuve d'Ascq Cedex, France

Received 3 April 2003; received in revised form 9 October 2003; accepted 9 October 2003
Available online 13 November 2003

Abstract

In this paper we propose and study an evidential segmentation scheme of multi-echo MR images for the detection of brain tumors. We show that modeling by means of evidence theory is well suited to the processing of redundant and complementary data such as MR images. Moreover, the neighborhood relationship between voxels is taken into account via Dempster's combination rule. We show that using this information improves the classification results previously obtained and leads to a real region-based segmentation. In addition, the combination of spatial information makes it possible to compute a measure of conflict, which reflects the spatial organization of the data: the conflict is higher at the boundaries between different structures. Thus, it provides a new source of evidence that the specialist can aggregate with the segmentation results to soften his or her own decision.

© 2003 Elsevier B.V. All rights reserved.

Keywords: Data fusion; Evidence theory; Decision; Multi-modality imaging; Segmentation; Neighborhood relationship; Conflict

1. Introduction

Technical progress in medical imaging provides physicians with many tools for observing human anatomy or the functional behavior of organs. Different imaging modalities exist: classical radiology, echography, functional positron emission tomography (PET), magnetic resonance (MR) imaging, etc. Since the different sources highlight particular regions, tissues or pathologies, physicians may use multi-source imaging and take advantage of each source. In this way, multi-source approaches give the opportunity to refine the diagnosis. In addition to choosing the most suitable technology, physicians may simultaneously analyze different images and combine them in order to obtain complementary and/or redundant data. We distinguish two kinds of fusion. The first one is the fusion of images acquired by a single imaging technique but using different acquisition parameters (for instance multi-echo MR images). The second one is the fusion of images acquired with different imaging techniques, such as anatomical MR images and functional PET. The use of different techniques for the analysis of a given pathological region of interest (ROI), such as a tumor, provides a large amount of information that physicians naturally integrate and merge for their diagnosis.

Data fusion has been a growing research field over the last decade [1–4]. The interest of data fusion is to obtain a synthesis of information by taking into account different pieces of evidence. Data coming from different sources (sensors, observers, ...) are usually redundant but also complementary: the observed scenes are the same but they are acquired from different points of view, and thus the amount of available information is increased. The data are usually imprecise, uncertain and incomplete. They emanate from an observer (who has his or her own interpretation of the scene) or from a sensor (data are distorted by the sensor itself, by the numerical acquisition device including the associated image formation algorithm, ...). In such a context, the aim of the fusion process is to synthesize more reliable and more elaborate information and thus to improve the decision.

* Corresponding author. Tel.: +33-320436928; fax: +33-320436567. E-mail address: [email protected] (O. Colot).
1566-2535/$ - see front matter © 2003 Elsevier B.V. All rights reserved. doi:10.1016/j.inffus.2003.10.001


A.-S. Capelle et al. / Information Fusion 5 (2004) 203–216

Fusion techniques are based on various theories: probabilistic fusion, Bayesian inference, fuzzy set theory, possibility theory and evidence theory. In this paper, we focus on evidence theory, also called Dempster–Shafer theory [5]. Although evidence theory has often been used in pattern recognition, it has so far found very few applications in medical imaging for image analysis and diagnosis. In [6,7], evidence theory is used for the segmentation of MR brain images. In [8], Suh et al. use it for segmentation and visualization of the left ventricle of the heart. In [9], Bloch classifies brain tissues in pathological dual-echo MR images.

In this paper, we propose an evidential scheme for the segmentation of multi-echo MR images for detection and 3D visualization of brain tumors. The innovation of the proposed segmentation process is to include the spatial neighborhood relationship between the voxels by means of a segmentation scheme based on evidence theory. In Section 2, we present the main aspects of the theory. In Section 3, the use of the theory is discussed more precisely. The segmentation scheme is then described in Section 4. In Section 5, we discuss the results and the consequences of introducing the spatial neighborhood relationship in the segmentation process, in particular the conflicting information. We show that conflicting information is not only a consequence of data fusion but is also a source of information about the spatial organization of the data.

2. The fundamentals of evidence theory

Evidence theory, or the theory of belief functions, was initially introduced by Dempster's work on the concept of lower and upper bounds for a set of compatible probability distributions [5]. In [10], Shafer formalizes the theory and shows the advantage of using belief functions to model imprecise and uncertain data. Different interpretations of the original "Dempster–Shafer" theory successively appeared [11]. Smets and Kennes [12] deviate from the initial probabilistic interpretation of evidence theory with the Transferable Belief Model (TBM), which gives a clear and coherent interpretation of the underlying concept of the theory.

2.1. Belief structures

Within the context of evidence theory, imprecise and uncertain data are modeled by belief structures called credibility and plausibility, each of them derived from the elementary belief structure $m$ (also called basic belief assignment, bba). The existence and the use of these functions involve the definition of the frame of discernment $\Omega$, which is composed of $N$ exhaustive and exclusive hypotheses $H_n$, solutions of the problem:

$$\Omega = \{H_1, H_2, \ldots, H_N\}. \tag{1}$$

From the frame of discernment, we define the power set $2^{\Omega}$ of all $2^N$ propositions $A$ defined on $\Omega$:

$$2^{\Omega} = \{\emptyset, H_1, H_2, \ldots, \{H_1 \cup H_2\}, \{H_1 \cup H_3\}, \ldots, \Omega\}. \tag{2}$$

Evidence theory provides a theoretical and mathematical framework to quantify the opinion of an agent that a proposition $A$ belongs to the actual world, or equivalently that a given proposition is true [13]. Note that $A$ can be either a singleton $H_n$ or a disjunction of hypotheses. This is one of the main differences with respect to probabilistic approaches, which only consider the singleton case. The bba $m$ assigned by a source $S$ is defined by

$$m : 2^{\Omega} \to [0, 1]. \tag{3}$$

In particular, this function verifies

$$\sum_{A \subseteq \Omega} m(A) = 1. \tag{4}$$

If we consider the closed-world assumption [14], as opposed to the open-world assumption, the bba verifies $m(\emptyset) = 0$; this hypothesis means that the solution belongs to the frame of discernment $\Omega$. The quantity $m(A)$ measures the amount of belief that is exactly committed to $A$. In the case of a disjunction of hypotheses ($A$ being a compound hypothesis), it is impossible, for lack of information, to redistribute any part of the evidence $m(A)$ to a subset of $A$. Only the injection of new information makes it possible to reassign the belief more precisely. An element $A \subseteq \Omega$ such that $m(A) \neq 0$ is called a focal element. Let us denote by $\mathcal{F}$ the set of focal elements associated to a bba $m$. From the bba $m$, the credibility (Bel) and the plausibility (Pl) functions are defined by

$$\mathrm{Bel}(A) = \sum_{B \subseteq A,\, B \neq \emptyset} m(B) \tag{5}$$

and

$$\mathrm{Pl}(A) = \sum_{A \cap B \neq \emptyset} m(B). \tag{6}$$
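As an illustration of Eqs. (5) and (6), the following minimal Python sketch computes the credibility and the plausibility of a proposition from a bba. The representation (a dictionary mapping `frozenset` focal elements to masses), the function names and the three-class example are assumptions made for this illustration only; they do not come from the paper.

```python
def belief(bba, a):
    """Bel(A), Eq. (5): total mass of the non-empty focal elements included in A."""
    return sum(mass for b, mass in bba.items() if b and b <= a)


def plausibility(bba, a):
    """Pl(A), Eq. (6): total mass of the focal elements that intersect A."""
    return sum(mass for b, mass in bba.items() if b & a)


# Hypothetical three-class frame and bba, chosen only for illustration.
WM, GM, TUMOR = "WM", "GM", "tumor"
frame = frozenset({WM, GM, TUMOR})
m = {
    frozenset({WM}): 0.5,      # mass committed exactly to "white matter"
    frozenset({WM, GM}): 0.3,  # mass that cannot be split between WM and GM
    frame: 0.2,                # total ignorance
}

for proposition in (frozenset({WM}), frozenset({GM}), frozenset({WM, GM})):
    print(sorted(proposition), belief(m, proposition), plausibility(m, proposition))
```

In this example the mass on the compound hypothesis {WM, GM} contributes to the plausibility of GM but not to its credibility, which is the gap between the two measures that the text describes.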

The quantity $\mathrm{Bel}(A)$ can be interpreted as the total amount of belief in the proposition $A$. The plausibility $\mathrm{Pl}(A)$ quantifies the maximum amount of belief potentially assigned to $A$. Thus the credibility and the plausibility are dual notions: the plausibility is defined by $\mathrm{Pl}(A) = \mathrm{Bel}(\Omega) - \mathrm{Bel}(\bar{A})$, where $\bar{A}$ is the complement of $A$ in the frame of discernment $\Omega$. Note that the functions $m$, Bel and Pl are three different representations of the same information [10]. The transformation of any of these functions into another is possible thanks to the Möbius transformation [15]. Note that if the set of focal elements $\mathcal{F}$ is only composed of singleton hypotheses, then $m$, Bel and Pl are three equivalent functions.

A key point within the evidence theory framework is to properly model the knowledge given by the different sources of information $S$ in order to initialize the associated bba $m$. Generally, the models depend on the considered problem. We can distinguish two main approaches: the distance-based models initially proposed by Denœux [16–18], which take into account the neighborhood information, and the models based on likelihood functions [1,4,10,19]. These different models are described more precisely in Section 4.2.

2.2. Belief discounting

If a source $S$, associated to a belief function $m$, is considered as not fully reliable, the belief issued from $S$ can be attenuated by a discounting process. If we call $\alpha$ the coefficient which represents the belief we have in the reliability of the source $S$, the discounted belief function $m^{\alpha}$ is defined by

$$m^{\alpha}(A) = \alpha\, m(A) \quad \forall A \subset \Omega,\ A \neq \Omega, \tag{7}$$

$$m^{\alpha}(\Omega) = 1 - \alpha + \alpha\, m(\Omega), \tag{8}$$

with $0 \le \alpha \le 1$.

2.3. Combination

In classification problems dealing with uncertain and imprecise data, it is often interesting to aggregate the information coming from different sources in order to obtain more relevant information. Evidence theory provides reliable tools to combine the knowledge given by different sources. The obtained information is the synthesis of all sources. Thus, the decision process is more confident because it takes into account the whole information of the sources, which is partially redundant and complementary.

The orthogonal rule, also called Dempster's rule of combination [10], is the first combination rule defined within the framework of evidence theory. Let us denote by $m_1, \ldots, m_J$ the $J$ masses of belief coming from $J$ distinct sources (see footnote 1). The belief function $m_{\oplus}$ resulting from the combination of the $J$ sources by means of Dempster's combination rule is defined by

$$m_{\oplus} = m_1 \oplus \cdots \oplus m_i \oplus \cdots \oplus m_J. \tag{9}$$

The operator $\oplus$ of Dempster's combination is associative and commutative. For all $A \subseteq \Omega$, $m_{\oplus}(A)$ is given by

$$m_{\oplus}(A) = \frac{m_{\cap}(A)}{1 - k} \quad \forall A \subseteq \Omega, \tag{10}$$

where $m_{\cap}$ is the conjunctive combination defined by

$$m_{\cap}(A) = \sum_{A_1 \cap \cdots \cap A_J = A} m_1(A_1)\, m_2(A_2) \cdots m_J(A_J) \quad \forall A \subseteq \Omega, \tag{11}$$

and where the term $k$ is given by

$$k = \sum_{A_1 \cap \cdots \cap A_J = \emptyset} m_1(A_1)\, m_2(A_2) \cdots m_J(A_J). \tag{12}$$

1 Note that Smets and Kruse detail in [20] the notion of distinctness: the belief of a particular source should not interfere with the belief of any other source.
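The discounting of Eqs. (7) and (8) and the combination of Eqs. (10)–(12) can be sketched in a few lines of Python for the case of two sources ($J = 2$). This is only a sketch under stated assumptions: bbas are represented as dictionaries from `frozenset` focal elements to masses, and the names `discount`, `conjunctive` and `dempster`, as well as the two-class example, are hypothetical and chosen for this illustration.

```python
from itertools import product


def discount(bba, alpha, frame):
    """Eqs. (7)-(8): scale every mass by alpha and move the remainder to the frame."""
    out = {a: alpha * mass for a, mass in bba.items() if a != frame}
    out[frame] = 1.0 - alpha + alpha * bba.get(frame, 0.0)
    return out


def conjunctive(m1, m2):
    """Eq. (11) for two sources: unnormalized conjunctive combination."""
    out = {}
    for (a1, v1), (a2, v2) in product(m1.items(), m2.items()):
        inter = a1 & a2
        out[inter] = out.get(inter, 0.0) + v1 * v2
    return out


def dempster(m1, m2):
    """Eq. (10): normalized combination; also returns the conflict k of Eq. (12)."""
    joint = conjunctive(m1, m2)
    k = joint.pop(frozenset(), 0.0)  # mass falling on the empty set
    if k >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    return {a: v / (1.0 - k) for a, v in joint.items()}, k


# Hypothetical two-class example with two partially contradictory sources.
frame = frozenset({"tumor", "edema"})
m1 = {frozenset({"tumor"}): 0.8, frame: 0.2}
m2 = {frozenset({"edema"}): 0.6, frame: 0.4}

combined, k = dempster(m1, m2)
print("conflict k =", k)
print({tuple(sorted(a)): round(v, 4) for a, v in combined.items()})
print(discount(m1, 0.95, frame))
```

Because the rule is associative, more than two sources can be handled by applying `dempster` pairwise. Returning `k` alongside the combined bba keeps the conflict available as a piece of information in its own right.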

The term $k$, with $0 \le k \le 1$, can be interpreted as a measure of the conflict between the $J$ sources to combine, and it is directly taken into account in the combination as a normalization factor. It also represents the mass attributed to the empty set if the masses are not normalized. Dempster's rule has been justified theoretically by several authors [21,22]. However, the normalization step has also been criticized [22–24]. It is very important to take into account the value of the normalization term: when it is high ($k \approx 1$), combining the sources is meaningless, leading to incoherence [14,24] and to counter-intuitive behaviors. Moreover, when $k = 1$ the sources are in complete opposition and the data fusion is impossible.

Different solutions have been proposed to deal with the conflict. In the case of reliable sources, Smets [20] supposes, as Dempster does, that the conflict can only come from a bad definition of the frame of discernment. In other words, the problem is ill-posed, and Smets proposes to avoid the normalization step. In this case, the value $k$ represents the mass assigned to one or several hypotheses that have not been initially taken into account. The advantage of this approach is that it allows a possible revision of the initial frame of discernment. Thus, Smets proposes a combination defined as follows:

$$m_S(A) = m_{\cap}(A) \quad \forall A \subseteq \Omega, \qquad m_S(\emptyset) = k. \tag{13}$$

Note that a similar reasoning is proposed by Yager [25]. In the case of non-reliable sources, solutions are proposed by Yager [25] and by Dubois and Prade [3]. In [26], Lefevre et al. define a formalism to describe a family of combination operators and develop a generic framework in order to unify the different classical rules of combination previously cited.

2.4. Decision making

For most applications, the decision that has to be taken is to choose a simple hypothesis. Within the particular context of the TBM, which deviates from the probabilistic approach of the theory, Smets opposes the credal level to the pignistic level. The first one consists in modeling the knowledge and combining the sources. The second one is fully oriented toward decision making [20]. Smets proposes a decision rule based on a probability function called the pignistic probability function [27], defined by


$$\mathrm{BetP}(H_n) = \sum_{A \subseteq \Omega,\, H_n \in A} \frac{m(A)}{|A|\,(1 - m(\emptyset))} \quad \forall H_n \in \Omega, \tag{14}$$

where $|A|$ is the cardinality of $A$ and $\Omega$ is the frame of discernment. Let $X$ denote a pattern to be assigned to one of the $N$ hypotheses of $\Omega$. In the context of decision theory, we consider a finite set of actions $\mathcal{A} = \{a_1, \ldots, a_N\}$. Typically, the action $a_i$ is associated to the assignment to the hypothesis $H_i$. We denote by $\lambda(a_i \mid H_n)$ the loss which occurs when we select $a_i$ whereas the truth is the hypothesis $H_n$. The Bayesian decision rule defines the expected loss associated to each possible action $a_i \in \mathcal{A}$ by

$$R_{\mathrm{BetP}}(a_i) = \sum_{H_j \in \Omega} \lambda(a_i \mid H_j)\, \mathrm{BetP}(H_j). \tag{15}$$

For a pattern vector $X$, the pignistic decision rule is then defined by

$$D_{\mathrm{BetP}}(X) = a_i \quad \text{with} \quad a_i = \arg\min_{a_j \in \mathcal{A}} R_{\mathrm{BetP}}(a_j). \tag{16}$$

If we consider the simplest case, where the losses are assumed to be equal to 1 for a bad decision and 0 for a correct decision ($\lambda(a_i \mid H_j) = 1 - \delta_{i,j}$), the pignistic decision rule is written as

$$D_{\mathrm{BetP}}(X) = a_i \quad \text{with} \quad a_i = \arg\max_{H_j \in \Omega} \mathrm{BetP}(H_j). \tag{17}$$

Other decision rules can be defined [28]. Considering $\{0, 1\}$ losses, a lower and an upper decision rule are defined by

$$D_{*}(X) = a_i \quad \text{with} \quad a_i = \arg\max_{H_j \in \Omega} \mathrm{Bel}(H_j), \tag{18}$$

$$D^{*}(X) = a_i \quad \text{with} \quad a_i = \arg\max_{H_j \in \Omega} \mathrm{Pl}(H_j). \tag{19}$$

Thus, maximizing the credibility corresponds to a pessimistic decision and maximizing the plausibility corresponds to an optimistic decision. In pattern recognition, it is often interesting to add a new action $a_r$, called rejection, used when the uncertainty is too high [29,30]. The rejection action $a_r$ is associated to a constant cost $\lambda(a_r \mid H_j) = C_r$. If the function $\Psi(\cdot)$ represents the credibility, the plausibility or the pignistic probability, then the decision rule is defined by

$$D(X) = \begin{cases} a_r, & \text{if } \max_{H_j \in \Omega} \Psi(H_j) < 1 - C_r, \\ a_i = \arg\max_{H_j \in \Omega} \Psi(H_j), & \text{with } \Psi(H_i) \ge 1 - C_r. \end{cases} \tag{20}$$

3. MR image segmentation and evidence theory

This section deals with a particular application in the field of medical imaging. It concerns the segmentation of multi-echo images for detection and 3D visualization of brain tumors.

3.1. Problematic

The problem that physicians and oncologists encounter, within the framework of brain tumor treatment, is to determine exactly the areas of extension and the volume of tumors. These parameters, besides the nature of the lesion, allow them to plan treatments. This work aims to supply efficient segmentation and brain volume estimation tools to help physicians in their diagnosis. For that purpose, we have, for the same patient, several MR brain volumes. Each volume, formed by a stack of slices, was acquired with the same MR scanner but with different acquisition parameters (e.g. T1 echo, T2 echo). Thus, each volume can be considered as an information source, and the data are numerous, redundant and complementary. The idea is to merge the data, as the specialist mentally does, in order to obtain a reliable segmentation and consequently a good estimation of the brain tumor volume.

3.2. The interest of using evidence theory

To perform the data fusion, we adopt evidence theory, which is well suited to the problem of brain tumor detection and to the nature of the data used:

• The images are obtained by the same imaging technique but with different acquisition parameters. We are within the framework of multi-echo analysis. The same scene, and thus the same anatomical regions, are observed, and the information to be processed is redundant.
• Every echo highlights some elements and some particular tissues that the others do not. For instance, the T1 echo is an anatomical modality, so the anatomical structures are well differentiated. The T2 echo highlights, totally or partially, tumors and œdemas. Using several different modalities thus implies complementarity of the data.
• Although scanner efficiency is increasing, we have to keep in mind that the images come from an acquisition system and therefore that their quality depends on the quality of the scanner. The images are, by nature, corrupted by noise (noise coming from the acquisition system or from the patient, quantization noise). The images are thus uncertain and imprecise.
• Furthermore, we are particularly interested in tumors with badly defined frontiers. Such tumor boundaries are already uncertain.
• Finally, we have seen in Section 2 that the conflict is an important piece of information in evidence theory, given by the combination process. However, from our point of view, the conflict is not only a consequence of the combination process but also an information source which increases the knowledge about the problem. We will see in Section 5.3.3 a possible use of the conflict in the segmentation process.
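Since the segmentation ultimately assigns each voxel to a single hypothesis, it may help to see the decision step of Section 2.4 in code. The sketch below computes the pignistic probability of Eq. (14) and applies the maximum rule of Eq. (17), optionally with the rejection threshold of Eq. (20). The dictionary-of-`frozenset` representation, the function names and the example masses are assumptions made for this illustration, and the sketch assumes that the mass of the empty set is strictly smaller than 1.

```python
def pignistic(bba, frame):
    """Eq. (14): share each mass equally among the hypotheses of its focal element."""
    norm = 1.0 - bba.get(frozenset(), 0.0)  # assumes m(empty set) < 1
    betp = {h: 0.0 for h in frame}
    for a, mass in bba.items():
        for h in a:
            betp[h] += mass / (len(a) * norm)
    return betp


def decide(scores, reject_cost=None):
    """Eq. (17), or Eq. (20) when reject_cost is given.

    scores maps each hypothesis to BetP, Bel or Pl; None stands for rejection.
    """
    best = max(scores, key=scores.get)
    if reject_cost is not None and scores[best] < 1.0 - reject_cost:
        return None
    return best


# Hypothetical bba on a three-class frame, for illustration only.
frame = frozenset({"WM", "GM", "tumor"})
m = {frozenset({"WM"}): 0.5, frozenset({"WM", "GM"}): 0.3, frame: 0.2}

betp = pignistic(m, frame)
print(betp)
print(decide(betp))                   # no rejection action
print(decide(betp, reject_cost=0.2))  # rejected if the best score is below 0.8
```

Passing the credibility or the plausibility of the singletons as `scores` instead of `betp` gives the pessimistic and optimistic rules of Eqs. (18) and (19).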

4. Segmentation scheme

We propose an automatic method for the segmentation of pathological multi-echo MR volumes. The method relies on two main points. Firstly, we model the data according to an evidential parametric model. In particular, we propose the use of three models. Secondly, we make use of spatial information through a weighted Dempster's combination rule.

4.1. Notations

Let $x_i$ denote the gray level of a voxel located on site $s$ of the MR volume $V_i$. If we consider the set of $p$ different echoes, the pattern vector associated to the voxel in $s$ is defined by

$$X = [x_1, \ldots, x_p]. \tag{21}$$

The space of characteristics $\mathcal{X}$ is a $p$-dimensional space defined by the gray levels of the volumes. Moreover, let $\mathcal{X}_t$ be a training set such that $\mathcal{X}_t \subseteq \mathcal{X}$. Let $\Omega = \{H_1, \ldots, H_N\}$ be the set of hypotheses (the solutions of the problem), where $N$ is a constant number, arbitrarily fixed. $\Omega$ is also called the frame of discernment. The classification problem is to divide the volume into $N$ classes. Ideally, each class represents a particular anatomical structure. Usually, $N$ is equal to 5 in order to segment the volume into white matter (WM), gray matter (GM), cerebrospinal fluid (CF) and two classes for tumor and œdema.

4.2. Evidential modeling

We propose the use of three evidential models: a distance-based model introduced by Denœux [16,31] and two models based on a likelihood function, the first one defined by Shafer [10] and the second one by Appriou [4].

4.2.1. Denœux's model

Denœux's model is a distance-based classifier derived from the k-nearest-neighbor (kNN) rule. For each pattern $X \in \mathcal{X}$ to classify, this model considers $k$ neighbors in the training set $\mathcal{X}_t$ as $k$ independent information sources which are used to determine the class of membership of $X$. The information brought by a training neighbor $X_s$, with $s \in \{1, \ldots, k\}$, is modeled by a bba $m_s$ defined on $\Omega$. If $X_s$ belongs to the hypothesis $H_n$, a weighted fraction of the unit mass is assigned to the singleton hypothesis $H_n$ and the rest to the frame of discernment $\Omega$ (so it quantifies the degree of ignorance). The mass $m_s(\{H_n\})$ is then a decreasing function of the distance $d_s$ between $X$ and $X_s$:

$$m_s(\{H_n\}) = \alpha\, \phi_n(d_s), \qquad m_s(\Omega) = 1 - \alpha\, \phi_n(d_s), \tag{22}$$

where $0 < \alpha < 1$ is a constant and $\phi_n$ is a monotonically decreasing function such that $\phi_n(0) = 1$ and $\lim_{d \to \infty} \phi_n(d) = 0$. We use for $\phi_n$ the following exponential function:

$$\phi_n(d_s) = \exp\left(-\gamma_n (d_s)^2\right), \tag{23}$$

where $\gamma_n > 0$ depends on the class $H_n$. Note that in [17], Zouhal and Denœux propose to optimize the parameter $\gamma_n$ by minimizing an error criterion. Since the $k$ neighbors are considered as $k$ independent sources of information modeled by $k$ bbas $m_s$, with $s = 1, \ldots, k$, a decision on the membership class of the vector $X$ can be made after aggregating the whole information with Dempster's rule of combination.

In order to cope with the high computation time, Denœux proposes a "prototypes" version. With this version, each hypothesis $H_n$ is associated to a center (a prototype) $x_n$ which is considered as a source of information. Thus $N$ bbas $m_n$, for $n = 1, \ldots, N$, are built according to Eq. (22), where the distance is the distance between the pattern $X$ and the center $x_n$. The final bba $m$ is obtained by merging the $N$ initial bbas.

4.2.2. Shafer's model

Shafer's model [10] is based on the likelihood function. We suppose that the conditional a priori probability function $f(X \mid H_n)$ is known; thus the conditional likelihood associated to the pattern $X$ is defined by $L(H_n \mid X) = f(X \mid H_n)$. The bba is defined thanks to the knowledge of all the hypotheses $H_n$: it is a global method. Moreover, it is a consonant method based on two axioms. Firstly, the plausibility of a simple hypothesis $H_n$ is proportional to its likelihood. Let us denote by $c$ a normalization factor. The plausibility is thus given by

$$\mathrm{Pl}(\{H_n\}) = c\, L(H_n \mid X) \quad \forall H_n \in \Omega. \tag{24}$$

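Secondly, if we assume the plausibility to be consonant, Pl verifies the condition

$$\mathrm{Pl}(A \cup B) = \max[\mathrm{Pl}(A), \mathrm{Pl}(B)] \quad \forall A, B \subseteq \Omega, \tag{25}$$

where $\Omega$ is the frame of discernment. The plausibility of a set $A$ is thus given by

$$\mathrm{Pl}(A) = c \max_{H_n \in A} L(H_n \mid X). \tag{26}$$

As the plausibility function verifies $\mathrm{Pl}(\Omega) = 1$, we deduce

$$\mathrm{Pl}(A) = \frac{\max_{H_n \in A} L(H_n \mid X)}{\max_{H_n \in \Omega} L(H_n \mid X)} \quad \forall A \subseteq \Omega. \tag{27}$$

The bba function is finally determined thanks to the Möbius transformation [15].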
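For a consonant plausibility, the Möbius transformation has a simple closed form: the focal elements are nested sets obtained by ranking the hypotheses by decreasing likelihood, and each mass is the difference between two consecutive normalized likelihoods. The Python sketch below builds the bba this way and checks it against Eq. (27). The function name, the dictionary representation and the likelihood values are assumptions made for this illustration, not the authors' implementation.

```python
def shafer_consonant_bba(likelihoods):
    """Consonant bba whose plausibility satisfies Eq. (27).

    likelihoods maps each hypothesis H_n to L(H_n | X); at least one value must be > 0.
    """
    top = max(likelihoods.values())
    ranked = sorted(likelihoods, key=likelihoods.get, reverse=True)
    # Normalized singleton plausibilities in decreasing order, padded with 0.
    pl = [likelihoods[h] / top for h in ranked] + [0.0]
    bba = {}
    for i in range(len(ranked)):
        mass = pl[i] - pl[i + 1]
        if mass > 0.0:
            bba[frozenset(ranked[: i + 1])] = mass  # nested focal element
    return bba


# Hypothetical likelihood values for one pattern X, for illustration only.
L = {"WM": 0.4, "GM": 0.2, "CF": 0.1}
bba = shafer_consonant_bba(L)
for focal, mass in bba.items():
    print(sorted(focal), mass)

# Check against Eq. (27): Pl({GM}) should equal L(GM) / max L.
pl_gm = sum(v for b, v in bba.items() if "GM" in b)
print(pl_gm, L["GM"] / max(L.values()))
```

The most likely hypothesis is the only one that receives mass as a singleton; the remaining mass goes to increasingly large sets, which is why this model tends to leave a substantial part of the belief uncommitted.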


4.2.3. Appriou's model

Appriou's model, like Shafer's, is based on likelihood functions $L(H_n \mid X)$. The models defined by Appriou satisfy three axioms [1]:

• consistency with the Bayesian approach;
• separability of the evaluation of the hypotheses $H_n$;
• consistency with the probabilistic association of sources $S_j$.

An exhaustive search shows that only two models satisfy these axioms [1]. For the first model, each information source $S_j$ is associated to $N$ elementary bbas defined on the frame of discernment $\Omega$ by

$$\begin{cases} m_{nj}(\{H_n\}) = 0, \\ m_{nj}(\bar{H}_n) = \alpha_{nj} \{1 - R_j L(H_n \mid x_j)\}, \\ m_{nj}(\Omega) = 1 - \alpha_{nj} \{1 - R_j L(H_n \mid x_j)\}. \end{cases} \tag{28}$$

For the second model, the bbas are defined by

$$\begin{cases} m_{nj}(\{H_n\}) = \alpha_{nj} R_j L(H_n \mid x_j) / \{1 + R_j L(H_n \mid x_j)\}, \\ m_{nj}(\bar{H}_n) = \alpha_{nj} / \{1 + R_j L(H_n \mid x_j)\}, \\ m_{nj}(\Omega) = 1 - \alpha_{nj}, \end{cases} \tag{29}$$

where $\bar{H}_n$ denotes the complement of $H_n$ in $\Omega$, and $R_j$ is a normalization factor constrained by

$$R_j \in \left[0, \left(\sup_{x_j} \max_{n \in [1, N]} \{L(H_n \mid x_j)\}\right)^{-1}\right], \tag{30}$$

and $\alpha_{nj}$ is a reliability factor depending on the hypothesis $H_n$ and on the source $S_j$. If the confidence in the training is high, Appriou proposes to fix $\alpha_{nj}$ to 1 and not to discount the bba. Otherwise, $\alpha_{nj}$ is fixed to 0.9. Other methods have been proposed to compute the reliability factors automatically [32,33]. A mass $m$ is finally obtained by computing the orthogonal sum of the different bbas $m_{nj}$ and $m_j$:

$$m_j(\cdot) = \bigoplus_{n} m_{nj}(\cdot), \tag{31}$$

$$m(\cdot) = \bigoplus_{j} m_j(\cdot). \tag{32}$$

According to [34], the first model seems to be preferable because it is more consistent with the Generalized Bayes Theorem (GBT) introduced by Smets [35]. In the sequel, we only deal with the first model proposed by Appriou. A particularity of Appriou's models is their multi-sensor aspect. Thus, in our application, each MR echo is considered as an information source and, therefore, each echo is processed separately before being merged according to the fusion process in order to build synthesized information. This is especially interesting for tumor segmentation with MR images, which provide complementary and redundant image information. Moreover, this choice is guided by the fact that, with a vectorial version, the values of the likelihood functions are very weak and consequently too large an amount of belief is placed on the set $\Omega$.

4.3. Model parameter estimation

In this part, we deal with the parameter estimation associated to the various belief models described above. In particular, building Denœux's model requires a training set. For the Appriou and Shafer models, likelihood functions need to be estimated. We suppose that each pattern $X$ is assigned to one hypothesis $H_n$ depending only on its intensity level. If we assume that each tissue class of the MR volume can be modeled by a Gaussian distribution [36–38], the conditional density function is defined by

$$f(X \mid H_n) = \frac{1}{(2\pi)^{p/2} |\Sigma_n|^{1/2}} \exp\left(-\frac{1}{2} (X - \mu_n)^T \Sigma_n^{-1} (X - \mu_n)\right), \tag{33}$$

where $\mu_n$ and $\Sigma_n$ are respectively the mean $p$-vector and the $p \times p$ covariance matrix associated to the hypothesis $H_n$. In the probabilistic context, $X$ is a particular realization of the stochastic process. Let $\mathcal{X}_t$ be the set of pattern vectors used to estimate the means and the variances. This training set is such that $\mathcal{X}_t \subseteq \mathcal{X}$, where $\mathcal{X}$ is the set of all the pattern vectors of the volume. If we suppose that the volume is composed of $N$ distinct classes, associated to the hypotheses $H_n$ for $n = 1, \ldots, N$, and that the realizations are independent and identically distributed (i.i.d.), we are in the presence of a mixture model described by the mean, the variance and the proportion of each class. Let $\Theta$ represent these parameters. Thus the conditional probability of $\mathcal{X}_t$ is defined by

$$P(\mathcal{X}_t \mid \Theta) = \prod_{X \in \mathcal{X}_t} \sum_{n=1}^{N} f(X \mid H_n)\, P(H_n). \tag{34}$$

The parameters of this mixture model are then estimated using the expectation–maximization (EM) algorithm [39]. The definition of the training set $\mathcal{X}_t$ is discussed in Section 5.1. The likelihood of Shafer's model is then defined using $L(H_n \mid X) = f(X \mid H_n)$ and Eq. (33). For Appriou's model, we need to estimate $p$ likelihood functions $L(\cdot \mid x_j)$, $j = 1, \ldots, p$. They are estimated by means of the EM algorithm while preserving the same frame of discernment for each of the information sources. Concerning Denœux's model, each estimated mean and variance is associated with a class prototype. For each pattern $X$, the belief function $m_n$ (Eq. (22)) is computed using the Mahalanobis distance defined by

$$d_s^2 = (X - \mu_n)^T \Sigma_n^{-1} (X - \mu_n). \tag{35}$$

The optimized $\gamma_n$ parameters (Eq. (23)) are computed using the classification obtained by the EM algorithm as the training set $\mathcal{X}_t$ and the method proposed in [17].
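The prototype version of Denœux's model with the Mahalanobis distance can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the class means and covariances are invented values standing in for the EM estimates, a single `gamma` is used for all classes instead of the optimized class-dependent $\gamma_n$, and the default `alpha = 0.95` is the discounting value reported in Section 5.1. The sketch relies on NumPy.

```python
import numpy as np


def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance of Eq. (35)."""
    diff = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    return float(diff @ np.linalg.solve(np.asarray(cov, dtype=float), diff))


def prototype_bbas(x, prototypes, alpha=0.95, gamma=1.0):
    """One bba per class prototype, following Eqs. (22)-(23).

    prototypes maps a class name to (mean vector, covariance matrix).
    gamma is a single hypothetical value; the paper optimizes one per class.
    """
    frame = frozenset(prototypes)
    bbas = []
    for name, (mean, cov) in prototypes.items():
        support = float(alpha * np.exp(-gamma * mahalanobis_sq(x, mean, cov)))
        bbas.append({frozenset({name}): support, frame: 1.0 - support})
    return bbas


# Hypothetical dual-echo prototypes (mean gray levels and covariances).
prototypes = {
    "WM": ([100.0, 80.0], [[25.0, 0.0], [0.0, 25.0]]),
    "tumor": ([60.0, 150.0], [[100.0, 0.0], [0.0, 100.0]]),
}

for bba in prototype_bbas([102.0, 83.0], prototypes):
    print({tuple(sorted(a)): round(v, 4) for a, v in bba.items()})
```

The resulting bbas would then be merged with Dempster's rule to obtain the final bba of the pattern. Using `np.linalg.solve` avoids forming the inverse covariance matrix explicitly.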


4.4. Introduction of spatial information

The different evidential models presented above make it possible to obtain a classification of all the patterns of the volume: the knowledge about each pattern is modeled by a bba and a decision (assignment to a class) is taken. However, such a scheme classifies each datum independently, without taking into account any information provided by its spatial neighborhood. The originality of the proposed scheme is to introduce spatial information within the framework of evidence theory. Indeed, within the framework of image segmentation, we implicitly suppose that each region of the image (equivalent to a class in the posed problem) shares the same properties. Thus, in a particular homogeneous region, each pattern, modeled by a bba, reinforces the knowledge about its neighbors. Moreover, if a corrupted pattern is present in this region, the knowledge provided by its neighbors attenuates its belief, as a denoising process would. The introduction of spatial information in the modeling process by merging data thus gives more accurate information about the patterns and leads to a more reliable decision. Moreover, we can now regard this process as a real segmentation process and not only as a classification process.

We denote by $m$ the bba of a pattern $X$ and by $m_k$ the bba of the $k$-th spatial neighbor $X_k$ of $X$. The basic idea is to consider that a pattern $X$ can probably be assigned to $H_n$ if the evidence that its $K$ spatial neighbors belong to $H_n$ is high. If we denote by $m'$ the bba of the pattern $X$ resulting from the introduction of spatial information, we propose to define this distance-based bba by

$$m' = m \oplus m_1^{\alpha_1} \oplus \cdots \oplus m_K^{\alpha_K}, \tag{36}$$

where $m_i^{\alpha_i}$, for $i = 1, \ldots, K$, is the discounted bba $m_i$. Thus, the resulting mass $m'$ depends on $m$, the bba associated to the pattern $X$, but also on the weighted masses of its $K$ neighbors. Intuitively, we consider that the contribution of a neighbor depends on its distance to the pattern $X$, which is to be assigned to one of the competing hypotheses supported by the pattern $X$ itself and by its neighbors. Thus, we propose to define $\alpha_k$ by

$$\alpha_k = \exp\{-(d_k)^2\}, \tag{37}$$

where $d_k$ is the Euclidean distance between $X$ and its spatial neighbor $X_k$.

Besides weighting the bba according to the position of the neighbors, another advantage of using a distance-based function is that it takes into account the true dimensions of the voxels. This is particularly interesting in MR imaging because the voxels are usually anisotropic: the z-distance of the voxel includes both the slice thickness and the distance between slices; consequently the z-distance is often twice as large as the x- or y-distance. The introduction of spatial information is then consistent with the data and avoids the approximations due to digitization. Given the mass $m'$ for each pattern $X$ of a considered volume, the decision rule consists in choosing a hypothesis $H_i$ which verifies Eq. (20).
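The spatial combination of Eqs. (36) and (37) can be sketched as below. The helper functions are redefined so that the block is self-contained. Everything named here is an assumption made for illustration: the function names, the dictionary-of-`frozenset` representation, the neighbor offsets expressed in voxels, and the voxel dimensions (0.93, 0.93, 1.2), which are the near-millimetric values quoted in Section 5.1. The sketch also returns the overall conflict, accumulated as one minus the product of the pairwise agreements $1 - k$; this equals the mass that the unnormalized conjunctive combination of all the sources would place on the empty set.

```python
import math


def discount(bba, alpha, frame):
    """Eqs. (7)-(8): attenuate a bba by the reliability coefficient alpha."""
    out = {a: alpha * v for a, v in bba.items() if a != frame}
    out[frame] = 1.0 - alpha + alpha * bba.get(frame, 0.0)
    return out


def dempster(m1, m2):
    """Dempster's rule for two bbas; returns the combined bba and the conflict."""
    joint = {}
    for a1, v1 in m1.items():
        for a2, v2 in m2.items():
            joint[a1 & a2] = joint.get(a1 & a2, 0.0) + v1 * v2
    k = joint.pop(frozenset(), 0.0)
    return {a: v / (1.0 - k) for a, v in joint.items()}, k


def spatial_bba(m, neighbors, frame, voxel_size=(1.0, 1.0, 1.0)):
    """Eq. (36): combine m with its neighbors, each discounted by Eq. (37).

    neighbors is a list of (bba, (dx, dy, dz)) with offsets counted in voxels.
    Returns the combined bba and the overall conflict between all sources.
    """
    combined, agreement = dict(m), 1.0
    for bba, offset in neighbors:
        d = math.sqrt(sum((o * s) ** 2 for o, s in zip(offset, voxel_size)))
        alpha = math.exp(-d ** 2)  # Eq. (37)
        combined, k = dempster(combined, discount(bba, alpha, frame))
        agreement *= 1.0 - k
    return combined, 1.0 - agreement


# Hypothetical voxel whose in-plane neighbor disagrees with it.
frame = frozenset({"tumor", "GM"})
m = {frozenset({"tumor"}): 0.6, frame: 0.4}
disagree = {frozenset({"GM"}): 0.9, frame: 0.1}

m_prime, conflict = spatial_bba(m, [(disagree, (1, 0, 0))], frame, (0.93, 0.93, 1.2))
print({tuple(sorted(a)): round(v, 4) for a, v in m_prime.items()}, round(conflict, 4))
```

With anisotropic voxels, the same neighbor placed one slice away along z receives a smaller $\alpha_k$ than an in-plane neighbor, so it influences the result less and generates less conflict.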

5. Experiments and results

5.1. Image description

The MR images were acquired on the 1.5 T scanner of the Regional University Hospital Center (CHRU) of Poitiers. For each patient, we have a pair of MR volumes of the same brain region. Depending on the case and the needs of the physicians, the pairs of echoes are {T1, T2}, {T1, T1 Gado} (see footnote 2) and {T1 Gado, T2}. Note that the method can easily be extended to $J$ sources with $J > 2$. Figs. 1(a) and (b) show a dual-echo image (respectively a T1 Gado echo and a T2 echo). These images highlight both the redundancy and the complementarity of the data. The data were acquired axially. Each volume is composed of voxels with near-millimetric dimensions: 0.93 × 0.93 × 1.2 mm. The voxels are anisotropic, but the slice thickness and the inter-slice spacing are such that slices do not overlap, which preserves the data quality. The acquisition protocols are such that the voxel dimensions and the acquired zones are the same for all the sequences, so the different volumes do not need to be registered. Moreover, the segmentation scheme is applied only to the brain region (the brain is first isolated from the whole data by a preprocessing step based on morphological operations [40]).

The segmentation scheme proposed in Section 4 is applied to six dual-echo volumes. For simplicity, we only present the results for a single patient, but similar behaviors are observed with the other data volumes. For the volume presented here, the expected anatomical regions are the WM, the GM, the CF, the tumor and the œdema. Thus, the number of classes $N$ equals 5. The three models presented in Section 4.2 are used and their behaviors compared. The parameters are estimated as described in Section 4.3.

In order to obtain a completely automatic segmentation (no user interaction) and to be sure that all the classes are represented in the training set $X_t$, we define $X_t = X$, where $X$ is composed of all the pattern vectors of the volume. For this particular volume, $X$ is composed of more than 800 000 patterns.
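As an illustration of how the set of pattern vectors can be assembled from a dual-echo acquisition, the following minimal sketch pairs the intensities of two co-located volumes inside a brain mask. The function name `build_patterns`, the nested-list volume layout and the toy intensities are assumptions made for this example; they are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# A volume is indexed as volume[slice][row][col] (assumed layout).
Volume = Sequence[Sequence[Sequence[float]]]


def build_patterns(
    echo_a: Volume, echo_b: Volume, brain_mask: Volume
) -> Tuple[List[Tuple[int, int, int]], List[Tuple[float, float]]]:
    """Collect one 2-D pattern vector per voxel lying inside the brain mask.

    Both echoes are assumed to share the same grid, so no registration
    step is performed here.
    """
    positions, patterns = [], []
    for z, (sa, sb, sm) in enumerate(zip(echo_a, echo_b, brain_mask)):
        for y, (ra, rb, rm) in enumerate(zip(sa, sb, sm)):
            for x, (va, vb, inside) in enumerate(zip(ra, rb, rm)):
                if inside:
                    positions.append((z, y, x))
                    patterns.append((float(va), float(vb)))
    return positions, patterns


# Toy one-slice volumes (hypothetical intensities).
t1_gado = [[[10, 20], [30, 40]]]
t2 = [[[1, 2], [3, 4]]]
mask = [[[1, 0], [1, 1]]]

positions, patterns = build_patterns(t1_gado, t2, mask)
print(len(patterns), "patterns; first:", patterns[0])
```

With the choice $X_t = X$ described above, the list `patterns` would serve both as the training set and as the data to classify.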

2 We call T1 Gado a T1-type acquisition for which a contrast agent, gadolinium, was injected into the patient.


Fig. 1. A dual-echo MR image. (a) T1 Gado echo (b) T2 echo.

The discounting parameters $\alpha$ (Eq. (22)) and $\alpha_{nj}$ (Eq. (28)) depend on the reliability we attribute to the hypotheses $H_n$ and to the training set $X_t$. However, the confidence in the training set $X_t$, obtained by an unsupervised process, is not complete, essentially because the image data characteristics depend on acquisition conditions that we did not control. Consequently, the bbas have to be discounted. Moreover, in order not to favor any hypothesis, we impose that these parameters are set to the same value. We thus set all the discounting parameters to 0.95. This constraint also implies that a part of the belief is assigned to the whole frame of discernment $\Omega$, which avoids any situation of total conflict when the bbas are combined. The decisions about class membership are taken using a loss equal to 1 for a wrong decision and 0 otherwise.

5.2. Comparison between the different models

5.2.1. Without using spatial information

In order to evaluate the impact of the neighborhood information on the different models, we first segment the volumes without using spatial information. This is equivalent to setting the parameters $\alpha_k = 0$ for $k$ in $[1, K]$ (Eq. (36)), where $K$ is the number of spatial neighbors. We present the final segmentation into five classes of a dual-echo {T1 Gado, T2} volume. The decision is taken by maximizing the pignistic probability function. Fig. 2 shows the segmentation results obtained with the three evidential models. The results appear consistent with the expected anatomical structures for all the models. However, we observe two different behaviors, which correspond to the information used to build the bbas: a distance-based approach for Denœux’s model and a likelihood-based approach for Appriou’s and Shafer’s models. Fig. 3 shows the evolution of the rejection rate when the rejection cost $C_r$ increases from 0 to 1. We observe different levels of rejection depending on the model: Appriou’s model rejects more than Shafer’s model, and Denœux’s model rejects more than Appriou’s model. This ordering corresponds to the specificity of the models: the more specific the model, the lower the rejection rate. The model to choose therefore depends on the classification strategy: with a careful approach, Denœux’s model is preferred; with an optimistic approach, Shafer’s model is preferred.

5.2.2. Introduction of spatial information

As explained in Section 4.4, spatial information is introduced in the segmentation scheme via the combination of the bba of a voxel with the bbas of its neighbors. This combination provides additional information which allows a more accurate decision. Fig. 4 shows the segmentation obtained when we introduce spatial information via the combination of masses of belief. The visual quality of the segmentation seems to increase. In particular, the WM is better recognized when Denœux’s model is used: the segmented region is more homogeneous and seems to be less corrupted by misclassifications. The differences obtained with and without spatial information are detailed in Section 5.3.
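To make the roles of the discounting value 0.95, the pignistic decision and the rejection cost more concrete, here is a minimal sketch. It assumes classical Shafer discounting (a fraction of each mass is kept and the remainder goes to the whole frame) and a 0/1 loss with a reject option, under which a voxel is rejected when the pignistic risk of the best class exceeds the rejection cost. The paper’s own Eqs. (20) and (22) are not reproduced in this section, so these forms, the helper names and the toy masses are assumptions.

```python
from typing import Dict, FrozenSet, Optional

Mass = Dict[FrozenSet[str], float]

# The N = 5 classes of the presented volume.
FRAME = frozenset({"WM", "GM", "CF", "tumor", "oedema"})


def discount(m: Mass, alpha: float, frame: FrozenSet[str] = FRAME) -> Mass:
    """Keep a fraction alpha of every mass and transfer the rest to the frame."""
    out = {A: alpha * v for A, v in m.items() if A != frame}
    out[frame] = 1.0 - alpha + alpha * m.get(frame, 0.0)
    return out


def bel(m: Mass, A: FrozenSet[str]) -> float:
    """Credibility: total mass of the non-empty subsets of A."""
    return sum(v for B, v in m.items() if B and B <= A)


def pl(m: Mass, A: FrozenSet[str]) -> float:
    """Plausibility: total mass of the sets intersecting A."""
    return sum(v for B, v in m.items() if B & A)


def betp(m: Mass, h: str) -> float:
    """Pignistic probability of the single hypothesis h (assumes m(empty) = 0)."""
    return sum(v / len(B) for B, v in m.items() if h in B)


def decide(m: Mass, reject_cost: float, frame: FrozenSet[str] = FRAME) -> Optional[str]:
    """Return the class maximizing BetP, or None when rejection is cheaper."""
    scores = {h: betp(m, h) for h in sorted(frame)}
    best = max(scores, key=scores.get)
    return best if 1.0 - scores[best] <= reject_cost else None


# Hypothetical bba for one voxel, before discounting.
raw = {
    frozenset({"tumor"}): 0.6,
    frozenset({"tumor", "oedema"}): 0.3,
    frozenset({"WM"}): 0.1,
}
m = discount(raw, 0.95)
tumor = frozenset({"tumor"})
print(bel(m, tumor), betp(m, "tumor"), pl(m, tumor))
print(decide(m, reject_cost=0.5), decide(m, reject_cost=0.2))
```

In this toy case the voxel is labeled as tumor for a rejection cost of 0.5 and rejected for a rejection cost of 0.2, which is the mechanism behind a rejection rate that decreases as the cost grows.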

Fig. 2. Segmentation results into 5 classes without using spatial information on a middle slice of the volume. From darker to lighter: CF, œdema, tumor, WM and GM. (a) Denœux’s model. (b) Appriou’s model. (c) Shafer’s model.


Fig. 3. Evolution of the rejection rate.

In order to analyze the quality of the different models within the context of our medical application, we compare their ability to detect the tumor and the œdema. For that purpose, we extract from the segmentation the regions corresponding to the tumor and the œdema. If we denote by 0 the label attributed to the darkest region of a segmentation and by 4 the label attributed to the lightest, then the tumor and œdema correspond to the largest connected component labeled 1 or 2. Note that these regions cannot be extracted in this way when neighborhood information is not included: many small regions remain connected to the œdema and the tumor. Thus, the neighborhood information is necessary for our application with this data volume. The extracted ROI, composed of the tumor and the œdema, is compared with an expert segmentation. Even though the expert frontiers are manually drawn by a radiologist, they cannot be considered as "the gold standard" because of the variability of expert decisions; they are nevertheless a good reference for quality evaluation. Note that this expert segmentation is never used during the segmentation steps: the segmentation is fully automatic. The evidential segmentations are evaluated via three classification rates: the true positive rate, the false positive rate and the false negative rate.

Table 1. Tumor classification result comparison (the tumor ground truth is composed of 72 405 points)

Model      True positives     False positives    False negatives
Denœux     69 857 (96.48%)    5233 (7.23%)       2548 (3.52%)
Appriou    64 181 (88.64%)    6500 (8.98%)       8224 (11.36%)
Shafer     67 288 (92.93%)    9708 (13.41%)      5117 (7.07%)
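The extraction of the largest connected component carrying the tumor and œdema labels can be sketched as follows. For brevity, this example works on a single 2-D slice with 4-connectivity, whereas the paper processes 3-D volumes and does not state the connectivity used; both choices, as well as the function name and the toy label map, are assumptions.

```python
from collections import deque
from typing import List, Set, Tuple


def largest_component(labels: List[List[int]], wanted: Set[int]) -> Set[Tuple[int, int]]:
    """Return the pixels of the largest 4-connected region whose label is in `wanted`."""
    rows, cols = len(labels), len(labels[0])
    seen: Set[Tuple[int, int]] = set()
    best: Set[Tuple[int, int]] = set()
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] not in wanted or (r, c) in seen:
                continue
            # Breadth-first flood fill from this unvisited seed pixel.
            component = set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                component.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (
                        0 <= ny < rows
                        and 0 <= nx < cols
                        and (ny, nx) not in seen
                        and labels[ny][nx] in wanted
                    ):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            if len(component) > len(best):
                best = component
    return best


# Toy label map: 0 is the darkest class and 4 the lightest; 1 and 2 form the ROI.
segmentation = [
    [0, 1, 1, 4],
    [3, 2, 1, 4],
    [3, 3, 4, 2],
    [1, 3, 4, 4],
]
roi = largest_component(segmentation, wanted={1, 2})
print(sorted(roi))
```

The two isolated pixels labeled 1 and 2 in the toy map are discarded, which mirrors why small residual regions must first be removed by the spatial combination before this extraction is meaningful.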

Table 1 summarizes the results. Considering the true positive rate, Denœux’s model provides the best detection rate (96.48% of the tumor volume is detected), even if the three models underestimate the tumor volume. However, we have to keep in mind that oncologists tend to over-estimate tumor volumes. In a medical application, the true positive rate alone is not sufficient to evaluate a segmentation result; the false positive rate and the false negative rate should also be considered. In particular, when the ROI is a pathological structure (such as a lesion), it is critical to avoid false negative decisions. Considering this rate, Denœux’s model is again the most efficient, with a false negative rate equal to 3.52% of the ground truth tumor volume. The false negative rates of Appriou’s and Shafer’s models are higher: 7.07% of the tumor volume is not detected with Shafer’s model and 11.36% with Appriou’s model. The false positive rate is less important when evaluating a tumor segmentation. However, we notice that this rate lies between 7.23% and 13.41%. Considering this rate, Shafer’s model provides the most careful segmentation. Finally, these rates are obtained without including any rejection class; the use of such a class should avoid some misclassifications. The study of this case shows that evidence theory seems to be well suited for tumor segmentation problems. The segmentation results are very encouraging and close to the expert segmentation. Moreover, the use of spatial information increases the visual accuracy of the segmentation (see the results obtained with Denœux’s model). The influence of the spatial information is further detailed in the next section.
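The percentages of Table 1 can be recomputed from the voxel counts. The sketch below assumes, as the reported figures suggest, that all three rates are expressed relative to the size of the ground truth tumor (true positives plus false negatives); the function names are illustrative.

```python
from typing import Dict, Set, Tuple


def rates_from_counts(tp: int, fp: int, fn: int) -> Dict[str, float]:
    """Rates in percent of the ground truth size (tp + fn)."""
    truth_size = tp + fn
    return {
        "true_positive": 100.0 * tp / truth_size,
        "false_positive": 100.0 * fp / truth_size,
        "false_negative": 100.0 * fn / truth_size,
    }


def counts_from_sets(detected: Set[Tuple[int, ...]], truth: Set[Tuple[int, ...]]) -> Tuple[int, int, int]:
    """Derive (tp, fp, fn) from sets of voxel coordinates."""
    return len(detected & truth), len(detected - truth), len(truth - detected)


# Voxel counts reported in Table 1.
table_1 = {
    "Denoeux": (69857, 5233, 2548),
    "Appriou": (64181, 6500, 8224),
    "Shafer": (67288, 9708, 5117),
}
for model, (tp, fp, fn) in table_1.items():
    rates = {name: round(value, 2) for name, value in rates_from_counts(tp, fp, fn).items()}
    print(model, tp + fn, rates)
```

For each model, the true positive and false negative counts add up to the 72 405 points of the ground truth, so the true positive and false negative rates sum to 100%, while the false positive rate is an independent quantity.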

Fig. 4. Segmentation results with the introduction of spatial information (the pignistic decision rule is used). (a) Denœux’s model. (b) Appriou’s model. (c) Shafer’s model.


5.3. Contribution of spatial information

This part analyzes the results obtained when spatial information is included in the segmentation process. We address three questions. What modifications does it bring to the final segmentation (Section 5.3.1)? How does it influence the behavior of the decision rules (Section 5.3.2)? What is the behavior of the conflicting information (Section 5.3.3)?

5.3.1. Segmentation results

This section compares the segmentations obtained with and without neighborhood information. We will see in particular how this additional information improves the segmentation and how it can resolve some ambiguous situations. Fig. 5(a) shows the segmentation obtained with the introduction of spatial information using Shafer’s model. The mass of each neighbor is attenuated by a factor which takes into account the distance to the voxel to classify and the dimensions of the voxels (Eq. (37)). Although the quality of the segmentation obtained with Shafer’s model without spatial information is already acceptable (Fig. 2(b)), we can notice some changes. Comparing Figs. 5(b) and (c), which are zooms of the segmentations obtained respectively without and with spatial information, we observe that many isolated points of the initial segmentation disappear when spatial information is introduced. Moreover, the regions are smoother, as can be seen on the boundaries of the œdema, which are more regular. Figs. 6(a) and (b) show the segmentations obtained using Shafer’s model with a pignistic decision rule, respectively without and with the introduction of spatial information.
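The attenuation of neighbor masses followed by their combination can be illustrated with the sketch below. The attenuation factor of Eq. (37) is not reproduced in this section, so the exponential decay used in `neighbour_weight` is purely a hypothetical stand-in; only the idea that the factor depends on the physical distance, computed from the 0.93 × 0.93 × 1.2 mm voxel dimensions, comes from the text. Dempster’s normalized rule is used for the combination.

```python
import math
from itertools import product
from typing import Dict, FrozenSet, List, Tuple

Mass = Dict[FrozenSet[str], float]

FRAME = frozenset({"WM", "GM", "CF", "tumor", "oedema"})
VOXEL_MM = (1.2, 0.93, 0.93)  # (slice, row, column) spacing in millimetres


def neighbour_weight(offset: Tuple[int, int, int]) -> float:
    """Hypothetical attenuation factor: decays with the physical distance in mm."""
    distance = math.sqrt(sum((o * s) ** 2 for o, s in zip(offset, VOXEL_MM)))
    return math.exp(-distance)


def discount(m: Mass, alpha: float) -> Mass:
    """Keep a fraction alpha of every mass and transfer the rest to the frame."""
    out = {A: alpha * v for A, v in m.items() if A != FRAME}
    out[FRAME] = 1.0 - alpha + alpha * m.get(FRAME, 0.0)
    return out


def dempster(m1: Mass, m2: Mass) -> Tuple[Mass, float]:
    """Dempster's rule: returns the normalized combination and the conflict."""
    raw: Mass = {}
    conflict = 0.0
    for (A, a), (B, b) in product(m1.items(), m2.items()):
        C = A & B
        if C:
            raw[C] = raw.get(C, 0.0) + a * b
        else:
            conflict += a * b
    if conflict >= 1.0:
        raise ValueError("total conflict: the combination is undefined")
    return {C: v / (1.0 - conflict) for C, v in raw.items()}, conflict


def fuse_with_neighbours(centre: Mass, neighbours: List[Tuple[Tuple[int, int, int], Mass]]) -> Mass:
    """Combine the bba of a voxel with the attenuated bbas of its neighbours."""
    fused = centre
    for offset, m in neighbours:
        fused, _ = dempster(fused, discount(m, neighbour_weight(offset)))
    return fused


WM, GM = frozenset({"WM"}), frozenset({"GM"})
centre = {WM: 0.45, GM: 0.45, FRAME: 0.10}      # ambiguous voxel
confident_wm = {WM: 0.9, FRAME: 0.1}            # neighbour supporting WM
fused = fuse_with_neighbours(centre, [((0, 0, 1), confident_wm), ((0, 1, 0), confident_wm)])
print({"/".join(sorted(A)): round(v, 3) for A, v in fused.items()})
```

In this toy case, an initially ambiguous voxel ends up favoring WM once two in-plane neighbors supporting WM are taken into account, which is the re-classification effect discussed in this section.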

Fig. 5. Eﬀects of the introduction of spatial information. (a) Segmentation result with Shafer’s model, spatial information and pignistic decision on the same slice as Fig. 2(b). On the right, (b) and (c) are respectively a zoom without and with spatial information.

Fig. 6. Segmentation results with Shafer’s model. The rejection rate is equal to 0.7. In white, the rejected voxels. (a) Without spatial information. (b) With spatial information.

The white points represent the voxels rejected by the classification. The rejection rate equals 0.7. Without spatial information (Fig. 6(a)), the rejected points are located both on the boundaries of the regions and inside the regions. The first kind of rejection is an ambiguity rejection: the point is close to several regions. The second kind of rejection is a distance rejection: the point is too far from the regions, so it could be a noisy point or a point which does not belong to any class of the learning set. With the use of spatial information (Fig. 6(b)), the rejected points are mainly located on the boundaries and are less numerous. It seems that the combination of spatial information eliminates most of the noisy points and reduces the ambiguity on the boundaries. The segmentation is thus more accurate when spatial information is used: the boundaries are smoothed while the finest details are kept. These results are obtained thanks to the addition of new information and to the use of discounting factors which tune the contribution of the different voxels of the neighborhood. The neighbors confirm or invalidate the previous knowledge, and some ambiguous or noisy points are re-classified: this is a spatially guided belief revision process. The discounting factors allow us to smooth the regions finely while keeping the details. Without discounting factors, the segmentation would be coarser and of poorer quality.

5.3.2. Modification of the decision plots

As explained in Section 2.4, different decision functions can be used with evidence theory. In this part, we are interested in the modifications of the decision plots introduced by the use of spatial neighborhood information. For that purpose, we study the variations of the rejection rate when the rejection cost $C_r$ varies from 0 to 1. Fig. 7 shows the results obtained with and without spatial information for Appriou’s model. In Fig. 7(a), we recover the properties of the credibility, pignistic


Fig. 7. Evolution of the number of the rejection points according to the rejection threshold for Appriou’s model. (a) Without spatial information. (b) With spatial information.

and plausibility decision rules: $Bel(X) \le BetP(X) \le Pl(X)$ for all $X \in \Omega$. With the introduction of spatial information, we see in Fig. 7(b) that the three plots tend to converge asymptotically. Choosing a pessimistic or an optimistic decision becomes almost equivalent. Moreover, the rejection rate is globally lower when spatial information is used. The same behavior is observed for Denœux’s and Shafer’s models.

5.3.3. Conflicting information

This section deals with the origin and the interpretation of the conflicting information. Mechanically, the conflicting information results from the combination of masses of belief and reflects how strongly the sources are opposed. If the conflict is maximum ($k = 1$), the sources are in complete opposition. On the contrary, when $k = 0$, the sources completely agree (Section 2.3). However, the interpretation and the usage of the conflict are not obvious. Three main reasons are usually put forward to interpret the conflicting information [26]. The first one deals with the quality of the sources: some abnormal measurements induce conflicting mass during the combination. The second reason is linked to the modeling of the masses of belief: the use of an inappropriate or imprecise model induces some variations in the belief function and produces conflict. The last reason relates to the number of sources to combine: the more sources are combined, the larger the amount of conflict. In our application context, two main processes induce the conflicting information. On one side, the conflict can be attributed to the fact that the models are imprecise. Even if the learning step is efficient, there is a vagueness which corrupts the bbas. Their combination thus induces conflict. This conflict is inherent to the training: the less efficient the training, the higher the conflict. It corrupts the belief functions and decreases the classification efficiency. Thus, it is undesirable information.
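The two extreme cases of the conflict, and its growth with the number of combined sources, can be checked numerically. The sketch below uses the unnormalized conjunctive rule so that the conflict accumulates as the mass of the empty set; this is a presentation choice for the example, and the toy masses are hypothetical.

```python
from functools import reduce
from itertools import product
from typing import Dict, FrozenSet

Mass = Dict[FrozenSet[str], float]
EMPTY: FrozenSet[str] = frozenset()


def conjunctive(m1: Mass, m2: Mass) -> Mass:
    """Unnormalized conjunctive combination: the conflict is kept on the empty set."""
    out: Mass = {}
    for (A, a), (B, b) in product(m1.items(), m2.items()):
        C = A & B
        out[C] = out.get(C, 0.0) + a * b
    return out


def conflict(*masses: Mass) -> float:
    """Mass of the empty set after combining all the given sources."""
    return reduce(conjunctive, masses).get(EMPTY, 0.0)


tumor_only = {frozenset({"tumor"}): 1.0}
wm_only = {frozenset({"WM"}): 1.0}
soft = {frozenset({"WM"}): 0.6, frozenset({"GM"}): 0.4}

print("agreeing sources:", conflict(tumor_only, tumor_only))
print("opposed sources:", conflict(tumor_only, wm_only))
print("two soft sources:", round(conflict(soft, soft), 2))
print("three soft sources:", round(conflict(soft, soft, soft), 2))
```

Two identical categorical sources produce no conflict, two opposed categorical sources produce total conflict, and repeating the same non-categorical source increases the conflict even though no new disagreement is introduced.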
On the other side, some conflict appears when we combine the neighboring bbas (Section 4.4). This information is now expected. Indeed, if we consider that the different regions of an image are homogeneous, the combination of neighbors near the frontiers of the regions induces conflicting information which reflects the frontiers. Fig. 8 shows the conflict generated by the spatial combination for the three models. In each image, a high conflict is expressed by a high grey level and the absence of conflict corresponds to black points. We observe that the conflict is mainly located on the boundaries of the regions. Inside a region, i.e. in homogeneous regions of the volume, the pieces of information brought by the different neighbors agree and consolidate the previous knowledge: the generated conflict is low. On the contrary, near the boundaries, the beliefs associated with the neighbors are opposed. One part of the neighbors supports one hypothesis whereas the other part supports another hypothesis, thus creating a high conflict. However, we should distinguish the conflict obtained with Appriou’s model from the two others. Indeed, Appriou’s model is a multi-sensor one. Each echo is a sensor which has its own interpretation of the scene. In particular, the tumor is not represented in the same way in the two echoes, which naturally induces conflict. Thus, in Fig. 8 (middle), the conflict is high not only on the frontiers but also on the tumor, which complicates the interpretation of this information. The interpretation and the use of the conflicting information are complex because it is impossible to separate the quantity of conflict due to the vagueness of the training from the one due to local information about the frontiers. However, we can suppose that the latter corresponds to the dominant case. Thus, we can say that the local conflict maxima correspond to the existence of a frontier and indicate that the decision about the class membership is not

Fig. 8. Conﬂicting information generated by the combination of 26 neighbors (from left to right Denœux, Appriou and Shafer models). A high conﬂict is expressed by a high grey level and the absence of conﬂict corresponds to black points.


obvious. Thus, we propose to use the conflicting information as an image of the confidence in the decisions of the segmentation process. Let $c(s)$ be the intensity of the conflict located in voxel $s$. We define the confidence $c_{conf}(s)$ in the segmentation at $s$ by

$$c_{conf}(s) = \begin{cases} 0 & \text{if } c(s) < k_c, \\ c(s) & \text{if } c(s) \ge k_c, \end{cases} \qquad (38)$$

where $k_c$ is a threshold constrained by $0 \le k_c \le 1$. Experiments show that a threshold $k_c = \mu_c$, where $\mu_c$ is the mean conflict of the whole volume, deletes most of the conflict located inside the regions while keeping the frontier information, as shown in Fig. 9(a). A particular case appears with Appriou’s model. Some conflicting points located in the tumor and œdema are kept in the confidence image. This behavior is due to the multi-sensor aspect of the model, which highlights the complementarity between the echoes. However, it is still possible to extract the conflict frontiers using a higher threshold. Fig. 9(b) shows the confidence image with $k_c = \mu_c + 0.25\sigma_c$, where $\sigma_c$ is the standard deviation of the conflict. The frontiers between œdema and tumor then appear. Globally, the conflict information matches the frontiers between the main anatomical structures (WM, GM, CSF). However, this information differs for the tumor and œdema. Denœux’s model mainly highlights the tumor frontiers. Shafer’s model mainly highlights the œdema frontiers. Finally, Appriou’s model both

Fig. 9. Confidence images obtained using Eq. (38). (a) $k_c = \mu_c$; (b) $k_c = \mu_c + 0.25\sigma_c$.

highlights œdema and tumor frontiers. Thus, the three models seem to provide complementary information. Figs. 10(a) and (b) show the confidence images superimposed on the original T1 Gado slice. As one can see, the conflict is located at the interfaces between the different anatomical structures and is a good indicator of the location of the frontiers. Within the context of the aid to

Fig. 10. Confidence images superimposed on the original T1 Gado slice. (a) $k_c = \mu_c$; (b) $k_c = \mu_c + 0.25\sigma_c$.


decision making, the observation of such an image can help the user (e.g. the physicians) to adjust his belief and his decision.
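The thresholding of Eq. (38), with the two thresholds used for Figs. 9 and 10 (the mean conflict, and the mean plus 0.25 standard deviation), can be sketched as follows. The example works on a small 2-D conflict map with hypothetical values and uses the population standard deviation; the paper does not state which estimator is used, so that choice is an assumption.

```python
from statistics import fmean, pstdev
from typing import List, Tuple

ConflictMap = List[List[float]]


def confidence_image(conflict: ConflictMap, n_std: float = 0.0) -> Tuple[ConflictMap, float]:
    """Apply Eq. (38): keep c(s) where it reaches the threshold, else set it to 0.

    The threshold is the mean conflict plus n_std standard deviations.
    """
    values = [c for row in conflict for c in row]
    threshold = fmean(values) + n_std * pstdev(values)
    image = [[c if c >= threshold else 0.0 for c in row] for row in conflict]
    return image, threshold


# Hypothetical conflict values: a high-conflict frontier on the right-hand side.
conflict_map = [
    [0.05, 0.10, 0.80],
    [0.05, 0.60, 0.70],
    [0.00, 0.10, 0.32],
]

img_mean, k_mean = confidence_image(conflict_map)            # threshold: mean
img_high, k_high = confidence_image(conflict_map, 0.25)      # threshold: mean + 0.25 std
print(round(k_mean, 3), img_mean)
print(round(k_high, 3), img_high)
```

Raising the threshold removes the moderate conflict value at the bottom right while the strongest frontier responses are preserved, which is the effect exploited to isolate the frontiers with Appriou’s model.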

6. Conclusion

In this paper, we have presented an original scheme for the segmentation of multi-echo MR images of the brain for the detection of brain tumors. This scheme combines the modeling of the knowledge by means of evidence theory with the integration of spatial neighborhood information. The modeling via evidence theory makes it possible to take into account the characteristics of multi-echo MR images: complementarity, redundancy and incompleteness. Moreover, evidence theory brings the theoretical support and the tools for the combination of such information. The introduction of spatial information (i.e. of the spatial dependency) allows us to obtain an image segmentation process. Our work focuses on the analysis of different evidential models and on the influence of the introduction of spatial information.

The study of different evidential models shows their effectiveness in segmenting MR images. Thanks to the introduction of neighborhood information, the evidence about the class membership of each voxel of the volume grows and the final decision is more accurate. The effects of using neighborhood information were studied according to different parameters. In particular, we have shown that the neighborhood information provides a more accurate segmentation: the fine regions are better detected and the boundaries of the different regions are smoothed while the small details are preserved. The process takes into account the local information and leads to a true region-based segmentation. By eliminating the ambiguities produced by noisy data and boundary points, the use of spatial information modifies the decision plots: taking a pessimistic decision becomes almost equivalent to taking an optimistic decision. The remaining ambiguities are mainly located on the boundaries of the regions. Finally, the comparison of the detected tumor volumes with a segmentation manually made by a physician highlights the ability of our evidential segmentation scheme to segment pathological MR images. In particular, among the three models that we have studied, the one proposed by Denœux, which detects more than 96% of the tumor volume, is the most accurate.

In a second part of our study, we were interested in the conflicting information created by the combination of spatial information. The location of the conflicting points confirms the previous observations: they are concentrated on the boundaries of the regions within the images. We show that the conflicting information, which we can quantify by evidence theory, is not only a consequence of data fusion but it is also a source


of information about the spatial organization of the data: the conflict is meaningful information that we interpret in the segmentation context. In our case, it indicates the position of the boundaries of the regions, which brings a new source of information to the specialist to soften his decision. This study showed that evidence theory using neighborhood information can enhance the segmentation results of multi-echo images. We are currently continuing our investigations on the analysis of the conflicting information and on its more complete integration in the segmentation process.

Acknowledgements

The authors thank J.C. Ferrie of the regional university hospital of Poitiers for providing the MR images used in this study and for validating the segmentation results.

References

[1] A. Appriou, Probabilités et incertitudes en fusion de données multi-senseurs, Revue Scientifique et Technique de la Défense 11 (1991) 27–40.
[2] I. Bloch, Incertitude, imprécision et additivité en fusion de données: point de vue historique, Traitement du Signal 13 (4) (1996) 267–288.
[3] D. Dubois, H. Prade, Representation and combination of uncertainty with belief functions and possibility measures, Computational Intelligence 4 (1998) 244–264.
[4] A. Appriou, Multisensor signal processing in the framework of the theory of evidence, NATO/RTO – Lecture Series 216 on Application of Mathematical Signal Processing Techniques to Mission Systems, November 1999, pp. 5–31.
[5] A. Dempster, Upper and lower probabilities induced by multivalued mapping, Annals of Mathematical Statistics 38 (1967) 325–339.
[6] R.H. Lee, R. Leahy, Multi-spectral classification of MR images using sensor fusion approaches, in: SPIE Medical Imaging IV: Image Processing 1233, 1990, pp. 149–157.
[7] S.Y. Chen, W.C. Lin, C.T. Chen, Evidential reasoning based on Dempster–Shafer theory and its application to medical image analysis, in: SPIE 2032, 1993, pp. 35–46.
[8] D.Y. Suh, R.M. Mersereau, R.L. Eisner, R.I. Pettigrew, Automatic boundary detection on cardiac magnetic resonance image sequences for four dimensional visualisation of the left ventricle, in: First Conference on Visualization in Biomedical Computing, Atlanta, 1990, pp. 149–156.
[9] I. Bloch, Some aspects of Dempster–Shafer evidence theory for classification of multi-modality medical images taking partial volume into account, Pattern Recognition Letters 17 (1996) 905–919.
[10] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, Princeton, NJ, 1976.
[11] P. Smets, What is Dempster–Shafer's model? in: R.R. Yager, M. Fedrizzi, J. Kacprzyk (Eds.), Advances in the Dempster–Shafer Theory of Evidence, New York, pp. 5–34.
[12] P. Smets, R. Kennes, The transferable belief model, Artificial Intelligence 66 (2) (1994) 191–234.
[13] P. Smets, Practical uses of belief functions, in: Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, San Francisco, 1999, pp. 612–621.
[14] P. Smets, The combination of evidence in the transferable belief model, IEEE Transactions on Pattern Analysis and Machine Intelligence 12 (5) (1990) 447–458.
[15] R. Kennes, Computational aspect of the Möbius Transformation of graphs, IEEE Transactions on Systems, Man and Cybernetics 22 (1992) 201–223.
[16] L.M. Zouhal, T. Denœux, An adaptative k-NN rule based on Dempster–Shafer theory, in: 6th International Conference on Computer Analysis of Images and Pattern, September 1995, pp. 310–317.
[17] L.M. Zouhal, T. Denœux, An evidence-theoretic K-NN rule with parameter optimization, IEEE Transactions on Systems, Man and Cybernetics 28 (1998) 263–271.
[18] T. Denœux, A neural network classifier based on Dempster–Shafer theory, IEEE Transactions on Systems, Man and Cybernetics 30 (2) (2000) 131–150.
[19] P. Walley, Belief function representations of statistical evidence, The Annals of Statistics 15 (4) (1987) 1439–1465.
[20] P. Smets, R. Kruse, The transferable belief model for belief representation, in: A. Motro, P. Smets (Eds.), Uncertainty Management in Information Systems: From Needs to Solutions, Kluwer, Boston, 1997.
[21] D. Dubois, H. Prade, On the unicity of Dempster rule of combination, International Journal of Intelligent System (1996) 133–142.
[22] F. Voorbraak, On the justification of Dempster's rule of combinations, Artificial Intelligence (1991) 171–197.
[23] P. Smets, Resolving misunderstandings about belief functions, International Journal of Approximate Reasoning 6 (1992) 321–344.
[24] L.A. Zadeh, On the validity of Dempster's rule of combination of evidence, University of California, Berkeley, 1979, ERL Memo M79/24.
[25] R.R. Yager, On the Dempster–Shafer framework and new combination rules, Information Sciences 41 (1987) 93–137.
[26] E. Lefevre, O. Colot, P. Vannoorenberghe, Belief function combination and conflict management, Information Fusion (2002) 149–162.
[27] P. Smets, in: M. Henrion, R.D. Schachter, L.N. Kanal, J.F. Lemmers (Eds.), Constructing the Pignistic Probability Function in a Context of Uncertainty, Amsterdam, North-Holland, 1990.
[28] T. Denœux, Analysis of evidence-theoretic decision rules for pattern classification, Pattern Recognition 30 (7) (1997) 1095–1107.
[29] C.K. Chow, On optimum recognition error and reject tradeoff, IEEE Transactions on Information Theory 16 (1970) 41–46.
[30] B. Dubuisson, M. Masson, A statistical decision rule with incomplete knowledge about classes, Pattern Recognition 26 (1) (1993) 155–165.
[31] T. Denœux, An evidence-theoric neural network classifier, IEEE International Conference on Systems, Man and Cybernetics 3 (1995) 712–717.
[32] E. Lefevre, O. Colot, P. Vannoorenberghe, D. de Brucq, Contribution des mesures d'information à la modélisation crédibiliste de connaissances, Traitement du Signal 17 (2) (2000) 1–11.
[33] S. Mathevet, L. Trassoudaine, P. Checchin, J. Alizon, Application de la théorie de l'évidence à la combinaison de segmentations en région, in: GRETSI, Vannes, France, 1999, pp. 635–639.
[34] P. Vannoorenberghe, T. Denœux, Likelihood-based vs distance-based evidential classifiers, in: FUZZ-IEEE'2001, Melbourne, Australia, December 2001.
[35] P. Smets, Belief functions: The disjunctive rule of combination and the Generalized Bayesian Theorem, International Journal of Approximate Reasoning 9 (1993).
[36] L.P. Clarke, R.P. Velthuizen, S. Phuphanich, J.D. Schellenberg, J.A. Arrington, M. Silberger, MRI: stability of three supervised segmentation techniques, Magnetic Resonance Imaging 11 (1993) 95–106.
[37] G. Gerig, J. Martin, R. Kinikis, O. Kubler, M. Shenton, F.A. Jolesz, Unsupervised tissue type segmentation of 3D dual-echo MR head data, Image and Vision Computing 10 (1992) 349–360.
[38] J.R. Mitchell, S.J. Kalik, D.H. Lee, A. Fenster, Computer-assisted identification and quantification of multiple sclerosis lesions in MR imaging volumes in the brain, Journal of Magnetic Resonance Imaging 4 (1994) 197–208.
[39] A. Dempster, N. Laird, D. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society 39 (1977) 1–38.
[40] A.-S. Capelle, O. Alata, C. Fernandez, S. Lefevre, J.-C. Ferrie, Unsupervised segmentation for automatic detection of brain tumors in MRI, in: IEEE International Conference on Image Processing, ICIP2000, vol. 1, Vancouver, Canada, 10–13 September 2000, pp. 613–616.
