Segmentation of brain tumors in 4D MR images using the hidden Markov model

Computer Methods and Programs in Biomedicine 84 (2006) 76–85
journal homepage: www.intl.elsevierhealth.com/journals/cmpb

Jeffrey Solomon a,∗, John A. Butman b, Arun Sood c

a Medical Numerics Inc., Sterling, VA, USA
b Department of Radiology, National Institutes of Health, Bethesda, MD, USA
c Center for Image Analysis, George Mason University, Fairfax, VA, USA

∗ Corresponding author at: National Institutes of Health, Bldg. 10, Room 1C543, 10 Center Drive, Bethesda, MD 20854, USA. Tel.: +1 301 435 9355; fax: +1 301 480 9774. E-mail address: [email protected] (J. Solomon).
doi:10.1016/j.cmpb.2006.09.007

Article info

Article history:
Received 11 July 2006
Received in revised form 10 September 2006
Accepted 10 September 2006

Keywords:
Brain tumor
Expectation-maximization
Hidden Markov model
Software

Abstract

Tumor size is an objective measure that is used to evaluate the effectiveness of anticancer agents. Responses to therapy are categorized as complete response, partial response, stable disease and progressive disease. Implicit in this scheme is the change in the tumor over time; however, most tumor segmentation algorithms do not use temporal information. Here we introduce an automated method using probabilistic reasoning over both space and time to segment brain tumors from 4D spatio-temporal MRI data. The 3D expectation-maximization method is extended using the hidden Markov model to infer tumor classification based on previous and subsequent segmentation results. Spatial coherence via a Markov Random Field was included in the 3D spatial model. Simulated images as well as patient images from three independent sources were used to validate this method. The sensitivity and specificity of tumor segmentation using this spatio-temporal model is improved over commonly used spatial or temporal models alone.

© 2006 Elsevier Ireland Ltd. All rights reserved.

1. Introduction

The most accepted objective response to anticancer agents is a reduction in tumor size. For brain tumors, contrast enhanced MRI can delineate the extent of tumor, and is used routinely to diagnose, stage and track the progression of brain tumors [1]. For both patient care as well as for clinical research trials, it is standard to measure the size of the tumor at baseline, and to perform serial MRIs to determine whether the tumor decreases in size (responds to therapy) or increases in size (progresses). Therefore, accurate determination of response vs. progression is essential. Currently, clinical methods of tracking progression emphasize ease of use in clinical practice. The RECIST criteria standardize simple linear measurements of the greatest diameter. Some approaches multiply the greatest diameter by the greatest perpendicular diameter as seen on a chosen MR section [2,3], despite the fact that most tumors

are not elliptical (e.g. Fig. 1). Because of variations in head positioning in the scanner from trial to trial, measurements based on a single 2D measurement are not necessarily reproducible. Planimetry is a manual method of tumor delineation in which all MRI sections containing the tumor are traced and summed to achieve a volumetric measure. While manual tracing of the tumor borders has been shown to be more accurate than diameter measurements [4], this technique is not used in clinical practice due to the large amount of time involved in making these measurements. As the spatial resolution of MRI systems increases, yielding more sections to analyze, manual tracing will become even less practical. These factors, as well as ongoing research requiring the classification of tissues in the brain (gray matter, white matter and cerebral spinal fluid), have led to a number of research articles focusing on automatic brain tissue segmentation. Recent methods have focused on EM with a Gaussian Mixture model [5,6]. A smaller number of articles focus on change detection of lesions (multiple sclerosis lesions, brain tumors) [7–9] using image subtraction or optic flow methods. There are two major flaws with using the reported temporal techniques to segment brain lesions. First, detection requires a change in shape or intensity; if there is minimal change in the size of the tumor from one time to the next, change detection schemes will not detect the lesion. Second, with subtraction or optic flow methods, intensity change and shape change can be confounded since these two measures are indistinguishable. The primary goal of this paper is to report a method which uses best of class 3D spatial as well as novel temporal techniques to segment brain tumors in MRIs. This combined approach allows for the best segmentation of tumors at each time point and makes use of previous and subsequent segmentation results in order to optimize the segmentation. This 4D segmentation method is an improvement over previous 3D spatial or change detection techniques as it does not require significant change in tumor size for detection. In addition, false positives from iso-intense normal tissue are greatly reduced.

Fig. 1 – A T1-weighted, gadolinium enhanced MRI illustrating an irregularly shaped brain tumor.

2. Computational methods and theory

2.1. 3D spatial segmentation

A popular class of 3D brain segmentation methods uses Bayesian techniques [5,6] in which the posterior probability of the voxel's tissue class given the voxel intensity is estimated using Eq. (1).

P(Ci | yj) = P(yj | Ci) P(Ci) / P(yj)    (1)

Here Ci represents the ith tissue class for pixel j, whose intensity is yj. P(Ci) is the prior probability of tissue class Ci being in the image, and P(yj) is given by the Gaussian mixture model in Eq. (2).

P(yj) = Σ_{i=1}^{k} P(Ci) P(yj | Ci)    (2)

Since P(yj) is the same for all possible tissue classes Ci for voxel j, P(yj) serves as a normalization constant, making the sum of all tissue class probabilities for a given voxel equal to 1. The conditional probability P(yj | Ci) represents the probability of having voxel intensity yj given that it is from class Ci. The distribution of voxel intensities for a given tissue class is assumed to be Gaussian with mean μi and standard deviation σi. Therefore, given a tissue class Ci, we can estimate the conditional probability from this Gaussian distribution as seen in Eq. (3).

P(yj | Ci) = (1 / (√(2π) σi)) exp(−(yj − μi)² / (2σi²))    (3)

Typically, the Gaussian parameters (μ, σ) are unknown and must be estimated. Each voxel is segmented into the appropriate tissue class, based on knowledge of these estimated parameters. In cases of "missing data", where both the Gaussian parameters and the tissue classes for each voxel are unknown, the EM algorithm has been widely used. The algorithm is iterative and is described here:

(1) Initialize the parameters of the Gaussians (means, standard deviations, amplitudes) by setting them arbitrarily (or use prior knowledge if available).

(2) The expectation step computes the probability of pixel j being in tissue class Ci, given as P(Ci | yj):

P(Ci | yj) = α P(yj | Ci) P(Ci)    (4)

where α is a normalization constant.

(3) The maximization step computes the updated means, standard deviations and class amplitudes based on the E-step above:

ni = Σ_j P(Ci | yj)    (5)

μi = Σ_j P(Ci | yj) yj / ni    (6)

σi = √( Σ_j P(Ci | yj) (yj − μi)² / ni )    (7)

P(Ci) = ni / [total number of voxels]    (8)

where ni is the number of voxels in class Ci.

(4) Iterate between steps (2) and (3) until convergence.
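To make the iteration concrete, the following is a minimal sketch of this EM loop for a one-dimensional Gaussian mixture over voxel intensities (plain NumPy). The number of classes, the percentile-based initialization and the convergence test are illustrative assumptions of the sketch, not values taken from the paper.

```python
import numpy as np

def em_gmm_segmentation(intensities, n_classes=4, n_iter=50, tol=1e-6):
    """Fit a 1D Gaussian mixture to voxel intensities with EM and return
    the per-voxel class posteriors P(Ci | yj) of Eqs. (1)-(8)."""
    y = np.asarray(intensities, dtype=float).reshape(-1)

    # (1) Initialize means, standard deviations and class priors (amplitudes).
    mu = np.percentile(y, np.linspace(5, 95, n_classes))
    sigma = np.full(n_classes, y.std() + 1e-6)
    prior = np.full(n_classes, 1.0 / n_classes)

    prev_ll = -np.inf
    for _ in range(n_iter):
        # (2) E-step: P(Ci | yj) proportional to P(yj | Ci) P(Ci), Eqs. (3) and (4).
        lik = np.exp(-(y[:, None] - mu) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)
        joint = lik * prior                              # numerator of Eq. (1)
        evidence = joint.sum(axis=1, keepdims=True)      # P(yj), Eq. (2)
        post = joint / np.maximum(evidence, 1e-300)

        # (3) M-step: update parameters, Eqs. (5)-(8).
        n_i = post.sum(axis=0)                                                   # Eq. (5)
        mu = (post * y[:, None]).sum(axis=0) / n_i                               # Eq. (6)
        sigma = np.sqrt((post * (y[:, None] - mu) ** 2).sum(axis=0) / n_i) + 1e-6  # Eq. (7)
        prior = n_i / y.size                                                     # Eq. (8)

        # (4) Iterate until the data log-likelihood stops improving.
        ll = np.log(np.maximum(evidence, 1e-300)).sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll

    return post, mu, sigma, prior

# Hard segmentation: the most probable tissue class per voxel.
# labels = em_gmm_segmentation(volume.ravel())[0].argmax(axis=1)
```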

Additional modeling can be included in the EM context to improve the segmentation process. The EM algorithm described above is based solely on voxel intensities and imposes no spatial constraints. The Markov Random Field, implemented as a Gibbs distribution, has been used to model spatial coherence among neighboring voxels [10]. To integrate this model into the EM algorithm, the Gibbs distribution

P(Ci) = (1/Z) exp(−U(Ci)/T)    (9)

is substituted for P(Ci) in Eq. (1) as a prior probability of the segmentation. In this case, P(Ci) no longer represents the percentage of tissue class Ci in the image. In Eq. (9), P(Ci) is the prior probability of classification Ci, Z and T are constants, and U(Ci) is the Gibbs potential given by:

U(Ci) = [ Σ_{n=1}^{4} Vxy(m, n) + Σ_{l=1}^{2} Vz(m, l) ] / 2    (10)

Vxy represents the in-plane clique potential, with m being the current voxel and n representing its clique neighbors. Vz represents the out-of-plane clique potential, with l representing the clique neighbors one MRI section above and one MRI section below. The divisor of 2 is due to duplication of cliques in the image matrix. The values of Vxy and Vz are given below:

Vxy = −1 if m = n; +1 if m ≠ n    (11)

Vz = −2 if m = l; +2 if m ≠ l    (12)

The values of ±2 for Vz were chosen so that spatial coherence of out of plane neighbors would have less effect than in plane neighbors.
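To illustrate how this prior enters the E-step, the sketch below evaluates the Gibbs potential of Eq. (10) and the prior of Eq. (9) for a single voxel from the current labels of its 4 in-plane and 2 out-of-plane neighbors. Setting T = 1 and normalizing over classes in place of the constant Z are assumptions of the sketch, not choices stated in the paper.

```python
import numpy as np

V_XY_SAME, V_XY_DIFF = -1.0, +1.0   # in-plane clique potentials, Eq. (11)
V_Z_SAME, V_Z_DIFF = -2.0, +2.0     # out-of-plane clique potentials, Eq. (12)

def gibbs_prior(inplane_labels, outplane_labels, n_classes, T=1.0):
    """Prior P(Ci) of Eq. (9) for every candidate class at one voxel, given the
    current labels of its 4 in-plane and 2 out-of-plane neighbors."""
    priors = np.empty(n_classes)
    for c in range(n_classes):
        u_xy = sum(V_XY_SAME if lbl == c else V_XY_DIFF for lbl in inplane_labels)
        u_z = sum(V_Z_SAME if lbl == c else V_Z_DIFF for lbl in outplane_labels)
        u = (u_xy + u_z) / 2.0                 # Gibbs potential U(Ci), Eq. (10)
        priors[c] = np.exp(-u / T)
    return priors / priors.sum()               # 1/Z treated as a per-voxel normalizer

# A voxel whose neighbors are mostly labeled tumor (class 2) receives a tumor-heavy prior:
# gibbs_prior(inplane_labels=[2, 2, 2, 1], outplane_labels=[2, 0], n_classes=4)
```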

2.2. Hidden Markov model

For cases of tracking the segmentation of brain tissue over time in a given subject, the 3D spatial segmentation process is repeated independently for each time point [11]. We hypothesize that improvement can be gained by utilizing segmentation results from previous and subsequent 3D segmentations when estimating the tissue classes for the current scan. This paper introduces a novel temporal method which uses probabilistic reasoning over time combined with probabilistic reasoning over space to segment 4D brain MRIs. The temporal model is a Dynamic Bayesian Network, specifically the hidden Markov model, with image based transition and sensor models. We first considered the state of each voxel in our 4D medical image data to be lesion status, with possible values of lesion and non-lesion. The problem of segmenting and tracking lesions over time can be modeled as a HMM. The state of the system (lesion/non-lesion), represented in this model as Xt, is not observable. What is observable is the MR signal intensity at each voxel, Et. A 1st order Markov assumption (the current state depends only on the previous state) was made. Eq. (13) represents this 1st order Markov state-to-state "transition model".

P(Xt | X0:t−1) = P(Xt | Xt−1)    (transition model)    (13)

The evidence variable Et depends only on the current state. This dependence is known as the sensor model.

P(Et | X0:t, E1:t−1) = P(Et | Xt)    (sensor model)    (14)

Each voxel in the 4D image will independently follow the HMM. Spatial coherence among neighboring voxels is modeled with Markov Random Fields.

Fig. 2 – Dynamic Bayesian Network describing the state of voxel i in terms of its tissue class (lesion/non-lesion). The tissue class for voxel i at time t depends on its class at time t − 1 via a simple transition matrix. The voxel intensity Ei at time t, depends on the tissue class at time t via a sensor model.


This model is shown graphically in Fig. 2. In this simplistic case, if the voxel was classified as a lesion (e.g. tumor) at time t − 1, the probability that it will be a tumor at time t is 0.9. The probability of the voxel having intensity Ei given that it is classified as tumor is 0.7. Improved transition and sensor models are developed below as a significant improvement over these simplistic models. In a HMM process known as filtering, one computes the posterior distribution over the current state given all evidence to date, P(Xt | E1:t). In our case, if the state Xt equals lesion or non-lesion and Et is equal to the pixel's intensity, then P(Xt | E1:t) provides the tissue classification given the pixel's intensity values. Using Bayes' rule, a recursive formula is developed to estimate a pixel's state at time t + 1 as shown in Eq. (15).

P(Xt+1 | E1:t+1) = α P(Et+1 | Xt+1) Σ_{xt} P(Xt+1 | xt) P(xt | E1:t)    (15)

The term α represents a normalizing constant which sums probabilities to 1 [12]. The current state is projected forward from t to t + 1 using the transition matrix and then updated using the sensor model with the new evidence Et+1. Another HMM process known as smoothing is used to compute past states given evidence up to the present. In other words, it is possible to re-estimate the state at time points 1–4 given that you have estimates up to time point 5. This will theoretically improve the estimates of these states by using more of the information gathered. The following expression represents the probability of state Xk (1 ≤ k < t), given all evidence acquired so far.

P(Xk | E1:t) = α P(Xk | E1:k) P(Ek+1:t | Xk)    (16)

The right side of this equation contains the filtering expression from Eq. (15) and a backward expression, P(Ek+1:t | Xk). A recursive expression for this backward term is provided in Eq. (17),

P(Ek+1:t | Xk) = Σ_{xk+1} P(Ek+1 | xk+1) P(Ek+2:t | xk+1) P(xk+1 | Xk)    (17)

where the recursion is in the term P(Ek+2:t | xk+1).

2.2.1. Transition and sensor models

A transition model was developed that is dependent on the distance from the lesion border. The biological significance of this is that the likelihood of a transition from lesion to non-lesion or vice versa is greater when the voxel is close to the lesion border (margin). This new model is shown in Eq. (18).

T = P(Xt | Xt−1) = [ 1 − exp(−dist/C)    exp(−dist/C)
                     exp(−dist/C)        1 − exp(−dist/C) ]    (18)

The coefficient C is a constant, and dist represents the distance from the border of the lesion. The optimum value of C was determined empirically to be 3. This transition model is therefore unique for each voxel in the image and varies with each volume in the time series. For convenience, a distance map image can be pre-computed, based on the estimated lesion border, so that the distance values for the transition model can be applied on the fly. At each step in time, a new distance map is calculated, based on the lesion estimate for that image time point. The transition model described here requires an estimate of the lesion border on the first scan in the series. It is possible to create this estimate via the EM algorithm and some manual correction, or by manual tracing alone. The sensor model used is given in Eqs. (19) and (20):

P(Et | Xt(L)) = Al (1 / (√(2π) σl)) exp(−(y − μl)² / (2σl²))    (19)

P(Et | Xt(NL)) = Σ_{i≠l} Ai (1 / (√(2π) σi)) exp(−(y − μi)² / (2σi²))    (20)

where the means and standard deviations of the tissue classes are derived using the EM algorithm. The probability at time t of having a pixel value equal to Et, given that this pixel represents lesion (state Xt(L)), is proportional to the Gaussian of the lesion tissue class l and the weighting term Al. The EM algorithm derives the Ai term, which represents the Gibbs distribution from Eq. (9) for tissue class i. The Gibbs distribution is part of the MRF model and weights the probability of having a voxel intensity of Et given the voxel state Xt by the state of the voxel's neighbors (i.e. whether they have been classified as lesion or not). The probability of having the pixel value Et, given that the pixel does not represent lesion (state Xt(NL)), is proportional to the sum of the other Gaussians in the mixture model. A coefficient α is used to make the probabilities sum to 1, as seen in Eq. (21).

α Σ_{i=NL,L} P(Et | Xt(i)) = 1    (21)

These transition and sensor models were used to apply Eqs. (15) and (16) and estimate whether a voxel is lesion or not based on the evidence of all brain scans. By combining the EM and HMM algorithms into a 4D spatio-temporal technique, one may segment and track lesions over time. This should theoretically show improvement over temporal methods that can only detect lesion change and spatial methods that do not make use of prior segmentations of the same object.
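A minimal per-voxel sketch of how these pieces could fit together is given below: a distance map from the current lesion estimate drives the transition matrix of Eq. (18), the EM/MRF-weighted Gaussians provide the sensor likelihoods of Eqs. (19)–(21), and forward filtering (Eq. (15)) followed by backward smoothing (Eqs. (16) and (17)) is run over the time series. SciPy's Euclidean distance transform and the helper interfaces are assumptions of the sketch, not the authors' implementation.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

C = 3.0  # empirically determined constant from the text

def border_distance(lesion_mask):
    """Voxelwise distance (in voxels) from the border of the current lesion estimate."""
    lesion_mask = lesion_mask.astype(bool)
    return np.where(lesion_mask,
                    distance_transform_edt(lesion_mask),
                    distance_transform_edt(~lesion_mask))

def transition_matrix(dist):
    """Two-state T = P(X_t | X_{t-1}) of Eq. (18); state order [non-lesion, lesion]."""
    p = np.exp(-dist / C)
    return np.array([[1.0 - p, p],
                     [p, 1.0 - p]])

def sensor_likelihood(y, mu, sigma, gibbs_w, lesion_class):
    """P(E_t | X_t) for [non-lesion, lesion], Eqs. (19)-(21); mu, sigma and gibbs_w
    are per-class arrays taken from the EM + MRF step."""
    g = gibbs_w * np.exp(-(y - mu) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)
    p_lesion = g[lesion_class]
    p_non = g.sum() - p_lesion
    return np.array([p_non, p_lesion]) / (p_non + p_lesion)   # alpha of Eq. (21)

def filter_and_smooth(likelihoods, transitions, prior=(0.5, 0.5)):
    """Forward filtering (Eq. (15)) then backward smoothing (Eqs. (16)-(17)) for one
    voxel; likelihoods[t] is the sensor vector at time t and transitions[t] projects
    the state from t-1 to t (transitions[0] is unused)."""
    n = len(likelihoods)
    fwd = np.zeros((n, 2))
    belief = np.asarray(prior, float)
    for t in range(n):
        if t > 0:
            belief = transitions[t].T @ fwd[t - 1]    # project forward through T
        f = likelihoods[t] * belief                   # weigh in the new evidence
        fwd[t] = f / f.sum()
    smoothed = np.zeros_like(fwd)
    bwd = np.ones(2)                                  # backward message of Eq. (17)
    for t in range(n - 1, -1, -1):
        s = fwd[t] * bwd                              # Eq. (16)
        smoothed[t] = s / s.sum()
        if t > 0:
            bwd = transitions[t] @ (likelihoods[t] * bwd)
    return smoothed                                   # P(X_k | E_1:t) per time point
```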

2.3. Refinement of transition model

A refinement was made to the transition matrix to support multiple tissue classes. Recall that the transition matrix P(Xt+1 |Xt ) represents the probability of a voxel’s state at time t + 1 given its state at time t. In this two state transition matrix, the possible states of a voxel are lesion and non-lesion. We can benefit from classifying voxels in the image according to all of the tissue types. For these MR brain images, the possible tissue classes are (1) cerebral spinal fluid (CSF), (2) normal parenchyma (gray and white matter), (3) tumor and (4) blood vessels. It is hypothesized that by introducing a multi-tissue class transition matrix, fewer false positives will occur if voxels have the potential to be classified as blood vessels.


Table 1 – Transition matrix including four tissue types

                   Tumor(t+1)           Parenchyma(t+1)      Vessel(t+1)          CSF(t+1)
Tumor(t)           1 − exp(−dist/C)     (1/3)exp(−dist/C)    (1/3)exp(−dist/C)    (1/3)exp(−dist/C)
Parenchyma(t)      (1/3)exp(−dist/C)    1 − exp(−dist/C)     (1/3)exp(−dist/C)    (1/3)exp(−dist/C)
Vessel(t)          (1/3)exp(−dist/C)    (1/3)exp(−dist/C)    1 − exp(−dist/C)     (1/3)exp(−dist/C)
CSF(t)             (1/3)exp(−dist/C)    (1/3)exp(−dist/C)    (1/3)exp(−dist/C)    1 − exp(−dist/C)

The matrix cells represent the probability of a voxel's state being a particular tissue class at time t + 1 given the state at time t. As the distance from the tumor border increases, the likelihood of transitioning between tissue classes decreases.

The transition matrix of Eq. (18) was modified to include other tissue classes as seen in Table 1. The distance from the lesion border is the basis of these transition probabilities: the closer a voxel is to the tumor border, the more likely there will be a "mass effect", or movement of tissue, causing a transition from the voxel's current tissue class to another. The temporal IBSR image data and the image data acquired at the NIH were re-analyzed with this model.
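A sketch of how the matrix in Table 1 could be constructed for one voxel is given below; the even (1/3) split of exp(−dist/C) across the off-diagonal entries follows the table, while the class ordering and carrying over C = 3 from Eq. (18) are assumptions of the sketch.

```python
import numpy as np

CLASSES = ("tumor", "parenchyma", "vessel", "CSF")
C = 3.0

def multiclass_transition(dist, n_classes=len(CLASSES)):
    """Table 1: P(class at t+1 | class at t) for one voxel at distance `dist`
    from the current tumor border estimate."""
    p_change = np.exp(-dist / C)            # probability mass for leaving the class
    off_diag = p_change / (n_classes - 1)   # the (1/3)exp(-dist/C) entries
    T = np.full((n_classes, n_classes), off_diag)
    np.fill_diagonal(T, 1.0 - p_change)
    return T

# Far from the border the matrix approaches the identity, so distant iso-intense
# vessels have little chance of transitioning into the tumor class:
# print(multiclass_transition(dist=20.0).round(3))
```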

3. Imaging data

3.1. Simulated image data

Simulated image data was generated in order to validate our novel 4D segmentation techniques. These synthetic images were based on a Gaussian mixture model whose voxel intensities were derived from three Gaussian distributions. Two of these distributions were used to model two different normal tissues, and the other distribution was used to model tumor as well as iso-intense normal tissue. The variance of the tumor intensity distribution is greater than that of normal tissue, as is the case in actual MR image data. The iso-intense normal tissue is used to model vessels, which have similar intensities to tumor in contrast-enhanced brain MR images. The parameters for these distributions (mean, standard deviation) were obtained from actual MRI lesion segmentation results from previous work [13]. The three volumes form a 4D temporal data set modeling exponential lesion growth by changing only the size of the synthetic tumor, while leaving other tissues stationary. Fig. 3 shows a sample slice from the volumes over time. By knowing the exact size and location of the simulated tumor as it grows, we have ground truth to validate our various segmentation results. The measures chosen to report our results were sensitivity, measured as the percentage of true lesion detected, and similarity, as reported by the Jaccard measure (intersection of the segmentation with ground truth divided by the union of the segmentation with ground truth). The range of the Jaccard similarity is from 0 to 1.
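Both validation measures are straightforward to compute from binary masks; a minimal NumPy sketch (boolean segmentation and ground-truth volumes assumed) follows.

```python
import numpy as np

def sensitivity(seg, truth):
    """Percentage of true lesion voxels detected by the segmentation."""
    seg, truth = np.asarray(seg, bool), np.asarray(truth, bool)
    return np.logical_and(seg, truth).sum() / truth.sum()

def jaccard(seg, truth):
    """Jaccard similarity: |segmentation ∩ ground truth| / |segmentation ∪ ground truth|."""
    seg, truth = np.asarray(seg, bool), np.asarray(truth, bool)
    return np.logical_and(seg, truth).sum() / np.logical_or(seg, truth).sum()
```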

3.2. Patient image data

In addition to simulated data, we sought to validate the method on actual MR brain images. Three distinct sets of MRI data were used for validation. First, to assess the 3D segmentation aspect of our algorithm, we used the Internet Brain Segmentation Repository (IBSR) data set. Its purpose is to encourage the development and evaluation of segmentation methods by providing raw test and image data, human expert segmentation results, and methods for comparing segmentation results. The 20 normal MR brain data sets and their manual segmentations were provided by the Center for Morphometric Analysis at Massachusetts General Hospital and are available at http://www.cma.mgh.harvard.edu/ibsr/. Other segmentation papers have reported results from this 20 subject MR data set and we wished to compare our segmentation results. The second set of validation data also came from the IBSR. This MRI data was derived from a single subject with a brain tumor who had five scans performed with time intervals of approximately 6 months. This data set had ground truth results derived manually by IBSR staff experts. The goal of this validation was to compare the 3D segmentation (EM) run independently on each time point with the 4D method described here. The third set of data was collected at the National Institutes of Health. This data consisted of five brain tumor subjects, each having three scans with an average time interval of 1.7 months.

Fig. 3 – Simulated images representing three normal tissues and an exponentially growing tumor. One of the three normal tissues is iso-intense with the tumor modeling blood vessels in a contrast enhanced MR image.


Table 2 – Three sets of real MRI data were used to validate our method

Data category     MR scanner                                Voxel size                                     Acquisition
IBSR normal       1.5 T Siemens / 1.5 T General Electric    1.0 mm in-plane, 3.0 mm slice thickness        T1-weighted, coronal
IBSR temporal     1.5 T General Electric                    0.9375 mm in-plane, 3.1 mm slice thickness     T1-weighted + gadolinium, coronal
NIH               1.5 T General Electric                    0.9375 mm in-plane, 1.0 mm slice thickness     T1-weighted + gadolinium, sagittal

This table describes the image characteristics as well as the source of this image data.

The ground truth for these five subjects was derived by manual tracing, slice by slice, by a neuroradiologist (JAB). The parameters of all three MRI data sets are shown in Table 2.

4. System description

A pipeline approach was used to process the series of MR images for each patient (Fig. 4). Two pre-processing steps were required before running the 4D segmentation method. The first step was image registration. This image alignment was required to track the segmentation of a voxel over time, as mis-alignment would cause mixing of voxels in the temporal model of a single voxel. There are many methods of automatic image registration; we used FLIRT [14] with the cross-correlation cost function and sinc interpolation due to its robustness and accuracy. FLIRT was developed by the FMRIB Lab at Oxford, England and is fully automated. The second pre-processing step was to remove all non-brain voxels (scalp fat, head and neck musculature, bone marrow) from the MR image. This process is commonly known as de-skulling. A popular method for de-skulling, known as the Brain Extraction Tool (BET), which was also developed at the FMRIB Lab, was evaluated and did not perform well on the brain tumor MR images. These images are acquired with contrast agents, making vessels appear hyper-intense. These vessels create bridges between tissues in the brain and the skull, making automated delineation of the brain more difficult. We developed an approach which uses registration to a standard brain template to de-skull. The process works as follows. The input brain MRI is automatically registered to a standard template brain (ICBM Single Subject MRI Anatomical Template)

using a 12 parameter affine transformation. Once the registration is complete, the transformation matrix is stored. This transformation is applied in the reverse direction on a previously de-skulled and morphologically dilated version of the template brain. Once this de-skulled template is transformed into the space of the original brain image, it is used as a binary mask to de-skull the input brain. A final pre-processing step involved manual tracing of the brain tumor on the first time point only. This step may also be performed semi-automatically by looking at the results of the 3D EM segmentation on this time point and removing any vessels that were segmented as tumor.
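As a rough illustration of this template-based de-skulling step (not the authors' MEDx/ImageScript implementation), the sketch below resamples a dilated, de-skulled template mask into subject space with the inverse of a stored affine and applies it as a binary mask. The file names, the use of nibabel and SciPy, and the assumption that the stored matrix is a world-coordinate affine are all illustrative.

```python
import numpy as np
import nibabel as nib
from scipy.ndimage import affine_transform

# Hypothetical inputs: the subject scan, a previously de-skulled and dilated
# template brain mask, and a 4x4 subject-world -> template-world affine
# (converting a FLIRT .mat to world coordinates is a separate, FSL-specific step).
subj = nib.load("subject_t1_gad.nii.gz")
mask_tpl = nib.load("template_brain_mask_dilated.nii.gz")
subj_to_tpl_world = np.loadtxt("subject_to_template_world.txt")

# Map subject voxel indices to template voxel indices:
# v_template = inv(template_affine) @ world_affine @ subject_affine @ v_subject
vox_map = np.linalg.inv(mask_tpl.affine) @ subj_to_tpl_world @ subj.affine
matrix, offset = vox_map[:3, :3], vox_map[:3, 3]

# Pull the template mask back into subject space (nearest neighbour keeps it binary),
# i.e. apply the transformation "in the reverse direction" as described above.
mask_in_subj = affine_transform(mask_tpl.get_fdata(), matrix, offset=offset,
                                output_shape=subj.shape, order=0)

# Use the resampled template as a binary mask to de-skull the input brain.
deskulled = subj.get_fdata() * (mask_in_subj > 0.5)
nib.save(nib.Nifti1Image(deskulled, subj.affine, subj.header), "subject_deskulled.nii.gz")
```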

4.1. Software description

The MEDx image analysis and visualization software (Medical Numerics, Inc., Sterling, VA) was used to visualize the segmentation results. This software package provides an API known as ImageScript. ImageScript is an extension of the Tcl/Tk programming languages and provides function calls for image analysis operations as well as a mechanism to execute external compiled programs. ImageScript code was written to execute the FLIRT program for image registration as well as to perform the brain de-skulling described above. The 4D brain tumor segmentation described in Section 2 was implemented in the C programming language and can execute stand-alone or be integrated into MEDx for visualization using ImageScript. This software was compiled for the Linux and Solaris operating systems. The typical processing time for the entire pipeline described in Fig. 4 is less than 15 min on a Dell computer using a 3.2 GHz Pentium 4 processor with 2 GB of RAM. In this typical scenario, the matrix size of a patient's MR scan was 256 × 256 × 124 voxels and there were 3–5 MR scans acquired over time.
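For readers without MEDx, the registration step could be scripted directly against FLIRT in a similar spirit; a hedged sketch follows (FSL's flirt is assumed to be installed and on the PATH, the file names are illustrative, and normcorr is used here as the closest available match to the cross-correlation cost function named above).

```python
import subprocess
from pathlib import Path

def register_to_baseline(baseline, followups, out_dir="registered"):
    """Affine-align each follow-up scan in a patient's series to the baseline scan."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for scan in followups:
        stem = Path(scan).name.split(".")[0]
        subprocess.run(
            ["flirt",
             "-in", str(scan),
             "-ref", str(baseline),
             "-out", str(out / f"{stem}_reg.nii.gz"),
             "-omat", str(out / f"{stem}_to_baseline.mat"),
             "-dof", "12",
             "-cost", "normcorr",      # closest available match to cross-correlation
             "-interp", "sinc"],
            check=True)

# register_to_baseline("scan_t0.nii.gz", ["scan_t1.nii.gz", "scan_t2.nii.gz"])
```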

Fig. 4 – MR images are sent from the scanner to a workstation via the DICOM protocol. The MR images are automatically registered using an automated linear registration program called FLIRT [14]. The brain MRIs are then automatically de-skulled. The 4D data is segmented using probabilistic reasoning over space and time via the EM + HMM method. As part of this technique, the segmentation of the first time point is manually corrected by removing enhancing vasculature incorrectly classified as tumor by the EM algorithm. The remaining volumes in the time series are segmented automatically without user interaction.


The software described in this article may be freely obtained from the authors and used with the commercially available MEDx image processing and analysis software package.

5. Results

5.1. Simulated data

The simulated data was analyzed by six methods. These include (1) K-means clustering, (2) EM alone, (3) EM + MRF, (4) EM + HMM + filtering, (5) EM + HMM + smoothing, (6) EM + HMM + MRF + smoothing. Sensitivity and Jaccard similarity metrics were used to measure detection rate and false positive rate. Fig. 5 shows the results of each method on the simulated data. While use of the MRF to model spatial coherence shows a dramatic increase in sensitivity, the similarity measure is not increased by this amount. This is due to the false positives from the simulated iso-intense normal tissue. This iso-intense tissue represents vessels that will enhance along with tumor in the actual MRI data. Incorporating the HMM algorithm even without the MRF increases the similarity measure. Including the MRF and the HMM provides for the highest detection rate and greatest similarity to the ground truth.

5.2. IBSR data sets (3D and temporal)

In this section, we demonstrate that the 3D static segmentation algorithm included in our method is on par with the best of breed. To accomplish this task, we compared our segmentation results using the 3D EM + MRF algorithm with results obtained previously by two other research groups [15,16]. The common data set used was the 20 normal brain MR images from the Internet Brain Segmentation Repository. As reported by Pham [15], four of these data sets are of such poor quality that manual intervention was required to process them. We chose to exclude these four data sets from our validation so that our automated method would be performed in the same manner on all data tested. Jaccard similarity measures for the three methods are shown in Fig. 6.

Fig. 6 – Jaccard similarity measures for two common 3D segmentation methods compared to our 3D method on the IBSR normal brain MRI data set. Results show that our 3D method is as good as best of breed techniques.

Our segmentation achieved results with the same overall accuracy as these two methods. Given this result, we desired to see if our 4D segmentation could improve sensitivity and specificity. The IBSR has one temporal data set that contains five scans of a single subject with a brain tumor. These scans are separated in time by 6 month intervals. Fig. 7 shows the results of our 3D EM + MRF run independently on all 5 time points compared with the full 4D method (EM + MRF + HMM). While sensitivity is nearly the same, there is a dramatic improvement in specificity resulting in a much larger value of the Jaccard similarity measure. The significance of this result is that the segmentation of all subsequent scans is now fully automatic, whereas with a 3D segmentation run independently on each volume, manual intervention to remove isointense tissue classified as tumor is required. Fig. 7 also shows the sensitivity and specificity of a manual slice by slice tracing of the tumor done by an expert radiologist (JAB). The similarity with ground truth is slightly better than our 4D method, but does not agree perfectly with the manual tracing performed by the expert staff at IBSR. These results illustrate some difficulties with obtaining ground truth data for validation.

Fig. 5 – Sensitivity and Jaccard similarity for six segmentation methods tested on the second volume in the 4D synthetic data set. The combined EM + MRF + HMM + smoothing technique shows the best sensitivity and specificity.


Fig. 8 depicts a single slice from each contrast-enhanced MRI volume in the IBSR temporal data set. The segmentation results from the 3D EM algorithm as well as the 4D HMM algorithm are shown. Many of the blood vessels, which have voxel intensity values similar to tumor and are falsely classified as tumor by the 3D method, are correctly classified with the 4D HMM method.

Fig. 7 – Average similarity vs. average sensitivity over 5 time points taken from the IBSR temporal data set. Adding the MRF model to the 3D method improves sensitivity but not specificity. This is because blood vessels that have similar intensity to tumor are still misclassified as tumor. Including temporal as well as spatial probabilistic reasoning reduces false positives and greatly improves the specificity of tumor detection, producing results closer to expert manual tracings.

5.3. Tracing of tumors for validation

The final results were obtained from analyzing five patients whose scans were acquired at the National Institutes of Health. Fig. 9 shows the comparison of average sensitivity and Jaccard similarity between the 3D methods and our novel 4D method over 10 of the 15 volumes. Since the first time point for each subject was manually traced, these five volumes were excluded from the analysis. As with the IBSR temporal data set, the false positives were greatly reduced with the 4D method, resulting in an improved Jaccard similarity value. For the 4D method, the mean sensitivity was 0.79 ± 0.14 and the mean Jaccard similarity measure was 0.68 ± 0.19. Most of the variability was due to 3 of the 10 volumes. For 7 of the 10 volumes, the mean sensitivity was 0.87 ± 0.07 and the mean Jaccard similarity was 0.78 ± 0.07.

5.4. Multi-class transition matrix

Sensitivity and Jaccard similarity when using the multi-class transition matrix were compared with the two state transition matrix for the IBSR temporal data set as well as the NIH data set. Table 3 shows the improvement gained in both measures for the IBSR data set by including multiple tissue classes into the transition matrix. There is improvement in similarity (fewer false positives) with the NIH data set but an average decrease in sensitivity.

Fig. 8 – A single slice from each Gad-enhanced MRI scan in the IBSR temporal data set. The 3D and 4D segmentation results are fused onto the MRI in color. The 3D segmentation shows many blood vessels that were falsely classified as tumor due to their iso-intense voxel intensities. The 4D HMM method correctly classifies these blood vessels.


Table 3 – Average sensitivity and Jaccard similarity of the multi-class transition matrix compared to the two state transition matrix

Data origin    Two state sensitivity    Multi-class sensitivity    Two state similarity    Multi-class similarity
IBSR           0.890                    0.918                      0.664                   0.738
NIH            0.804                    0.767                      0.674                   0.688

Fig. 10 shows a single MR section of one of the IBSR temporal time points as well as the subsequent classification into multiple tissue classes. Note how vessels distant from the tumor are correctly classified as vessel instead of tumor.

6. Discussion

The brain tumor segmentation method outlined in this paper uses probabilistic reasoning in both space and time. Best of breed 3D segmentation is combined with segmentation results from previous and subsequent scans to improve the sensitivity and specificity of the results. Previous methods which focus on static or temporal information alone throw out valuable information. Current change detection techniques are not able to detect tumors that show little to no change. In contrast, our method allows for detection of these tumors via the EM algorithm, which is executed on each time point in the temporal series. The false positive rate is greatly reduced by incorporating the distance from the tumor border into the transition matrix in the HMM. This point can be visualized in Fig. 11 by considering the segmentation results for the diffuse tumor shown in Fig. 1.

Fig. 10 – (A) A coronal section from one of the temporal scans in the IBSR temporal data set. The various tissue classes are labeled. (B) The resulting segmentation of this section using our 4D technique. The tumor is well segmented and iso-intense vessels distant from the tumor are correctly classified.

Fig. 9 – Average sensitivity and Jaccard similarity of 3D EM, 3D EM + MRF and 4D HMM method vs. expert manual tracing for five subjects scanned at NIH. A marked improvement in Jaccard similarity is seen with the 4D HMM method. For the HMM method, the average sensitivity was 0.79 and the similarity was 0.68.

While the image intensity of the cavernous sinus is as bright as the tumor, it is not mis-classified as tumor because it is distant from the tumor and therefore the transition probability to tumor is small. The gray outline seen around the tumor represents the manual tracing of an expert looking at Fig. 1. There is strong agreement between this manual tracing and the automated segmentation results. While it is necessary to manually or semi-automatically segment the first time point in the series, all remaining scans are segmented in a fully automatic manner. This is a major impact of this work, as this automation reduces the time involved in segmentation and guarantees a reduction in operator variability due to manual exclusion of iso-intense normal tissue, such as blood vessels. Validation was performed on synthetic as well as real MR data. From the results of the various methods on the synthetic data (Fig. 5) we see the effect of each component of our 4D model. The MRF significantly increases the sensitivity while

the HMM greatly improves the specificity of tumor detection. Our method performs as well as other popular techniques on static 3D images, as seen in the IBSR normal brain data set, and improves in specificity, as seen in the IBSR brain tumor subject. The ground truth for these IBSR data sets was independent from our research and allows for an unbiased comparison. As more data is added to this IBSR temporal database, we will continue our validation. The results from the IBSR temporal subject as well as the five brain tumor subjects scanned at the NIH reveal the difficulty of obtaining ground truth measures. In the IBSR temporal results (Fig. 7) we see that manual tracing by an expert neuroradiologist does not result in perfect agreement with the manual tracing performed by the IBSR group. In fact, the agreement is only slightly better than our 4D method. This ground truth problem is also seen in the analysis of our five NIH brain tumor subjects. As noted in Section 5, 3 of the 10 volumes measured had poorer similarity measures with ground truth. Another trained individual (JS) manually traced these volumes and agreement was assessed between this manual tracing and the expert manual tracings (JAB). The manual tracings of (JS) also had poor agreement (Jaccard similarity of 0.45), indicating that the manual tracing of some of these diffuse and irregularly shaped tumors is difficult. The refined transition matrix, which segments the image into its respective tissue classes, provides for tracking change in each tissue class. A potential application includes measurement of atrophy over time in patients with Alzheimer's disease or multiple sclerosis. For these diseases, it is known that normal parenchyma will transition to CSF over time. One could estimate the volume loss of normal parenchyma and the growth of CSF. It is anticipated that this software will be part of a workflow allowing clinicians to track brain tumor volumes over time as a measure of therapeutic response.

Fig. 11 – Segmentation result from the image in Fig. 1. The cavernous sinus is bright but not segmented as tumor due to its distance from the tumor.

References

[1] S.M. Haney, P.M. Thompson, T.F. Cloughesy, et al., Tracking tumor growth rates in patients with malignant gliomas: a test of two algorithms, Am. J. Neuroradiol. 22 (2001) 73–82.
[2] R. Therasse, S. Arbuck, E.A. Eisenhauer, et al., New guidelines to evaluate the response to treatment in solid tumors, J. Natl. Cancer Inst. 92 (2000) 205–216.
[3] A.B. Miller, B. Hoogstraten, M. Staquet, et al., Reporting results of cancer treatment, Cancer 47 (1981) 207–214.
[4] G. Sorensen, S. Patel, C. Harmath, et al., Comparison of diameter and perimeter methods for tumor volume calculation, J. Clin. Oncol. 19 (2) (2001) 551–557.
[5] K. Van Leemput, F. Maes, D. Vandermeulen, et al., Automatic segmentation of brain tissues and MR bias field correction using a digital brain atlas, in: Proceedings of Medical Image Computing and Computer-Assisted Intervention—MICCAI'98, volume 1496 of Lecture Notes in Computer Science, 1998, pp. 1222–1229.
[6] Y. Zhang, M. Brady, S. Smith, Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm, IEEE Trans. Med. Imag. 20 (1) (2001) 45–57.
[7] D. Rey, G. Subsol, H. Delingette, et al., Automatic detection and segmentation of evolving processes in 3D medical images: applications to multiple sclerosis, Med. Image Anal. 6 (2002) 163–179.
[8] G. Gerig, D. Welti, C. Guttmann, et al., Exploring the discrimination power of the time domain for segmentation and characterization of active lesions in serial MR data, Med. Image Anal. 4 (2000) 31–42.
[9] D. Welti, G. Gerig, E.W. Radu, et al., Spatio-temporal segmentation of active multiple sclerosis lesions in serial MRI data, IPMI, LNCS 2082 (2001) 438–445.
[10] M.M. Chang, M.I. Sezan, A.M. Tekalp, et al., Bayesian segmentation of multislice brain magnetic resonance imaging using three-dimensional Gibbsian priors, Opt. Eng. 35 (11) (1996) 3206–3221.
[11] M. Prastawa, E. Bullitt, N. Moon, et al., Automatic brain tumor segmentation by subject specific modification of atlas priors, Acad. Radiol. 10 (2003) 1341–1348.
[12] S. Russell, P. Norvig, Artificial Intelligence: A Modern Approach, second ed., Prentice Hall, Upper Saddle River, NJ, 2002.
[13] J. Solomon, A. Sood, 4-D lesion detection using expectation-maximization and hidden Markov model, in: Proceedings of the IEEE International Symposium on Biomedical Imaging (ISBI), 2004, pp. 125–128.
[14] M. Jenkinson, S. Smith, A global optimisation method for robust affine registration of brain images, Med. Image Anal. 5 (2) (2001) 143–156.
[15] D. Pham, J. Prince, Robust unsupervised tissue classification in MR images, in: Proceedings of the IEEE International Symposium on Biomedical Imaging (ISBI), 2004, pp. 109–112.
[16] D.W. Shattuck, S.R. Sandor-Leahy, K.A. Schaper, et al., Magnetic resonance image tissue classification using a partial volume model, Neuroimage 13 (2001) 856–876.